Message-ID: <CAFrh3J_656U9NXc-=RTdbtnmQAJ5ZfLZXQgnwjC=u74oqDbrmA@mail.gmail.com>
Date: Mon, 7 Feb 2022 14:19:05 -0500
From: Satadru Pramanik <satadru@...il.com>
To: Rich Felker <dalias@...ifal.cx>
Cc: musl@...ts.openwall.com
Subject: Re: Re: musl getaddr info breakage on older kernels
The test programs are being run from...
glibc 2.23 -> bash (crosh shell)
crosh shell -> invokes ruby -> invokes bash to run the test programs.
tcpdump on the router shows no network activity at all when running
the test program with tcpdump -i any -vvv host (IP address)
When I run the test program with strace, though, I see this:
14:06:24.617860 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto UDP (17), length 56)
    192.168.0.121.46846 > office.lan.53: [udp sum ok] 16051+ A? google.com. (28)
14:06:24.622352 IP (tos 0x0, ttl 64, id 15884, offset 0, flags [DF], proto UDP (17), length 72)
    office.lan.53 > 192.168.0.121.46846: [bad udp cksum 0x8210 -> 0x7bc1!] 16051 q: A? google.com. 1/0/0 google.com. [1m32s] A 142.251.40.110 (44)
14:06:24.688610 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto UDP (17), length 56)
    192.168.0.121.42267 > office.lan.53: [udp sum ok] 35406+ A? google.com. (28)
14:06:24.688931 IP (tos 0x0, ttl 64, id 15887, offset 0, flags [DF], proto UDP (17), length 72)
    office.lan.53 > 192.168.0.121.42267: [bad udp cksum 0x8210 -> 0x4209!] 35406 q: A? google.com. 1/0/0 google.com. [1m32s] A 142.251.40.110 (44)
14:06:24.689018 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto UDP (17), length 56)
    192.168.0.121.42267 > office.lan.53: [udp sum ok] 13657+ AAAA? google.com. (28)
14:06:24.689186 IP (tos 0x0, ttl 64, id 15888, offset 0, flags [DF], proto UDP (17), length 84)
    office.lan.53 > 192.168.0.121.42267: [bad udp cksum 0x821c -> 0xc77e!] 13657 q: AAAA? google.com. 1/0/0 google.com. [20s] AAAA 2607:f8b0:4006:80b::200e (56)
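
For reference, here is a minimal sketch of the kind of lookup a test like musl_getaddrinfo_test presumably performs. The source of the real test program is not included in this thread, so the helper name print_addresses, its signature, and the choice of SOCK_STREAM are assumptions; only the output format ("AF_INET: ...", "AF_INET6: ...", "getaddrinfo: ...") is taken from the logs quoted further down. With ai_family left as AF_UNSPEC a resolver normally asks for both A and AAAA records, which would be consistent with the paired queries from port 42267 in the capture; with AF_INET it asks for A only.

```c
#include <arpa/inet.h>
#include <netdb.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper (not the original test program): resolve `host`,
 * print one line per IPv4/IPv6 address, and return 0 on success or the
 * getaddrinfo() error code on failure. */
int print_addresses(const char *host, int family)
{
    struct addrinfo hints, *res, *p;
    char buf[INET6_ADDRSTRLEN];
    int err;

    memset(&hints, 0, sizeof hints);
    hints.ai_family = family;        /* AF_UNSPEC, AF_INET or AF_INET6 */
    hints.ai_socktype = SOCK_STREAM; /* assumption: one entry per address */

    err = getaddrinfo(host, NULL, &hints, &res);
    if (err != 0) {
        /* With musl, EAI_AGAIN is reported as "Try again". */
        fprintf(stderr, "getaddrinfo: %s\n", gai_strerror(err));
        return err;
    }

    for (p = res; p != NULL; p = p->ai_next) {
        const void *addr;
        const char *label;

        if (p->ai_family == AF_INET) {
            addr = &((struct sockaddr_in *)p->ai_addr)->sin_addr;
            label = "AF_INET";
        } else if (p->ai_family == AF_INET6) {
            addr = &((struct sockaddr_in6 *)p->ai_addr)->sin6_addr;
            label = "AF_INET6";
        } else {
            continue; /* ignore any other address family */
        }
        if (inet_ntop(p->ai_family, addr, buf, sizeof buf) != NULL)
            printf("%s: %s\n", label, buf);
    }

    freeaddrinfo(res);
    return 0;
}
```

A small main() that calls print_addresses(argv[1], AF_UNSPEC) by default, and print_addresses(argv[1], AF_INET) when a second argument such as set_ai_family is given, would mirror the two invocations used in the quoted runs. That mapping is a guess at what set_ai_family does.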
On Sun, Feb 6, 2022 at 9:40 PM Rich Felker <dalias@...ifal.cx> wrote:
> On Sun, Feb 06, 2022 at 08:29:16PM -0500, Satadru Pramanik wrote:
> > Here are illustrative logs of output and strace logs.
> >
> > Note that while the musl toolchain is built in a container on a much more
> > powerful machine, this "musl_getaddrinfo_test" app is built locally on the
> > machine with the 3.8 kernel.
> >
> > I ran the following to get the output on the smaller i686 machine
> > immediately after the app is built.
> > Apologies for the ruby code wrapping the shell commands.
> >
> > @musl_ver = `#{CREW_MUSL_PREFIX}/lib/libc.so 2>&1 >/dev/null | head -2 | tail -1 | awk '{print $2}'`.chomp
> > puts 'Testing the musl resolver to see if it can resolve google.com: '.lightblue
> > system "./musl_getaddrinfo_test google.com set_ai_family 2>&1 |tee -a /tmp/musl_#{@...l_ver}_getaddrinfo_test_google.com_set_ai_family.txt "
> > system "./musl_getaddrinfo_test google.com 2>&1 |tee -a /tmp/musl_#{@...l_ver}_getaddrinfo_test_google.com.txt"
> > system "strace -o /tmp/musl_#{@...l_ver}_getaddrinfo_test_google.com_set_ai_family_STRACE.txt ../musl_getaddrinfo_test google.com set_ai_family"
> > system "strace -o /tmp/musl_#{@...l_ver}_getaddrinfo_test_google.com_STRACE.txt ../musl_getaddrinfo_test google.com"
> >
> > And here is the output for each run before running again via strace. Note
> > how IPv6 addresses show up sporadically, and for 1.2.2 nothing at all shows
> > up, but everything works fine according to the strace logs. (Strace is
> > built against glibc 2.23.)
> >
> > ==> musl_1.2.0-git-17-g33338ebc_getaddrinfo_test_google.com_set_ai_family.txt <==
> > AF_INET: 142.251.40.110
> >
> > ==> musl_1.2.0-git-17-g33338ebc_getaddrinfo_test_google.com.txt <==
> > AF_INET: 142.251.40.110
> >
> > ==> musl_1.2.0-git-39-g5cf1ac24_getaddrinfo_test_google.com_set_ai_family.txt <==
> > AF_INET: 142.251.40.142
> >
> > ==> musl_1.2.0-git-39-g5cf1ac24_getaddrinfo_test_google.com.txt <==
> > getaddrinfo: Try again
> >
> > ==> musl_1.2.0-git-40-g1b4e84c5_getaddrinfo_test_google.com_set_ai_family.txt <==
> > AF_INET: 142.251.40.206
> >
> > ==> musl_1.2.0-git-40-g1b4e84c5_getaddrinfo_test_google.com.txt <==
> > AF_INET6: 2607:f8b0:4006:81f::200e
> > AF_INET: 142.251.40.206
> >
> > ==> musl_1.2.0-git-6-g2f2348c9_getaddrinfo_test_google.com_set_ai_family.txt <==
> > AF_INET: 142.250.65.206
> >
> > ==> musl_1.2.0-git-6-g2f2348c9_getaddrinfo_test_google.com.txt <==
> > AF_INET: 142.250.65.206
> >
> > ==> musl_1.2.1_getaddrinfo_test_google.com_set_ai_family.txt <==
> > AF_INET: 142.251.40.110
> >
> > ==> musl_1.2.1_getaddrinfo_test_google.com.txt <==
> > getaddrinfo: Try again
> >
> > ==> musl_1.2.2_getaddrinfo_test_google.com_set_ai_family.txt <==
> > getaddrinfo: Try again
> >
> > ==> musl_1.2.2_getaddrinfo_test_google.com.txt <==
> > getaddrinfo: Try again
> >
> > Regards,
>
> OK, I don't see anything in the strace suggesting a cause. The kernel
> version (or whether a container was used) present on the system where
> you built musl or the test programs should make no difference
> whatsoever; musl has no build dependencies on the host kernel or
> kernel headers or anything like that (and doesn't even need to be
> built on a Linux host).
>
> A couple questions:
>
> Are the test programs on the i686 machine running under Docker or any
> other container environment?
>
> Can you tcpdump the traffic between the test program and the dnsmasq
> during a failing run, with verbose display of the packet contents
> (-vvv or something like that)?
>
> I don't see any plausible explanation for the result varying between
> runs and with timing like this unless dnsmasq is doing something
> odd/wrong. I thought it might be related to something blocking time64
> syscalls but that doesn't seem to be the case -- according to the
> strace logs they're getting ENOSYS as expected with fallback to the
> legacy 32-bit clock_gettime etc. which is fine.
>
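
The time64 fallback mentioned in the quoted reply can be sketched roughly as follows. This is an illustration of the general pattern only, not musl's actual implementation: the helper name clock_gettime_with_fallback is made up, and it assumes a Linux system whose headers define SYS_clock_gettime (and, on 32-bit targets, SYS_clock_gettime64). The time64 syscalls were added in Linux 5.1, so on a 3.8 kernel the first call is expected to fail with ENOSYS and the legacy syscall is used instead.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper, not musl's code: read clock `clk` into *sec and
 * *nsec, preferring the time64 syscall where one exists. Returns 0 on
 * success, -1 on failure with errno set. */
int clock_gettime_with_fallback(clockid_t clk, int64_t *sec, long *nsec)
{
#ifdef SYS_clock_gettime64
    /* 32-bit targets: the time64 variant fills two 64-bit fields.
     * Kernels older than 5.1 do not implement it and return ENOSYS. */
    long long ts64[2];
    if (syscall(SYS_clock_gettime64, clk, ts64) == 0) {
        *sec = ts64[0];
        *nsec = (long)ts64[1];
        return 0;
    }
    if (errno != ENOSYS)
        return -1; /* a real error, not just a missing syscall */
#endif
    /* Legacy syscall: the kernel fills two native longs (32-bit each
     * on i686, which is the "legacy 32-bit clock_gettime" case). */
    long ts[2];
    if (syscall(SYS_clock_gettime, clk, ts) != 0)
        return -1;
    *sec = ts[0];
    *nsec = ts[1];
    return 0;
}
```

In an strace log, this pattern shows up as a clock_gettime64 call returning ENOSYS followed immediately by a successful clock_gettime call, which matches what the quoted reply describes as expected behavior.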