Message-ID: <20220625015655.GR7074@brightrain.aerifal.cx>
Date: Fri, 24 Jun 2022 21:56:56 -0400
From: Rich Felker <dalias@...c.org>
To: Markus Geiger <markus.geiger@...lsen.com>
Cc: musl@...ts.openwall.com
Subject: Re: [BUG] Non-FQDN domain resolving failure on musl-1.2.x

On Fri, Jun 24, 2022 at 07:14:10PM +0200, Markus Geiger wrote:
> Sorry: not Amazon DNS – 10.204.109.209 is a BIND server in our network
> we've setup to work with our global VPN/DNS.
>
> BUT the strange thing is that the domain lookup works with musl-1.1.24
> while with some musl-1.2.x just quits with an error.
>
> a comparison with the docker runs and `sudo tcpdump -v -i docker0 udp port
> 53 or tcp port 53` did not bring up any diffs except the list of A records
> returned is in a different order (which i think is completely normal). the
> order of requests is the same
>
> tcpdump from working version:
>
> bind-us-east-1a.XXXXXXXXXXXXXX.domain > 172.17.0.3.45501: 18685 9/13/8
> slack.com. A 3.95.117.96, slack.com. A 34.231.24.224, slack.com. A
> 54.163.235.119, slack.com. A 54.147.59.169, slack.com. A 34.193.255.5,
> slack.com. A 34.204.109.226, slack.com. A 34.225.62.185, slack.com. A
> 34.203.97.10, slack.com. A 54.92.199.186 (510)
>
> tcpdump from non-working version:
>
> bind-us-east-1a.XXXXXXXXXXXXXX.domain > 172.17.0.3.59951: 49211 9/13/8
> slack.com. A 34.225.62.185, slack.com. A 54.163.235.119, slack.com. A
> 34.231.24.224, slack.com. A 54.147.59.169, slack.com. A 34.193.255.5,
> slack.com. A 34.204.109.226, slack.com. A 54.92.199.186, slack.com. A
> 3.95.117.96, slack.com. A 34.203.97.10 (510)
>
> Complete log:
>
> 172.17.0.3.59951 > bind-us-east-1a.XXXXXXXXXXXXXXXXXXXXXXXXXx.domain:
> 49211+ A? slack.com. (27)
> 18:56:19.990087 IP (tos 0x0, ttl 64, id 10210, offset 0, flags [DF], proto
> UDP (17), length 55)
> 172.17.0.3.59951 > bind-us-east-1a.XXXXXXXXXXXXXXXXXXXXXXXXXx.domain:
> 49334+ AAAA? slack.com. (27)
> 18:56:20.154990 IP (tos 0x0, ttl 250, id 17825, offset 0, flags [none],
> proto UDP (17), length 538)
> bind-us-east-1a.XXXXXXXXXXXXXXXXXXXXXXXXXx.domain > 172.17.0.3.59951:
> 49211 9/13/8 slack.com. A 34.225.62.185, slack.com. A 54.163.235.119,
> slack.com. A 34.231.24.224, slack.com. A 54.147.59.169, slack.com. A
> 34.193.255.5, slack.com. A 34.204.109.226, slack.com. A 54.92.199.186,
> slack.com. A 3.95.117.96, slack.com. A 34.203.97.10 (510)
> 18:56:20.241377 IP (tos 0x0, ttl 250, id 17846, offset 0, flags [none],
> proto UDP (17), length 55)
> bind-us-east-1a.XXXXXXXXXXXXXXXXXXXXXXXXXx.domain > 172.17.0.3.59951:
> 49334 ServFail 0/0/0 (27)
> 18:56:20.241501 IP (tos 0x0, ttl 64, id 10233, offset 0, flags [DF], proto
> UDP (17), length 55)

Here's your problem -- the server is returning ServFail rather than an
answer for some of the queries. This makes musl's resolver continue
retrying for an answer. In an old version, there may have been a bug
whereby, after the retries timed out, the fact that one query failed
was sometimes overlooked. This logic was improved between the versions
you tested as part of ensuring DNSSEC integrity.

In any case, you just need to find the cause of the ServFail (maybe a
hack someone put in place to try to suppress use of IPv6?) and fix it.

Rich
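
For readers following the thread, the difference described above can be modeled roughly as below. This is only a simplified sketch of the decision logic, not musl's resolver code: the function names (combine_lenient, combine_strict), the shape of the replies dictionary, and the abbreviated address list are all assumptions made for illustration. The only facts taken from the capture are that the A query was answered and the AAAA query came back with ServFail.

```python
# Simplified model of combining parallel A and AAAA replies.
# NOT musl's implementation: all names here are hypothetical.

NOERROR, SERVFAIL = 0, 2  # DNS RCODE values as defined in RFC 1035


def combine_lenient(replies):
    """Return whatever addresses were found, overlooking a failed query."""
    addrs = []
    for rcode, answers in replies.values():
        if rcode == NOERROR:
            addrs.extend(answers)
    if not addrs:
        raise LookupError("no usable answer")
    return addrs


def combine_strict(replies):
    """Fail the whole lookup if any single query ended in SERVFAIL."""
    for qtype, (rcode, _answers) in replies.items():
        if rcode == SERVFAIL:
            raise LookupError(f"{qtype} query failed with SERVFAIL")
    return [a for _rcode, answers in replies.values() for a in answers]


# Mirrors the capture: A was answered, AAAA got ServFail.
# The address list is shortened for the example.
replies = {
    "A": (NOERROR, ["34.225.62.185", "54.163.235.119"]),
    "AAAA": (SERVFAIL, []),
}
```

With this input, combine_lenient(replies) returns the two A addresses, while combine_strict(replies) raises LookupError because of the AAAA failure. That matches the symptom reported: the same server replies, but a lookup that succeeds on one version and errors out on the other.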