Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAFBikZ5OW06mPtrQ2NNsNzbTAXSg-9FSYy_Qbz+3RMAR25Ncag@mail.gmail.com>
Date: Wed, 19 Oct 2022 11:44:45 +0800
From: Tom Shen <sjiagc.dev@...il.com>
To: Rich Felker <dalias@...c.org>
Cc: musl@...ts.openwall.com
Subject: Re: gethostbyname2_r returns invalid IPv6 address if DNS
 server replies IPv4 address

Thanks for replying!

> gethostbyname2 is legacy and has lots of inherent problems, so this
> really should be changed to use getaddrinfo. If nothing else, such a
> change would make it finish twice as fast.

Ya, it is obsolete as stated in Linux doc. But a third-party service is
using "getent hosts" which calls this API.

> Are you saying that the configured nameserver (coredns) is responding
>   with an answer packet for a different question (A instead of AAAA)
>   than the query packet asked for? I'm confused why it would do that.

>   It sounds like this is also wrong behavior by coredns. If it's
>   rewriting AAAA queries to A for the sake of determining if the name
>   exists vs NxDomain, that's fine for the outgoing query it performs,
>   but instead of returning the answer to the rewritten query back to the
>   asker as an answer that mismatches the question, it should just be
>   removing the answer section to make it into a NODATA answer (if it's
>   not already NxDomain).

Yes. In CoreDNS, we can configure it to rewrite the request type AAAA to A,
so that the response will be IPv4 family and address. This is a CoreDNS
feature. I cannot find anything in the DNS standards against it so far.

>   The expected behavior of a process receiving such an answer would be
>   to ignore it as bogus, but maybe we're not doing that, and then
>   wrongly returning answers that don't match the type we asked for?

Unfortunately, we return invalid IPv6 address as stated in the previous
email. But, maybe it's not good to mention it here, glibc can handle this
situation.

> I'd rather fix this at the layer the actual bug (missing validation)
> is at.

So, let me try to suggest a fix in __lookup_name or musl team would help to
take a look? So far I cannot find a way to create a pull request to musl
project.

On Wed, Oct 19, 2022 at 1:27 AM Rich Felker <dalias@...c.org> wrote:

> On Wed, Oct 19, 2022 at 12:02:36AM +0800, Tom Shen wrote:
> > Hi Musl team,
> >
> > Recently we encountered an issue in requesting name resolution in Alpine
> > Linux with Musl.
> >
> > Environment:
> > * Alpine Linux 3.14.2
> > * IPv6 module disabled
> > * musl-libc 1.2.2
> > * CoreDNS with "rewrite stop type AAAA A" configured
> >
> > Reproduce steps:
> > 1. getent hosts <any-name-only-has-ipv4-record>
> > 2. Check output
> >
> > Expected: Get correct IPv4 address, e.g. 10.96.36.74
> >
> > Actual: Return an invalid IPv6 address, e.g. a60:244a::, which is the
> > actual IPv4 32-bit address followed by zeros.
> >
> > Debugging:
> >
> > 1. Alpine Linux's getent (
> >
> https://github.com/alpinelinux/aports/blob/3.14-stable/main/musl/getent.c)
> > tries gethostbyname2 with AF_INET6 first, if no result, then calls with
> > AF_INET.
>
> gethostbyname2 is legacy and has lots of inherent problems, so this
> really should be changed to use getaddrinfo. If nothing else, such a
> change would make it finish twice as fast.
>
> > 2. With requested family AF_INET6, gethostbyname2_r -> __lookup_name
> > -> name_from_dns_search sends a DNS request with RR_AAAA to CoreDNS.
> > 3. In CoreDNS, we configured "rewrite stop type AAAA A", which results in
> > the RR_AAAA request type being rewritten to RR_A. So the response from
> > CoreDNS is: AF_INET (2) and 32-bit IP 10.96.36.74 (0x0a60244a below).
> >
> > > (gdb) p addrs
> > > $55 = {{family = 2, scopeid = 0, addr = "\n`$J", '\000' <repeats 11
> > > times>, sortkey = 0}
> > > (gdb) x/4xb addrs[0]->addr
> > > 0x7fffffffe0d8: 0x0a    0x60    0x24    0x4a
>
> Are you saying that the configured nameserver (coredns) is responding
> with an answer packet for a different question (A instead of AAAA)
> than the query packet asked for? I'm confused why it would do that.
> The expected behavior of a process receiving such an answer would be
> to ignore it as bogus, but maybe we're not doing that, and then
> wrongly returning answers that don't match the type we asked for?
>
> It sounds like this is also wrong behavior by coredns. If it's
> rewriting AAAA queries to A for the sake of determining if the name
> exists vs NxDomain, that's fine for the outgoing query it performs,
> but instead of returning the answer to the rewritten query back to the
> asker as an answer that mismatches the question, it should just be
> removing the answer section to make it into a NODATA answer (if it's
> not already NxDomain).
>
> > 4. In gethostbyname2_r, the result address family is assigned to the
> > request's family, regardless of the family in the response.
> >
> > > h->h_addrtype = af;
> > > h->h_length = af==AF_INET6 ? 16 : 4;
> > >
> >  This results in the IPv4 address replied from CoreDNS is wrongly copied
> to
> > the result as IPv6 address which is a60:244a:: .
>
> OK, it looks like gethostbyname2_r is assuming the results match the
> address type it asked for, which is mostly correct (except AF_UNSPEC?
> that doesn't seem to be handled and it's probably invalid but we
> should probably error out on it or something); it's __lookup_name
> which is wrong to return results of the wrong type (as a consequence
> of not validating the DNS responses).
>
> > 5. getent gets an IPv4 address, so it won't try AF_INET, rather directly
> > return the invalid IPv6 address a60:244a:: .
> >
> > Suggested fix:
> > In gethostbyname2_r when adding answered addresses into the result, we
> need
> > to filter out the addresses with mismatching address family.
> >
> > > for (i=0; i<cnt; i++) {
> > >     // if (addrs[i].family != h->h_addrtype) continue; // need to fix
> the
> > > index of h->h_addr_list too
> > >     h->h_addr_list[i] = (void *)buf;
> > >     buf += h->h_length;
> > >     memcpy(h->h_addr_list[i], addrs[i].addr, h->h_length);
> > > }
> > >
> >
> > Could you review this issue? Thanks in advance!
>
> I'd rather fix this at the layer the actual bug (missing validation)
> is at.
>
> Rich
>

Content of type "text/html" skipped

Powered by blists - more mailing lists

Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.