Message-Id: <202210061500.296F0HQT013548@mail.karels.net>
Date: Thu, 06 Oct 2022 10:00:17 -0500
From: Mike Karels <mike@...els.net>
To: libc-coord@...ts.openwall.com
Subject: Re: EAI_NOADDR ?

On Thu, 6 Oct 2022, Rich Felker wrote:
> On Wed, Oct 05, 2022 at 09:05:49AM -0500, Mike Karels wrote:
> > On Tue, 04 Oct 2022, Rich Felker wrote:
> > > On Mon, Oct 03, 2022 at 12:47:21PM -0500, Mike Karels wrote:
> > > > Replying to my own message:
> > > > 
> > > > > On Wed, Sep 28, Rich Felker wrote:
> > > > > > On Wed, Sep 28, 2022 at 02:19:23PM -0500, Mike Karels wrote:
> > > > > > > Hi, I am Mike Karels, a FreeBSD committer.  Coincidentally, I sent a
> > > > > > > message on this subject to a FreeBSD list yesterday; you can see it at
> > > > > > > > https://lists.freebsd.org/archives/freebsd-net/2022-September/002461.html.
> > > > > > > Kostik pointed me at this thread.
> > > > > > > 
> > > > > > > On Wed, Sep 28 Hajimu UMEMOTO <ume@...eBSD.org> wrote:
> > > > > > > > Hi,
> > > > > > > 
> > > > > > > > >>>>> On Tue, 27 Sep 2022 23:36:20 +0300
> > > > > > > > >>>>> Konstantin Belousov <kostikbel@...il.com> said:
> > > > > > > 
> > > > > > > > kostikbel> On Tue, Sep 20, 2022 at 03:29:35PM -0400, Rich Felker wrote:
> > > > > > > > > On Tue, Sep 20, 2022 at 11:39:55AM +0300, Konstantin Belousov wrote:
> > > > > > > > > > On Tue, Sep 20, 2022 at 10:28:16AM +0200, Florian Weimer wrote:
> > > > > > > > > > > * Rich Felker:
> > > > > > > > > > > 
> > > > > > > > > > > > On Mon, Sep 19, 2022 at 10:57:55PM +0200, Florian Weimer wrote:
> > > > > > > > > > > >> * Rich Felker:
> > > > > > > > > > > >> 
> > > > > > > > > > > >> > One problem I've seen come up again and again with libc stub resolver
> > > > > > > > > > > >> > API is that there's no way to distinguish between NxDomain and NODATA
> > > > > > > > > > > >> > responses from DNS. These have very different meanings ("name doesn't
> > > > > > > > > > > >> > exist" vs "name exists but has no address (or whatever record type you
> > > > > > > > > > > > >> > were looking for)") and being able to distinguish them is important for
> > > > > > > > > > > >> > implementing containerized-type DNS service on top of the host's
> > > > > > > > > > > >> > resolver API rather than direct proxying to outside DNS (when the
> > > > > > > > > > > >> > latter isn't desirable).
> > > > > > > > > > > >> >
> > > > > > > > > > > >> > POSIX defines EAI_NONAME as:
> > > > > > > > > > > >> >
> > > > > > > > > > > >> > [EAI_NONAME]
> > > > > > > > > > > >> >     The name does not resolve for the supplied parameters. 
> > > > > > > > > > > >> >
> > > > > > > > > > > >> > which, under generous interpretation of "parameters", seems to cover
> > > > > > > > > > > >> > both cases, although arguably it does "resolve" to just an empty list
> > > > > > > > > > > >> > of addresses in the NODATA case.
> > > > > > > > > > > >> >
> > > > > > > > > > > >> > To address this, I'm considering proposing a new error code EAI_NOADDR
> > > > > > > > > > > >> > that would be defined something like:
> > > > > > > > > > > >> >
> > > > > > > > > > > >> > [EAI_NOADDR]
> > > > > > > > > > > >> >     The name does not have any addresses for the supplied parameters.
> > > > > > > > > > > >> >
> > > > > > > > > > > > >> > Would other implementers be on-board with such a proposal?
> > > > > > > > > > > >> 
> > > > > > > > > > > >> I think several libcs implemented this as EAI_NODATA already.  I see it
> > > > > > > > > > > >> documented for AIX, glibc, NetBSD, OpenBSD, QNX, Solaris.  Apparently,
> > > > > > > > > > > >> it's absent from FreeBSD (and Windows).
> > > > > > > 
> > > > > > > FreeBSD has EAI_NOADDR and EAI_ADDRFAMILY defined inside #if 0 in the
> > > > > > > header, but still included in the error strings.  EAI_NOADDR is "No
> > > > > > > address associated with hostname", and EAI_ADDRFAMILY is "Address
> > > > > > > family for hostname not supported."  Based on these strings, I proposed
> > > > > > > EAI_ADDRFAMILY for the case where the name was valid but had no
> > > > > > > address for the address family, as opposed to "No address associated
> > > > > > > with hostname" (which implies that there are no addresses at all).
> > > > 
> > > > > > Distinguishing EAI_ADDRFAMILY vs EAI_NOADDR like this requires
> > > > > > querying both A and AAAA even if the caller only requested one, which
> > > > > > users would probably not be happy with as an added cost.
> > > > 
> > > > > The BSD/FreeBSD resolver code distinguishes between NXDOMAIN (name
> > > > > doesn't resolve), and zero answers of the type requested.  The latter
> > > > > might mean that there are addresses of other types, or records such
> > > > > as NS, MX or others.  That means the name is valid.  fwiw, I have a
> > > > > prototype of getaddrinfo() distinguishing the two by simply shuffling
> > > > > error returns.  It returns EAI_NONAME if there is an NXDOMAIN error,
> > > > > or EAI_ADDRFAMILY if there is no address.  As noted earlier, FreeBSD
> > > > > does not currently use EAI_NODATA (or EAI_ADDRFAMILY).
> > > > 
> > > > > > > fwiw, NetBSD and OpenBSD seem to use EAI_NOADDR, or at least that
> > > > > > > error string, for both "name invalid" and "no address of requested
> > > > > > > family".
> > > > 
> > > > Oops, that's EAI_NODATA (No address associated with hostname).
> > > > 
> > > > > > This is what we're leaning toward in musl for the reason above.
> > > > 
> > > > Just to be sure: you mean using EAI_NODATA, or a new EAI_NOADDR?
> > 
> > > Yes, I meant EAI_NODATA. EAI_NOADDR was the proposed name I introduced
> > > it as in this thread, not remembering it was a thing some
> > > implementations already had under the name EAI_NODATA, just not in the
> > > standard.
> > 
> > I'm torn between EAI_ADDRFAMILY (which has a better current error message
> > in FreeBSD) and EAI_NODATA.  I could change the error message for EAI_NODATA,
> > but then it will sound close to EAI_ADDRFAMILY.  Changing the English error
> > message is easy, but we have several translations as well.
> > 
> > Any other opinions on the best choice?  I suppose glibc is unlikely to
> > change.

> Per the EAI names, I prefer EAI_NODATA. It corresponds directly to the
> familiar DNS condition and can reasonably mean "name exists but doesn't
> have an address in any of the families you requested"; this is just a
> special case of not having any addresses at all.

I've been looking more at the error strings, which is what the user sees.
EAI_NODATA is "No address associated with hostname" in both glibc and
FreeBSD, which implies to me that there is no address in any family.
That is over-generalized from this situation.  The description in the
Linux getaddrinfo(3) is even more definite:

       EAI_NODATA
              The specified network host exists, but does not have any network
              addresses defined.

That is why I prefer EAI_ADDRFAMILY, and don't understand why glibc
uses EAI_NODATA.  Yes, the error string could be changed, but then it
seems to have essentially the same meaning as EAI_ADDRFAMILY.
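
To compare the strings on a given system, here is a minimal sketch in C.
It assumes a POSIX <netdb.h>; the table and the helper name
print_eai_strings are hypothetical, made up for illustration.  EAI_NODATA
and EAI_ADDRFAMILY are wrapped in #ifdef because not every libc defines
them (glibc, as far as I know, only exposes them with _GNU_SOURCE).

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE  /* assumption: needed on glibc for the non-POSIX EAI_* codes */
#endif
#include <stdio.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>

/* One row per error code of interest; which rows exist depends on the libc. */
struct eai_entry {
    const char *name;
    int code;
};

static const struct eai_entry eai_table[] = {
    { "EAI_NONAME", EAI_NONAME },
#ifdef EAI_NODATA
    { "EAI_NODATA", EAI_NODATA },
#endif
#ifdef EAI_ADDRFAMILY
    { "EAI_ADDRFAMILY", EAI_ADDRFAMILY },
#endif
};

static const size_t eai_table_len = sizeof eai_table / sizeof eai_table[0];

/* Print "NAME: message" for each code this libc defines; return the count. */
static size_t print_eai_strings(FILE *out)
{
    size_t i;

    for (i = 0; i < eai_table_len; i++)
        fprintf(out, "%s: %s\n", eai_table[i].name,
                gai_strerror(eai_table[i].code));
    return eai_table_len;
}
```

Calling print_eai_strings(stdout) on two systems side by side shows which
codes each header exposes and what text the end user would actually read.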

> On the other hand, EAI_ADDRFAMILY comes across as implying
> affirmatively that there *is* an address in at least one family, just
> not the one(s) you requested. I would lean towards saying that it's
> wrong for getaddrinfo to fail with EAI_ADDRFAMILY when AF_UNSPEC was
> requested.

I don't see that EAI_ADDRFAMILY implies that there is an address in another
family.  But EAI_NODATA still seems acceptable for AF_UNSPEC, if that does
both A and AAAA queries (or "any" and follows CNAMEs).
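
For reference, the numeric-address case mentioned earlier in the thread
(telnet -6 127.0.0.1) can be probed without any DNS traffic.  The sketch
below is only an illustration: probe_numeric is a hypothetical helper
name, and the exact nonzero code returned for a literal/family mismatch
is libc-specific, so the code just hands it back to the caller.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE  /* assumption: lets glibc expose the non-POSIX EAI_* codes */
#endif
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>

/*
 * Resolve a numeric literal under a forced address family.  AI_NUMERICHOST
 * keeps getaddrinfo() from consulting DNS, so the result reflects only how
 * the libc reports a literal whose family differs from the requested one.
 * Returns the raw getaddrinfo() result (0 on success).
 */
static int probe_numeric(const char *literal, int family)
{
    struct addrinfo hints;
    struct addrinfo *res = NULL;
    int rc;

    memset(&hints, 0, sizeof hints);
    hints.ai_family = family;
    hints.ai_socktype = SOCK_STREAM;
    hints.ai_flags = AI_NUMERICHOST;

    rc = getaddrinfo(literal, NULL, &hints, &res);
    if (rc == 0)
        freeaddrinfo(res);
    return rc;
}
```

Passing the result of probe_numeric("127.0.0.1", AF_INET6) to
gai_strerror() shows which of the strings discussed here a given libc
picks for the mismatch.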

> So I think if we're going with just one of the two errors (not doing
> the spurious queries to disambiguate them), EAI_NODATA is the
> preferred choice.

Anyone else have an opinion on this?

> > 
> > > > > > Alternatively, I suppose EAI_ADDRFAMILY could be used for both cases
> > > > > > > (all NODATA responses), but that seems less intuitive and less in line
> > > > > > with current practices on existing systems that have one or both of
> > > > > > > these error codes.
> > > > 
> > > > > > > > > > > > Oh, perfect! In that case, can we push this for standardization?
> > > > > > > > > > > 
> > > > > > > > > > > I think a separate error code makes sense.
> > > > > > > > > > > 
> > > > > > > > > > > > And, it looks like glibc also defines EAI_ADDRFAMILY with somewhat
> > > > > > > > > > > > overlapping meaning. Is there good documentation for how they're
> > > > > > > > > > > > distinguished? I don't think you can meaningfully choose which to
> > > > > > > > > > > > return unless you query both A and AAAA even when only one was
> > > > > > > > > > > > requested..?
> > > > > > > > > > > 
> > > > > > > > > > > EAI_ADDRFAMILY is only used when the host name is a numeric address that
> > > > > > > > > > > implies an address family, and a different address family is requested.
> > > > > > > > > > > EAI_NODATA implies that the host name exists, which doesn't really apply
> > > > > > > > > > > to a numeric address, so I guess that's why a different error code was
> > > > > > > > > > > introduced.
> > > > > > > 
> > > > > > > It seems that Linux (at least Ubuntu 22.04.1) uses EAI_ADDRFAMILY, or at
> > > > > > > least "Address family for hostname not supported", for the case where
> > > > > > > there is no address but the name is valid.  That was also part of the
> > > > > > > reason I proposed EAI_ADDRFAMILY for this case.
> > > > 
> > > > > > Are you sure? I couldn't find any indication of this in the glibc
> > > > > > source and couldn't get it to happen testing either.
> > > > 
> > > > > Hmm, my test case was ping6, as that was where I tripped over this
> > > > > on FreeBSD.  Now I see that ping6 is not representative on Ubuntu;
> > > > > no idea why.  Things like telnet and ftp say "No address associated
> > > > > with hostname".
> > > > 
> > > > Ubuntu behavior for the case where there is no address for the name
> > > > doesn't seem to match the getaddrinfo(3) man page, which has:
> > > > 
> > > >        EAI_ADDRFAMILY
> > > >               The  specified  network host does not have any network addresses
> > > >               in the requested address family.
> > > >        EAI_NODATA
> > > >               The specified network host exists, but does not have any network
> > > >               addresses defined.
> > > > 
> > > > EAI_ADDRFAMILY seems like the better match.  It also seems to be used as
> > > > described above for numeric addresses that don't match:
> > > > 
> > > > mike@...ntu:~$ telnet -6 127.0.0.1
> > > > telnet: could not resolve 127.0.0.1/telnet: Address family for hostname not supported
> > > > 
> > > > I don't see that this means that the same error shouldn't be used for
> > > > another purpose that also matches the description.  However, EAI_NODATA
> > > > seems to be used now in this case.  There is something to be said for
> > > > consistency, although it would also be nice if the error string was
> > > > informative to the end user.  "No address associated with hostname"
> > > > seems to over-generalize.  The current FreeBSD situation for this error
> > > > produces "Name does not resolve", which is worse, and I want to fix.
> > > > 
> > > > Does anyone know why Linux/glibc does what it does?
> > 
> > > Distinguishing "no address" from "no address in the requested family"
> > > fundamentally requires spurious queries for the unrequested family. I
> > > would assume everyone deems this unnecessarily costly (not to mention
> > > error-prone -- some environments are using AI_ADDRCONFIG with ipv6
> > > disabled because they're behind broken middleboxes that barf on AAAA
> > > queries) just for the sake of distinguishing these cases that are
> > > otherwise semantically the same (the name exists but doesn't translate
> > > to an address in the requested form(s)).
> > 
> > No, you can tell the difference between "no address" and "no address
> > in the requested family" when using DNS as described above with a single
> > query (per name), and glibc seems to do this.

Sorry, I misunderstood you on this one.  I agree that "No address associated
with hostname" would require multiple queries, which is why I'd like to
avoid anything that implies that.

		Mike

> > If you do an AAAA query,
> > for example, and there is an NXDomain error, the getaddrinfo error
> > looks like EAI_NONAME:

> That's not the difference between "no address" and "no address in the
> requested family". It's the difference between "name does not exist"
> and "no address in the requested family".

> Folks get this wrong all the time, but it's really important. The name
> not existing (NxDomain) is very different from the name existing and
> not having the record you asked for (NODATA). That's the whole topic I
> started this thread for -- exposing this distinction correctly to the
> application so that it can act on the difference. There are various
> places it matters; some that come to mind are:

> - Applying DNSSEC and DANE logic where nonexistence has different
>   semantics.

> - Implementing a DNS gateway server on top of the libc getaddrinfo API
>   (several virtualization-oriented implementations have been caught
>   doing this wrong, specifically the NxDomain/NODATA distinction,
>   thereby breaking guests that care).

> Rich
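
To illustrate how an application (for example, a DNS gateway built on
getaddrinfo) might act on that distinction, here is a rough sketch in C.
Everything named here (lookup_outcome, classify_gai, gateway_rcode) is
hypothetical.  It assumes the libc reports NODATA as EAI_NODATA or
EAI_ADDRFAMILY where those are defined; on a libc that returns EAI_NONAME
for both conditions, the two cases cannot be told apart this way.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE  /* assumption: lets glibc expose the non-POSIX EAI_* codes */
#endif
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>

enum lookup_outcome {
    LOOKUP_OK,          /* at least one address was returned */
    LOOKUP_NO_NAME,     /* name does not exist (NxDomain-like) */
    LOOKUP_NO_ADDRESS,  /* name exists, no address of the requested kind */
    LOOKUP_OTHER        /* temporary failure, bad hints, and so on */
};

/* Map a getaddrinfo() return value onto the distinction discussed above. */
static enum lookup_outcome classify_gai(int rc)
{
    if (rc == 0)
        return LOOKUP_OK;
    if (rc == EAI_NONAME)
        return LOOKUP_NO_NAME;
#ifdef EAI_NODATA
    if (rc == EAI_NODATA)
        return LOOKUP_NO_ADDRESS;
#endif
#ifdef EAI_ADDRFAMILY
    /* Assumption: treated as NODATA-like, per the proposal in this thread. */
    if (rc == EAI_ADDRFAMILY)
        return LOOKUP_NO_ADDRESS;
#endif
    return LOOKUP_OTHER;
}

/* DNS RCODE a gateway could answer with: NOERROR=0, SERVFAIL=2, NXDOMAIN=3. */
static int gateway_rcode(enum lookup_outcome outcome)
{
    switch (outcome) {
    case LOOKUP_OK:
    case LOOKUP_NO_ADDRESS:
        return 0;   /* NOERROR; an empty answer section in the NODATA case */
    case LOOKUP_NO_NAME:
        return 3;   /* NXDOMAIN */
    default:
        return 2;   /* SERVFAIL */
    }
}
```

The point of the split is that LOOKUP_NO_ADDRESS becomes NOERROR with an
empty answer rather than NXDOMAIN, which is the mistake described in the
second bullet above.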
