Message-Id: <202210031747.293HlLOJ044048@mail.karels.net>
Date: Mon, 03 Oct 2022 12:47:21 -0500
From: Mike Karels <mike@...els.net>
To: libc-coord@...ts.openwall.com
cc: Konstantin Belousov <kostikbel@...il.com>
Subject: Re: EAI_NOADDR ?

Replying to my own message:

> On Wed, Sep 28, Rich Felker wrote:
> > On Wed, Sep 28, 2022 at 02:19:23PM -0500, Mike Karels wrote:
> > > Hi, I am Mike Karels, a FreeBSD committer.  Coincidentally, I sent a
> > > message on this subject to a FreeBSD list yesterday; you can see it at
> > > https://lists.freebsd.org/archives/freebsd-net/2022-September/002461.html.
> > > Kostik pointed me at this thread.
> > > 
> > > On Wed, Sep 28 Hajimu UMEMOTO <ume@...eBSD.org> wrote:
> > > > Hi,
> > > 
> > > > >>>>> On Tue, 27 Sep 2022 23:36:20 +0300
> > > > >>>>> Konstantin Belousov <kostikbel@...il.com> said:
> > > 
> > > > kostikbel> On Tue, Sep 20, 2022 at 03:29:35PM -0400, Rich Felker wrote:
> > > > > On Tue, Sep 20, 2022 at 11:39:55AM +0300, Konstantin Belousov wrote:
> > > > > > On Tue, Sep 20, 2022 at 10:28:16AM +0200, Florian Weimer wrote:
> > > > > > > * Rich Felker:
> > > > > > > 
> > > > > > > > On Mon, Sep 19, 2022 at 10:57:55PM +0200, Florian Weimer wrote:
> > > > > > > >> * Rich Felker:
> > > > > > > >> 
> > > > > > > >> > One problem I've seen come up again and again with libc stub resolver
> > > > > > > >> > API is that there's no way to distinguish between NxDomain and NODATA
> > > > > > > >> > responses from DNS. These have very different meanings ("name doesn't
> > > > > > > >> > exist" vs "name exists but has no address (or whatever record type you
> > > > > > > > >> > were looking for)") and being able to distinguish them is important for
> > > > > > > >> > implementing containerized-type DNS service on top of the host's
> > > > > > > >> > resolver API rather than direct proxying to outside DNS (when the
> > > > > > > >> > latter isn't desirable).
> > > > > > > >> >
> > > > > > > >> > POSIX defines EAI_NONAME as:
> > > > > > > >> >
> > > > > > > >> > [EAI_NONAME]
> > > > > > > >> >     The name does not resolve for the supplied parameters. 
> > > > > > > >> >
> > > > > > > >> > which, under generous interpretation of "parameters", seems to cover
> > > > > > > >> > both cases, although arguably it does "resolve" to just an empty list
> > > > > > > >> > of addresses in the NODATA case.
> > > > > > > >> >
> > > > > > > >> > To address this, I'm considering proposing a new error code EAI_NOADDR
> > > > > > > >> > that would be defined something like:
> > > > > > > >> >
> > > > > > > >> > [EAI_NOADDR]
> > > > > > > >> >     The name does not have any addresses for the supplied parameters.
> > > > > > > >> >
> > > > > > > > >> > Would other implementors be on board with such a proposal?
> > > > > > > >> 
> > > > > > > >> I think several libcs implemented this as EAI_NODATA already.  I see it
> > > > > > > >> documented for AIX, glibc, NetBSD, OpenBSD, QNX, Solaris.  Apparently,
> > > > > > > >> it's absent from FreeBSD (and Windows).
> > > 
> > > FreeBSD has EAI_NOADDR and EAI_ADDRFAMILY defined inside #if 0 in the
> > > header, but still included in the error strings.  EAI_NOADDR is "No
> > > address associated with hostname", and EAI_ADDRFAMILY is "Address
> > > family for hostname not supported."  Based on these strings, I proposed
> > > EAI_ADDRFAMILY for the case where the name was valid but had no
> > > address for the address family, as opposed to "No address associated
> > > with hostname" (which implies that there are no addresses at all).

> > Distinguishing EAI_ADDRFAMILY vs EAI_NOADDR like this requires
> > querying both A and AAAA even if the caller only requested one, which
> > users would probably not be happy with as an added cost.

> The BSD/FreeBSD resolver code distinguishes between NXDOMAIN (name
> doesn't resolve), and zero answers of the type requested.  The latter
> might mean that there are addresses of other types, or records such
> as NS, MX or others.  That means the name is valid.  fwiw, I have a
> prototype of getaddrinfo() distinguishing the two by simply shuffling
> error returns.  It returns EAI_NONAME if there is an NXDOMAIN error,
> or EAI_ADDRFAMILY if there is no address.  As noted earlier, FreeBSD
> does not currently use EAI_NODATA (or EAI_ADDRFAMILY).

> > > fwiw, NetBSD and OpenBSD seem to use EAI_NOADDR, or at least that
> > > error string, for both "name invalid" and "no address of requested
> > > family".

Oops, that's EAI_NODATA (No address associated with hostname).

> > This is what we're leaning toward in musl for the reason above.

Just to be sure: you mean using EAI_NODATA, or a new EAI_NOADDR?

> > Alternatively, I suppose EAI_ADDRFAMILY could be used for both cases
> > (all NODATA responses), but that seems less intuitive and less in line
> > with current practices on existing systems that have one or both of
> > these error codes.

> > > > > > > > Oh, perfect! In that case, can we push this for standardization?
> > > > > > > 
> > > > > > > I think a separate error code makes sense.
> > > > > > > 
> > > > > > > > And, it looks like glibc also defines EAI_ADDRFAMILY with somewhat
> > > > > > > > overlapping meaning. Is there good documentation for how they're
> > > > > > > > distinguished? I don't think you can meaningfully choose which to
> > > > > > > > return unless you query both A and AAAA even when only one was
> > > > > > > > requested..?
> > > > > > > 
> > > > > > > EAI_ADDRFAMILY is only used when the host name is a numeric address that
> > > > > > > implies an address family, and a different address family is requested.
> > > > > > > EAI_NODATA implies that the host name exists, which doesn't really apply
> > > > > > > to a numeric address, so I guess that's why a different error code was
> > > > > > > introduced.
> > > 
> > > It seems that Linux (at least Ubuntu 22.04.1) uses EAI_ADDRFAMILY, or at
> > > least "Address family for hostname not supported", for the case where
> > > there is no address but the name is valid.  That was also part of the
> > > reason I proposed EAI_ADDRFAMILY for this case.

> > Are you sure? I couldn't find any indication of this in the glibc
> > source and couldn't get it to happen testing either.

> Hmm, my test case was ping6, as that was where I tripped over this
> on FreeBSD.  Now I see that ping6 is not representative on Ubuntu;
> no idea why.  Things like telnet and ftp say "No address associated
> with hostname".

Ubuntu behavior for the case where there is no address for the name
doesn't seem to match the getaddrinfo(3) man page, which has:

       EAI_ADDRFAMILY
              The  specified  network host does not have any network addresses
              in the requested address family.
       EAI_NODATA
              The specified network host exists, but does not have any network
              addresses defined.

EAI_ADDRFAMILY seems like the better match.  It also seems to be used as
described above for numeric addresses that don't match:

mike@...ntu:~$ telnet -6 127.0.0.1
telnet: could not resolve 127.0.0.1/telnet: Address family for hostname not supported

I don't see that this means that the same error shouldn't be used for
another purpose that also matches the description.  However, EAI_NODATA
seems to be used now in this case.  There is something to be said for
consistency, although it would also be nice if the error string was
informative to the end user.  "No address associated with hostname"
seems to over-generalize.  The current FreeBSD situation for this error
produces "Name does not resolve", which is worse, and which I want to fix.

Does anyone know why Linux/glibc does what it does?
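
For anyone who wants to compare systems without going through telnet or
ping6, here is a minimal sketch of a probe that calls getaddrinfo()
directly and reports which error code comes back.  The helper names
eai_name() and probe() are made up for illustration and are not part of
any libc.  EAI_NODATA and EAI_ADDRFAMILY are not in POSIX, so each is
guarded with #ifdef, and _GNU_SOURCE is defined on the assumption that
glibc only exposes them with that macro.

```c
#define _GNU_SOURCE             /* assumption: needed on glibc for EAI_NODATA etc. */
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>

/* Hypothetical helper: symbolic name for a getaddrinfo() return value.
 * An if-chain is used instead of a switch so that the code still compiles
 * on a system where two of these constants share a value. */
static const char *eai_name(int rc)
{
    if (rc == 0)
        return "success";
    if (rc == EAI_NONAME)
        return "EAI_NONAME";
#ifdef EAI_NODATA
    if (rc == EAI_NODATA)
        return "EAI_NODATA";
#endif
#ifdef EAI_ADDRFAMILY
    if (rc == EAI_ADDRFAMILY)
        return "EAI_ADDRFAMILY";
#endif
    return "other";
}

/* Hypothetical helper: look up host restricted to one address family
 * (AF_INET, AF_INET6 or AF_UNSPEC) and return the raw error code. */
static int probe(const char *host, int family)
{
    struct addrinfo hints;
    struct addrinfo *res = NULL;
    int rc;

    memset(&hints, 0, sizeof hints);
    hints.ai_family = family;
    hints.ai_socktype = SOCK_STREAM;

    rc = getaddrinfo(host, NULL, &hints, &res);
    if (rc == 0)
        freeaddrinfo(res);      /* only valid to free on success */
    return rc;
}

/* Print both the symbolic code and the libc's own message for it. */
static void report(const char *host, int family)
{
    int rc = probe(host, family);

    printf("%s: %s (%s)\n", host, eai_name(rc),
        rc == 0 ? "ok" : gai_strerror(rc));
}
```

Calling report("127.0.0.1", AF_INET6) exercises the numeric-mismatch case
from the telnet example; on the Ubuntu system above I would expect it to
print EAI_ADDRFAMILY, but other libcs may well differ.  Calling it with a
name that has only A records and AF_INET6 exercises the case this thread
is about.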

		Mike
