Message-Id: <202210061500.296F0HQT013548@mail.karels.net>
Date: Thu, 06 Oct 2022 10:00:17 -0500
From: Mike Karels <mike@...els.net>
To: libc-coord@...ts.openwall.com
Subject: Re: EAI_NOADDR ?

On Thu, 6 Oct 2022, Rich Felker wrote:

> On Wed, Oct 05, 2022 at 09:05:49AM -0500, Mike Karels wrote:
> > On Tue, 04 Oct 2022, Rich Felker wrote:
> > > On Mon, Oct 03, 2022 at 12:47:21PM -0500, Mike Karels wrote:
> > > > Replying to my own message:
> > > >
> > > > > On Wed, Sep 28, Rich Felker wrote:
> > > > > > On Wed, Sep 28, 2022 at 02:19:23PM -0500, Mike Karels wrote:
> > > > > > > Hi, I am Mike Karels, a FreeBSD committer. Coincidentally, I sent a
> > > > > > > message on this subject to a FreeBSD list yesterday; you can see it at
> > > > > > > https://lists.freebsd.org/archives/freebsd-net/2022-September/002461..html.
> > > > > > > Kostik pointed me at this thread.
> > > > > > >
> > > > > > > On Wed, Sep 28 Hajimu UMEMOTO <ume@...eBSD.org> wrote:
> > > > > > > > Hi,
> > > > > > >
> > > > > > > > >>>>> On Tue, 27 Sep 2022 23:36:20 +0300
> > > > > > > > >>>>> Konstantin Belousov <kostikbel@...il.com> said:
> > > > > > >
> > > > > > > > kostikbel> On Tue, Sep 20, 2022 at 03:29:35PM -0400, Rich Felker wrote:
> > > > > > > > > On Tue, Sep 20, 2022 at 11:39:55AM +0300, Konstantin Belousov wrote:
> > > > > > > > > > On Tue, Sep 20, 2022 at 10:28:16AM +0200, Florian Weimer wrote:
> > > > > > > > > > > * Rich Felker:
> > > > > > > > > > >
> > > > > > > > > > > > On Mon, Sep 19, 2022 at 10:57:55PM +0200, Florian Weimer wrote:
> > > > > > > > > > > >> * Rich Felker:
> > > > > > > > > > > >>
> > > > > > > > > > > >> > One problem I've seen come up again and again with libc stub resolver
> > > > > > > > > > > >> > API is that there's no way to distinguish between NxDomain and NODATA
> > > > > > > > > > > >> > responses from DNS.
> > > > > > > > > > > >> > These have very different meanings ("name doesn't
> > > > > > > > > > > >> > exist" vs "name exists but has no address (or whatever record type you
> > > > > > > > > > > >> > were looking for)") and being able to distinguish them is important for
> > > > > > > > > > > >> > implementing containerized-type DNS service on top of the host's
> > > > > > > > > > > >> > resolver API rather than direct proxying to outside DNS (when the
> > > > > > > > > > > >> > latter isn't desirable).
> > > > > > > > > > > >> >
> > > > > > > > > > > >> > POSIX defines EAI_NONAME as:
> > > > > > > > > > > >> >
> > > > > > > > > > > >> >     [EAI_NONAME]
> > > > > > > > > > > >> >         The name does not resolve for the supplied parameters.
> > > > > > > > > > > >> >
> > > > > > > > > > > >> > which, under generous interpretation of "parameters", seems to cover
> > > > > > > > > > > >> > both cases, although arguably it does "resolve" to just an empty list
> > > > > > > > > > > >> > of addresses in the NODATA case.
> > > > > > > > > > > >> >
> > > > > > > > > > > >> > To address this, I'm considering proposing a new error code EAI_NOADDR
> > > > > > > > > > > >> > that would be defined something like:
> > > > > > > > > > > >> >
> > > > > > > > > > > >> >     [EAI_NOADDR]
> > > > > > > > > > > >> >         The name does not have any addresses for the supplied parameters.
> > > > > > > > > > > >> >
> > > > > > > > > > > >> > Would other implementors be on-board with such a proposal?
> > > > > > > > > > > >>
> > > > > > > > > > > >> I think several libcs implemented this as EAI_NODATA already. I see it
> > > > > > > > > > > >> documented for AIX, glibc, NetBSD, OpenBSD, QNX, Solaris. Apparently,
> > > > > > > > > > > >> it's absent from FreeBSD (and Windows).
> > > > > > >
> > > > > > > FreeBSD has EAI_NOADDR and EAI_ADDRFAMILY defined inside #if 0 in the
> > > > > > > header, but still included in the error strings.
> > > > > > > EAI_NOADDR is "No address associated with hostname", and
> > > > > > > EAI_ADDRFAMILY is "Address family for hostname not supported."
> > > > > > > Based on these strings, I proposed
> > > > > > > EAI_ADDRFAMILY for the case where the name was valid but had no
> > > > > > > address for the address family, as opposed to "No address associated
> > > > > > > with hostname" (which implies that there are no addresses at all).
> > > > > > >
> > > > > > Distinguishing EAI_ADDRFAMILY vs EAI_NOADDR like this requires
> > > > > > querying both A and AAAA even if the caller only requested one, which
> > > > > > users would probably not be happy with as an added cost.
> > > > > >
> > > > > The BSD/FreeBSD resolver code distinguishes between NXDOMAIN (name
> > > > > doesn't resolve), and zero answers of the type requested. The latter
> > > > > might mean that there are addresses of other types, or records such
> > > > > as NS, MX or others. That means the name is valid. fwiw, I have a
> > > > > prototype of getaddrinfo() distinguishing the two by simply shuffling
> > > > > error returns. It returns EAI_NONAME if there is an NXDOMAIN error,
> > > > > or EAI_ADDRFAMILY if there is no address. As noted earlier, FreeBSD
> > > > > does not currently use EAI_NODATA (or EAI_ADDRFAMILY).
> > > > >
> > > > > > > fwiw, NetBSD and OpenBSD seem to use EAI_NOADDR, or at least that
> > > > > > > error string, for both "name invalid" and "no address of requested
> > > > > > > family".
> > > >
> > > > Oops, that's EAI_NODATA (No address associated with hostname).
> > > >
> > > > > > This is what we're leaning toward in musl for the reason above.
> > > >
> > > > Just to be sure: you mean using EAI_NODATA, or a new EAI_NOADDR?
> >
> > > Yes, I meant EAI_NODATA. EAI_NOADDR was the proposed name I introduced
> > > it as in this thread, not remembering it was a thing some
> > > implementations already had under the name EAI_NODATA, just not in the
> > > standard.
> >
> > I'm torn between EAI_ADDRFAMILY (which has a better current error message
> > in FreeBSD) and EAI_NODATA. I could change the error message for EAI_NODATA,
> > but then it will sound close to EAI_ADDRFAMILY. Changing the English error
> > message is easy, but we have several translations as well.
> >
> > Any other opinions on the best choice? I suppose glibc is unlikely to
> > change.
> Per the EAI names, I prefer EAI_NODATA. It corresponds directly to the
> familiar DNS condition and can reasonably mean "name exists but doesn't
> have an address in any of the families you requested"; this is just a
> special case of not having any addresses at all.

I've been looking more at the error strings, which is what the user sees.
EAI_NODATA is "No address associated with hostname" in both glibc and
FreeBSD, which implies to me that there is no address in any family.
That is over-generalized from this situation. The description in the
Linux getaddrinfo(3) is even more definite:

       EAI_NODATA
              The specified network host exists, but does not have any
              network addresses defined.

That is why I prefer EAI_ADDRFAMILY, and don't understand why glibc uses
EAI_NODATA. Yes, the error string could be changed, but then it seems
to have essentially the same meaning as EAI_ADDRFAMILY.

> On the other hand, EAI_ADDRFAMILY comes across as implying
> affirmatively that there *is* an address in at least one family, just
> not the one(s) you requested. I would lean towards saying that it's
> wrong for getaddrinfo to fail with EAI_ADDRFAMILY when AF_UNSPEC was
> requested.

I don't see that EAI_ADDRFAMILY implies that there is an address in another
family. But EAI_NODATA still seems acceptable for AF_UNSPEC, if that does
both A and AAAA queries (or "any" and follows CNAMEs).

> So I think if we're going with just one of the two errors (not doing
> the spurious queries to disambiguate them), EAI_NODATA is the
> preferred choice.

Anyone else have an opinion on this?
> >
> > > > > > Alternatively, I suppose EAI_ADDRFAMILY could be used for both cases
> > > > > > (all NODATA responses), but that seems less intuitive and less in line
> > > > > > with current practices on existing systems that have one or both of
> > > > > > these error codes.
> > > >
> > > > > > > > > > > > Oh, perfect! In that case, can we push this for standardization?
> > > > > > > > > > >
> > > > > > > > > > > I think a separate error code makes sense.
> > > > > > > > > > >
> > > > > > > > > > > > And, it looks like glibc also defines EAI_ADDRFAMILY with somewhat
> > > > > > > > > > > > overlapping meaning. Is there good documentation for how they're
> > > > > > > > > > > > distinguished? I don't think you can meaningfully choose which to
> > > > > > > > > > > > return unless you query both A and AAAA even when only one was
> > > > > > > > > > > > requested..?
> > > > > > > > > > >
> > > > > > > > > > > EAI_ADDRFAMILY is only used when the host name is a numeric address that
> > > > > > > > > > > implies an address family, and a different address family is requested.
> > > > > > > > > > > EAI_NODATA implies that the host name exists, which doesn't really apply
> > > > > > > > > > > to a numeric address, so I guess that's why a different error code was
> > > > > > > > > > > introduced.
> > > > > > >
> > > > > > > It seems that Linux (at least Ubuntu 22.04.1) uses EAI_ADDRFAMILY, or at
> > > > > > > least "Address family for hostname not supported", for the case where
> > > > > > > there is no address but the name is valid. That was also part of the
> > > > > > > reason I proposed EAI_ADDRFAMILY for this case.
> > > > > > >
> > > > > > Are you sure? I couldn't find any indication of this in the glibc
> > > > > > source and couldn't get it to happen testing either.
> > > > > >
> > > > > Hmm, my test case was ping6, as that was where I tripped over this
> > > > > on FreeBSD. Now I see that ping6 is not representative on Ubuntu;
> > > > > no idea why.
> > > > > Things like telnet and ftp say "No address associated
> > > > > with hostname".
> > > >
> > > > Ubuntu behavior for the case where there is no address for the name
> > > > doesn't seem to match the getaddrinfo(3) man page, which has:
> > > >
> > > >        EAI_ADDRFAMILY
> > > >               The specified network host does not have any network addresses
> > > >               in the requested address family.
> > > >        EAI_NODATA
> > > >               The specified network host exists, but does not have any network
> > > >               addresses defined.
> > > >
> > > > EAI_ADDRFAMILY seems like the better match. It also seems to be used as
> > > > described above for numeric addresses that don't match:
> > > >
> > > > mike@...ntu:~$ telnet -6 127.0.0.1
> > > > telnet: could not resolve 127.0.0.1/telnet: Address family for hostname not supported
> > > >
> > > > I don't see that this means that the same error shouldn't be used for
> > > > another purpose that also matches the description. However, EAI_NODATA
> > > > seems to be used now in this case. There is something to be said for
> > > > consistency, although it would also be nice if the error string was
> > > > informative to the end user. "No address associated with hostname"
> > > > seems to over-generalize. The current FreeBSD situation for this error
> > > > produces "Name does not resolve", which is worse, and I want to fix.
> > > >
> > > > Does anyone know why Linux/glibc does what it does?
> >
> > > Distinguishing "no address" from "no address in the requested family"
> > > fundamentally requires spurious queries for the unrequested family. I
> > > would assume everyone deems this unnecessarily costly (not to mention
> > > error-prone -- some environments are using AI_ADDRCONFIG with ipv6
> > > disabled because they're behind broken middleboxes that barf on AAAA
> > > queries) just for the sake of distinguishing these cases that are
> > > otherwise semantically the same (the name exists but doesn't translate
> > > to an address in the requested form(s)).
> >
> > No, you can tell the difference between "no address" and "no address
> > in the requested family" when using DNS as described above with a single
> > query (per name), and glibc seems to do this.

Sorry, I misunderstood you on this one. I agree that "No address associated
with hostname" would require multiple queries, which is why I'd like to
avoid anything that implies that.

		Mike

> > If you do an AAAA query,
> > for example, and there is an NXDomain error, the getaddrinfo error
> > looks like EAI_NONAME:

> That's not the difference between "no address" and "no address in the
> requested family". It's the difference between "name does not exist"
> and "no address in the requested family".

> Folks get this wrong all the time, but it's really important. The name
> not existing (NxDomain) is very different from the name existing and
> not having the record you asked for (NODATA). That's the whole topic I
> started this thread for -- exposing this distinction correctly to the
> application so that it can act on the difference. There are various
> places it matters; some that come to mind are:

> - Applying DNSSEC and DANE logic where nonexistence has different
>   semantics.

> - Implementing a DNS gateway server on top of the libc getaddrinfo API
>   (several virtualization-oriented implementations have been caught
>   doing this wrong, specifically the NxDomain/NODATA distinction,
>   thereby breaking guests that care).

> Rich
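To illustrate the NxDomain/NODATA distinction discussed in this thread, an
application could fold the platform-specific getaddrinfo() error codes into
a small set of outcomes. What follows is only a minimal sketch, not code from
any libc: the enum, classify_gai_error() and lookup_family() are hypothetical
names, and it assumes a libc that reports NODATA as EAI_NODATA or
EAI_ADDRFAMILY. On a system that returns EAI_NONAME for both conditions (as
the thread says FreeBSD did at the time), the two cannot be told apart this way.

```c
/* glibc only exposes EAI_NODATA and EAI_ADDRFAMILY with _GNU_SOURCE. */
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <netdb.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical outcome type for this sketch. */
enum lookup_result {
    LOOKUP_OK,        /* at least one address was returned */
    LOOKUP_NXDOMAIN,  /* the name does not exist */
    LOOKUP_NODATA,    /* the name exists, but has no address of the requested kind */
    LOOKUP_OTHER      /* temporary failure, out of memory, bad hints, ... */
};

/* Map a getaddrinfo() return value onto the outcomes above. */
enum lookup_result classify_gai_error(int rc)
{
    if (rc == 0)
        return LOOKUP_OK;
    /* Checked first, so a platform that aliases the codes stays conservative. */
    if (rc == EAI_NONAME)
        return LOOKUP_NXDOMAIN;
#ifdef EAI_NODATA
    if (rc == EAI_NODATA)
        return LOOKUP_NODATA;
#endif
#ifdef EAI_ADDRFAMILY
    /* Assumption: treated like NODATA here; libcs differ on when they use it. */
    if (rc == EAI_ADDRFAMILY)
        return LOOKUP_NODATA;
#endif
    return LOOKUP_OTHER;
}

/* Resolve host for one address family (AF_INET, AF_INET6 or AF_UNSPEC). */
enum lookup_result lookup_family(const char *host, int family)
{
    struct addrinfo hints;
    struct addrinfo *res = NULL;
    int rc;

    memset(&hints, 0, sizeof hints);
    hints.ai_family = family;
    hints.ai_socktype = SOCK_STREAM;

    rc = getaddrinfo(host, NULL, &hints, &res);
    if (rc == 0)
        freeaddrinfo(res);
    else
        fprintf(stderr, "%s: %s\n", host, gai_strerror(rc));
    return classify_gai_error(rc);
}
```

A caller such as a DNS gateway could then call lookup_family(name, AF_INET6)
and answer NODATA rather than NxDomain when the result is LOOKUP_NODATA. The
sketch issues only the one lookup that was asked for, so it does not try to
separate "no address at all" from "no address in this family".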