Date: Sat, 31 Mar 2018 18:08:21 +0200
From: Florian Weimer <fw@...eb.enyo.de>
To: Rich Felker <dalias@...c.org>
Cc: William Pitcock <nenolod@...eferenced.org>,  musl@...ts.openwall.com
Subject: Re: [PATCH] resolver: only exit the search path loop there are a positive number of results given

* Rich Felker:

>> I'm not entirely convinced that using different search path domains
>> for different address families is necessarily wrong.
>
> It breaks the completely reasonable application expectation that the
> results produced by AF_INET and AF_INET6 queries are subsets of the
> results produced by AF_UNSPEC. The proper application idiom is to use
> AF_UNSPEC (or no hints) and respect the order the results are returned
> in, in order to honor RFC 3484/gai.conf or any other means by which
> getaddrinfo determines which order results should be tried in. It's
> (IMO at least) utterly wrong to try to merge results from different
> search domains, but I can see applications trying both queries
> separately when they encounter the inconsistency...

Well, yes, but I'm not sure you can get consistent behavior without
always issuing two queries.  At least not if you want to stay
compatible with certain forms of DNSSEC online signing.
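
A minimal sketch of what "always issuing two queries" could look like
in a search loop.  The query callable, the function name and the zone
data in the usage note are hypothetical and not taken from musl or
glibc.  The point is only that the search domain is chosen from the
combined A/AAAA result, so a single-family lookup cannot end up in a
different domain than an AF_UNSPEC lookup would:

```python
from typing import Callable, List, Sequence

# Hypothetical transport: (fully expanded name, record type) -> addresses.
Query = Callable[[str, str], List[str]]


def lookup(name: str, search: Sequence[str], want: Sequence[str],
           query: Query) -> List[str]:
    """Return the record types in `want` from the first search domain
    that has any A or AAAA data at all."""
    for domain in search:
        candidate = f"{name}.{domain}"
        # Ask for both types even if the caller wants only one of them.
        answers = {rr: query(candidate, rr) for rr in ("A", "AAAA")}
        if any(answers.values()):
            # This domain wins for every address family.
            return [addr for rr in want for addr in answers[rr]]
    return []
```

With a zone where db.prod.example has only an AAAA record and
db.example has only an A record, an A-only lookup returns nothing
instead of falling through to db.example, so its result stays a
subset of the AF_UNSPEC result.  The cost is a second query per
candidate name.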

>> Historically,
>> the NODATA/NXDOMAIN signaling has been really inconsistent anyway, and
>> I suspect it still is for some users.
>
> Do you have a reference for this? AFAIK it was very consistent in all
> historical implementations. It's also documented (in RFC-????...I
> forget where but I looked it up during this).

Today, I expect that it is consistent among the major
implementations, mainly due to DNSSEC influence.

Some load balancers returned NXDOMAIN for AAAA queries.  I'm not sure
if F5 was one of them, but this document suggests something in this
direction:

  https://support.f5.com/csp/article/K2345

Here's a report of this issue:

  https://www.nanog.org/mailinglist/mailarchives/old_archive/2002-04/msg00559.html

Here's a more concrete bug report about MaraDNS:

  http://maradns.samiam.org/old-list-archive/2009-October/000476.html

(Which is surprisingly recent, but then, non-lower-case domain names
are probably quite rare.)

Peter van Dijk reports something else, some form of NODATA-to-NXDOMAIN
escalation:

  https://blog.powerdns.com/2015/03/02/from-noerror-to-refused/

Although that doesn't happen on the stub resolver interface, it shows
that the behavior still isn't as uniform as we would like it to be.
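
For reference, a rough sketch of how a stub resolver might tell the
two signals apart from the 12-byte DNS header alone.  The function
name is hypothetical, and the classification is deliberately
simplified (it ignores CNAME chains, referrals and truncation):

```python
import struct

RCODE_NOERROR = 0
RCODE_NXDOMAIN = 3


def classify(response: bytes) -> str:
    """Classify a DNS response as ANSWER, NODATA, NXDOMAIN or ERROR."""
    if len(response) < 12:
        raise ValueError("short DNS header")
    # Header layout: ID, flags, QDCOUNT, ANCOUNT, NSCOUNT, ARCOUNT.
    _id, flags, _qd, ancount, _ns, _ar = struct.unpack("!6H", response[:12])
    rcode = flags & 0x000F  # RCODE is the low four bits of the flags word
    if rcode == RCODE_NXDOMAIN:
        return "NXDOMAIN"  # the name does not exist for any type
    if rcode == RCODE_NOERROR:
        # NODATA: the name exists, but has no records of the queried type.
        return "NODATA" if ancount == 0 else "ANSWER"
    return "ERROR"
```

A server that answers NXDOMAIN to an AAAA query for a name that has
an A record makes the name look nonexistent to a search loop that
relies on this distinction.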

>> > Kubernetes imposes a default search path with the cluster domain last, so:
>> > 
>> >   - local.prod.svc.whatever
>> >   - prod.svc.whatever
>> >   - svc.whatever
>> >   - yourdomain.com
>> 
>> Do you have a source for that?
>> 
>> Considering that glibc had for a long time a hard limit at six
>> entries, I find that approach rather surprising.  This leaves just
>> three domains in the end user's context.  That's not going to be
>> sufficient for many users.  Anyway …
>
> k8s isn't software you install as a package on your user system. It's
> cloud/container stuff, where it wouldn't make sense to add more search
> domains beyond the ones for your application.

From what I've heard, quite a few people use it to run older software
which interacts with the corporate network.  Even before, the six
domain limit was quite low for some deployments (and some sites
apparently stuck to NIS because of its server-side search list
processing).  All I'm saying is that it's a curious choice due to the
compatibility and performance issues involved.

> Yes. ndots>1 is utterly awful -- it greatly increases latency of every
> lookup, and has failure modes like what we're seeing now -- but the
> k8s folks designed stuff around it. Based on conversations when musl
> added search domains, I think there are people on the k8s side that
> realize this was a bad design choice and want to fix it, but that
> probably won't be easy to roll out to everyone and I have no idea if
> it's really going to happen.

They probably do not want to maintain an NSS module for that. 8-> 

In the past, in the container context, there have also been reports
about injecting the recursive resolver endpoint, so that it appears in
the container under a 127.0.0.0/8 address.  I don't know if that has
been solved completely.  I suspect a DNS transport over a UNIX domain
socket would help here.

For the search path problems, we would need a DNS protocol extension
which transfers search path expansion to the recursive resolver.  I'm
not sure if this is worth the effort, and how many glibc-based
distributions would be willing to backport a patch for that in a
relatively timely fashion.
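
To make the client-side cost concrete, here is a simplified sketch of
search path expansion, following the ordering described in
resolv.conf(5).  The function name is hypothetical, implementations
differ in the details, and the ndots value in the usage note is only
an illustration:

```python
from typing import List, Sequence


def candidates(name: str, search: Sequence[str], ndots: int = 1) -> List[str]:
    """List the fully qualified names a stub resolver would try, in order."""
    if name.endswith("."):
        # A trailing dot marks the name as absolute: no search list.
        return [name]
    absolute = name + "."
    expanded = [f"{name}.{domain}." for domain in search]
    if name.count(".") >= ndots:
        # Enough dots: try the name as given first, then the search list.
        return [absolute] + expanded
    # Too few dots: walk the whole search list before trying the name itself.
    return expanded + [absolute]
```

With the four search domains quoted above and ndots set to 5, a
lookup of example.org produces five candidate names, and the one that
actually exists is tried last.  Every candidate before it has to fail
(for each record type queried) before the real answer arrives, which
is the work a protocol extension would move to the recursive
resolver.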
