musl - Re: Resolver overhaul concepts

Message-ID: <20140504190427.GA30981@brightrain.aerifal.cx>
Date: Sun, 4 May 2014 15:04:27 -0400
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: Re: Resolver overhaul concepts

On Sun, May 04, 2014 at 06:56:59PM +0100, Laurent Bercot wrote:
> >In order to make this the most useful, though, musl should
> >support nameservers on non-default ports (is there a standard syntax
> >for this, or can we support one without breaking anything?)
> 
>  I'm not aware of any standardized way of running DNS server/caches
> on anything else than the default port; but I don't see why it
> should be necessary. Anyone can run a translator daemon on localhost:53.

Requiring port 53 is not much more prohibitive than resolv.conf and
nsswitch.conf, which are already impossible to override without root,
but it is slightly worse: it could be a problem if you also need to
run a public DNS server on the same machine.

Of course if we want to make it possible to override the config on a
per-process basis, requiring port 53 is a fairly serious limitation.
Note that per-process override would be very nice, if nothing else, as
a means of testing; the test framework could set up a custom
resolv.conf and generate malicious packets, packets that don't answer
the query, etc. to test that libc handles them right.

> >and it
> >would also be nice to be able to override resolv.conf on a per-process
> >basis (e.g. via the environment).
> 
>  djbdns and s6-dns do this. It makes sense for every resolver to do it
> too; the problem for a libc is, again, namespace pollution. I suggest
> having a compile-time option (yes...) that enables musl-specific
> extensions, among which some environment variables in the MUSL_*
> namespace. You'll have to accept that, or something similar, at some
> point.

The aim is to use existing mechanisms when available as this
facilitates dropping programs into existing, already-configured
systems. However, it may eventually be necessary to add further
options. This is an important topic to discuss, maybe soon.

> >There was a legacy file, /etc/host.conf, that allowed the order to be
> >changed, but changing the order seems rather useless to me. On the
> >other hand suppressing /etc/hosts could be useful in some instances.
> 
>  /etc/host.conf is actually used by libresolv itself. So with glibc,
> name resolution goes getaddrinfo() -> NSS -> /etc/nsswitch.conf ->
> /etc/hosts or DNS. If DNS: -> libresolv -> /etc/host.conf ->
> /etc/hosts or real DNS. That's the magic of glibc configuration, and
> of compatibility layers upon compatibility layers: /etc/hosts can
> actually be checked twice !
>  musl should not check /etc/host.conf itself: that is libresolv
> internals.

Well it's also (historically) a public libc configuration interface.
Since musl does not use libresolv (note: glibc doesn't really either,
except a seriously-forked version of parts of it) it would need to
check this file itself if we wanted to provide the same configuration
opportunity.

If we do want a way to turn off hosts processing and there's a
traditional way to do it via host.conf, I think supporting the
traditional way is better than inventing a new one.

> A libc-level switching mechanism would be /etc/nsswitch.conf
> if anything, but parsing /etc/nsswitch.conf is too complex if it's
> simply about setting 2 boolean flags, so I suggest doing otherwise.

Yes, I don't really want anything to do with nss in musl anyway. :-)

> >4 suffixes times 2 RR's (A and AAAA) makes for 8 queries, which takes
> >4k to store the responses and up to 2k to store the queries.  That's
> >not too bad, but along with the address lists, file buffers, and other
> >stuff getaddrinfo has around, it's getting the stack usage up to the
> >point where getaddrinfo would probably be the biggest stack user in
> >musl
> 
>  IIRC, the 512 byte limit is only true for UDP responses, and when you
> get a truncated UDP response you have to retry with TCP, and there the
> maximum length is much more than 512 bytes. Do you just extract the
> response from truncated queries ?

Yes, TCP is not supported at all. I don't see any reason one would
need TCP for a non-recursive resolver. In principle a response just
needs a few more bytes than the request, plus 4 bytes per address (or
16 for AAAA), and the request size is bounded just above 256 bytes
(the max hostname length).
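
To make the arithmetic concrete, here is a rough sketch (not musl's
code; all names are made up for illustration). It assumes each answer
record uses a 2-byte compressed name pointer, so a record costs 12
fixed bytes (name pointer, type, class, TTL, rdlength) plus the
address itself. Servers are not required to compress, and CNAME
chains take extra space, so treat the results as upper bounds.

```c
#include <stddef.h>

enum {
    DNS_HDR  = 12,   /* fixed DNS header size */
    UDP_MAX  = 512,  /* classic UDP message limit */
    RR_FIXED = 12    /* assumed: 2-byte name pointer + type, class, TTL, rdlength */
};

/* Size of a single-question query whose name takes wire_name_len
 * bytes on the wire (at most 255), plus 4 bytes for QTYPE and QCLASS. */
static size_t query_size(size_t wire_name_len)
{
    return DNS_HDR + wire_name_len + 4;
}

/* Upper bound on the number of address records with rdata_len bytes
 * of data (4 for A, 16 for AAAA) that fit after the echoed question. */
static size_t max_records(size_t wire_name_len, size_t rdata_len)
{
    size_t q = query_size(wire_name_len);
    if (q >= UDP_MAX)
        return 0;
    return (UDP_MAX - q) / (RR_FIXED + rdata_len);
}
```

With a maximum-length name, query_size(255) is 271 bytes, which is
where "just above 256" comes from, and 8 such queries come to a bit
over 2k. Under the same assumptions, the remaining space holds up to
15 A records or 8 AAAA records.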

> Anyway getaddrinfo() is authorized
> to use heap memory, and apart from not handling TCP at all I don't see
> how you can avoid it. This would, paradoxically, save memory most of
> the time, because the typical query is short, as well as the typical
> response; and you're already using heap memory at some point in the
> current getaddrinfo(), so I don't understand the math of putting
> everything in the stack.

Using the heap means more complexity, more failure cases, and a
dependency on free as opposed to just malloc.

Also, keeping everything on the stack avoids additional fragmentation. If you have lots of threads
making DNS queries and frequently allocating and freeing small blocks,
it's conceivable that the allocation timing ends up breaking up
contiguous space that another thread wants (e.g. the other thread has
called malloc then calls realloc after the address just past its first
allocation is taken). From a standpoint of not making a fragmented
mess of the heap, it's best not to make unnecessary use of allocated
storage.

> >For asynchronous use, you call it from its own thread (or use the
> >getaddrinfo_a extension, which we don't yet provide but which is easy
> >to provide on your own and which I may add to musl since it's
> >convenient and ultra-light).
> 
>  Making a new thread just to work around a lack of asynchronous
> interfaces is ugly. (Remember Netscape Navigator's "dns_helper"
> subprocesses ?) That's the very reason getaddrinfo_a() exists.

A round trip network query (even to localhost) takes several times as
long as creating a thread (and for TCP, typically takes hundreds of
times the resources of thread creation since the kernel allocates
massively bloated send/recv buffers).

With Netscape's forked dns_helper, there are other costs like error
handling complexity when the helper is wrongly killed, etc., but most
of those don't apply to threads.
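
As a minimal sketch of the thread approach (this is not
getaddrinfo_a, and struct lookup, lookup_start and lookup_finish are
hypothetical names), the caller fills in a request, starts a thread,
does other work, and joins later to collect the result:

```c
#include <netdb.h>
#include <pthread.h>
#include <sys/socket.h>
#include <sys/types.h>

/* One in-flight lookup; the caller owns this and keeps it alive
 * until lookup_finish returns. */
struct lookup {
    const char *host;
    const char *service;
    struct addrinfo hints;
    struct addrinfo *result;  /* set by the thread; free with freeaddrinfo */
    int err;                  /* getaddrinfo return value */
};

static void *lookup_thread(void *arg)
{
    struct lookup *l = arg;
    l->err = getaddrinfo(l->host, l->service, &l->hints, &l->result);
    return 0;
}

/* Start the lookup in its own thread. Returns 0 or a pthread_create error. */
static int lookup_start(pthread_t *t, struct lookup *l)
{
    l->result = 0;
    return pthread_create(t, 0, lookup_thread, l);
}

/* Wait for completion. Returns 0 or a pthread_join error;
 * the lookup outcome itself is in l->err and l->result. */
static int lookup_finish(pthread_t t, struct lookup *l)
{
    (void)l;
    return pthread_join(t, 0);
}
```

A caller that needs notification rather than a blocking join could
have the thread write to a pipe or signal a condition variable when
it is done; that part is left out here.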

> >There is presently a hard-coded failure timeout of 5 seconds and a
> >retry time of 1 second. It would be nice to honor settings from
> >resolv.conf to tweak these.
> 
>  And the RES_TIMEOUT and RES_DFLRETRY environment variables, then, if
> you're going for libresolv compatibility.

I'm not sure if they have the same semantics, but it's doubtful that
anyone cares if they're exactly the same, so we could probably reuse
them. resolv.conf also has a mechanism for setting these.
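
The resolv.conf mechanism is the "options" line, with timeout:n and
attempts:n. A minimal sketch of picking those two values out of a
line follows. struct res_opts and parse_options_line are made-up
names, the upper bounds are just this sketch's own sanity limits, and
how these values would map onto the current failure timeout and retry
time is left open.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

struct res_opts {
    unsigned timeout;   /* seconds */
    unsigned attempts;
};

/* Clamp a parsed value to an upper bound (assumed limits, see above). */
static unsigned cap(unsigned long v, unsigned max)
{
    return v > max ? max : (unsigned)v;
}

/* If line is an "options" line, update *o from any timeout:n and
 * attempts:n tokens on it. Other lines and unknown tokens are ignored. */
static void parse_options_line(const char *line, struct res_opts *o)
{
    const char *p = line;

    if (strncmp(p, "options", 7) || !isspace((unsigned char)p[7]))
        return;
    p += 7;

    while (*p) {
        while (isspace((unsigned char)*p))
            p++;
        if (!strncmp(p, "timeout:", 8) && isdigit((unsigned char)p[8]))
            o->timeout = cap(strtoul(p + 8, 0, 10), 30);
        else if (!strncmp(p, "attempts:", 9) && isdigit((unsigned char)p[9]))
            o->attempts = cap(strtoul(p + 9, 0, 10), 5);
        /* skip to the end of the current token */
        while (*p && !isspace((unsigned char)*p))
            p++;
    }
}
```

The caller initializes the struct with its defaults and feeds it each
line of the file, so a missing or malformed option simply leaves the
default in place.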

> >Using a full-fledged DNS library to provide getaddrinfo is akin to
> >using GMP to provide printf...
> 
>  How so ? All the complex machinery of parsing the DNS protocol,
> parsing /etc/resolv.conf, talking from/to the network (in an asynchronous
> manner when "search" is implemented) with a retry policy, both UDP and TCP,
> and so on, has to be present already. I'm interested in learning the
> ninja coding techniques that allow you to write getaddrinfo without all
> that ! :)

See my printf analogy. printf needs decimal bignums, but only needs
two operations on them: << and >>. A general bignum implementation
that can do arbitrary operations on them is a lot more complex and
costly than an implementation that just does two operations in-place.
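
To illustrate what "just two operations in-place" can look like, here
is a small sketch. It is not musl's printf code; the representation
(an array of base-10^9 digits, least significant first) and the
function names are assumptions made for the example.

```c
#include <stddef.h>
#include <stdint.h>

#define BASE 1000000000u  /* each array element holds 9 decimal digits */

/* Multiply the number by 2^sh in place (sh <= 29).
 * x[0] is the least significant element. Returns the carry out. */
static uint32_t dec_shl(uint32_t *x, size_t n, unsigned sh)
{
    uint32_t carry = 0;
    for (size_t i = 0; i < n; i++) {
        uint64_t v = ((uint64_t)x[i] << sh) + carry;
        x[i] = (uint32_t)(v % BASE);
        carry = (uint32_t)(v / BASE);
    }
    return carry;
}

/* Divide the number by 2^sh in place (sh <= 29).
 * Returns the remainder, i.e. the bits shifted out. */
static uint32_t dec_shr(uint32_t *x, size_t n, unsigned sh)
{
    uint32_t mask = ((uint32_t)1 << sh) - 1;
    uint32_t carry = 0;
    for (size_t i = n; i-- > 0; ) {
        uint64_t v = (uint64_t)carry * BASE + x[i];
        x[i] = (uint32_t)(v >> sh);
        carry = (uint32_t)(v & mask);
    }
    return carry;
}
```

Because the digits are already decimal, printing the result needs no
division of the whole number, and there is no need for general
multiply, divide, add or allocation routines.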

Similarly, getaddrinfo needs DNS, but it only needs generation of
fixed-form queries, minimal data extraction from result packets, and
some degree of validation.
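
For a sense of scale, generating a fixed-form query is roughly the
following. This is a sketch, not the code in musl; build_query and
its parameters are hypothetical, and it always emits a single
question with class IN and the recursion-desired bit set.

```c
#include <stddef.h>
#include <string.h>

/* Write a one-question DNS query for name with record type rr
 * (1 = A, 28 = AAAA) into buf. Returns the packet length, or -1 if
 * the name is malformed or the buffer is too small. */
static int build_query(unsigned char *buf, size_t buflen,
                       const char *name, int rr, unsigned id)
{
    size_t n = strlen(name);
    size_t pos = 12;

    if (n && name[n - 1] == '.')
        n--;                          /* allow one trailing dot */
    if (n == 0 || n > 253 || name[n - 1] == '.')
        return -1;
    if (buflen < 12 + n + 2 + 4)      /* header + encoded name + type/class */
        return -1;

    memset(buf, 0, 12);
    buf[0] = (unsigned char)(id >> 8);
    buf[1] = (unsigned char)id;
    buf[2] = 1;                       /* RD: recursion desired */
    buf[5] = 1;                       /* QDCOUNT = 1 */

    /* Encode each dot-separated label as a length byte plus its bytes. */
    for (size_t i = 0; i < n; ) {
        size_t j = i;
        while (j < n && name[j] != '.')
            j++;
        if (j == i || j - i > 63)
            return -1;                /* empty or oversized label */
        buf[pos++] = (unsigned char)(j - i);
        memcpy(buf + pos, name + i, j - i);
        pos += j - i;
        i = j + 1;
    }
    buf[pos++] = 0;                   /* root label terminates the name */
    buf[pos++] = (unsigned char)(rr >> 8);
    buf[pos++] = (unsigned char)rr;
    buf[pos++] = 0;
    buf[pos++] = 1;                   /* class IN */
    return (int)pos;
}
```

Extracting addresses from the reply takes a similar amount of code:
check the ID and response code, skip the question, then walk the
answer records while bounds-checking every length field.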

FYI the current code is ~4k binary and the overhaul is not expected to
increase that much. I really doubt you could achieve that with general
DNS library code.

Rich