Message-ID: <53667F6B.8040808@skarnet.org>
Date: Sun, 04 May 2014 18:56:59 +0100
From: Laurent Bercot <ska-dietlibc@...rnet.org>
To: musl@...ts.openwall.com
Subject: Re: Resolver overhaul concepts

On 04/05/2014 17:24, Rich Felker wrote:
> The policy for supporting something like nss has always been that musl
> implements a perfectly reasonable public protocol for providing any
> back-end you want: the DNS protocol. You can run a local daemon
> speaking DNS and serving names from any backend you like, and this is
> the correct way to achieve it (rather than linking random buggy,
> likely-not-namespace-clean libraries into the application's address
> space).

  That makes sense.


> In order to make this the most useful, though, musl should
> support nameservers on non-default ports (is there a standard syntax
> for this, or can we support one without breaking anything?)

  I'm not aware of any standardized way of running DNS servers/caches
on anything other than the default port, but I don't see why it
should be necessary. Anyone can run a translator daemon on localhost:53.


> and it
> would also be nice to be able to override resolv.conf on a per-process
> basis (e.g. via the environment).

  djbdns and s6-dns do this. It makes sense for every resolver to do it
too; the problem for a libc is, again, namespace pollution. I suggest
having a compile-time option (yes...) that enables musl-specific
extensions, among which some environment variables in the MUSL_*
namespace. You'll have to accept that, or something similar, at some
point.
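
  To make the per-process override concrete, here is a minimal sketch
of the lookup step. The variable name MUSL_RESOLV_CONF and the helper
resolv_conf_path() are hypothetical, made up for illustration; nothing
like them exists in musl as far as this thread says. A real libc would
presumably also ignore the variable in set-uid programs, which this
sketch does not attempt.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical variable name in the MUSL_* namespace (an assumption,
 * not an existing musl feature). */
#define RESOLV_CONF_ENV "MUSL_RESOLV_CONF"

/* Return the resolv.conf path this process should use: the value of
 * the environment variable if it is set to an absolute path, otherwise
 * the caller-supplied fallback (typically "/etc/resolv.conf"). */
const char *resolv_conf_path(const char *fallback)
{
    assert(fallback != NULL);
    const char *p = getenv(RESOLV_CONF_ENV);
    /* Unset, empty, or relative values are ignored. */
    if (p == NULL || p[0] != '/')
        return fallback;
    return p;
}
```

  The resolver would then open resolv_conf_path("/etc/resolv.conf")
instead of a hard-coded path, so a process started with the variable
set reads its own configuration.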


> There was a legacy file, /etc/host.conf, that allowed the order to be
> changed, but changing the order seems rather useless to me. On the
> other hand suppressing /etc/hosts could be useful in some instances.

  /etc/host.conf is actually used by libresolv itself. So with glibc,
name resolution goes getaddrinfo() -> NSS -> /etc/nsswitch.conf ->
/etc/hosts or DNS. If DNS: -> libresolv -> /etc/host.conf ->
/etc/hosts or real DNS. That's the magic of glibc configuration, and
of compatibility layers upon compatibility layers: /etc/hosts can
actually be checked twice!
  musl should not check /etc/host.conf itself: that is libresolv
internals. A libc-level switching mechanism would be /etc/nsswitch.conf
if anything, but parsing /etc/nsswitch.conf is too complex if it's
simply about setting 2 boolean flags, so I suggest doing otherwise.


> 4 suffixes times 2 RR's (A and AAAA) makes for 8 queries, which takes
> 4k to store the responses and up to 2k to store the queries.  That's
> not too bad, but along with the address lists, file buffers, and other
> stuff getaddrinfo has around, it's getting the stack usage up to the
> point where getaddrinfo would probably be the biggest stack user in
> musl

  IIRC, the 512-byte limit only applies to UDP responses: when you
get a truncated UDP response you have to retry over TCP, and there the
maximum length is much more than 512 bytes. Do you just extract the
answers from truncated responses? Anyway, getaddrinfo() is allowed
to use heap memory, and short of not handling TCP at all I don't see
how you can avoid it. This would, paradoxically, save memory most of
the time, because the typical query is short, as is the typical
response; and you're already using heap memory at some point in the
current getaddrinfo(), so I don't understand the math of putting
everything on the stack.
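
  For reference, detecting truncation is a one-bit test on the DNS
header: the TC flag is bit 0x02 of the third header byte. A minimal
sketch follows; the function name dns_response_truncated() is made up
for illustration and is not taken from musl.

```c
#include <assert.h>
#include <stddef.h>

#define DNS_HEADER_LEN 12   /* fixed DNS header size in bytes */
#define DNS_FLAG_TC    0x02 /* TC (truncated) bit within header byte 2 */

/* Inspect a raw DNS message received over UDP.
 * Returns 1 if the TC bit is set (the caller should retry over TCP),
 * 0 if it is clear, and -1 if the buffer is too short to hold a header. */
int dns_response_truncated(const unsigned char *msg, size_t len)
{
    assert(msg != NULL);
    if (len < DNS_HEADER_LEN)
        return -1;
    return (msg[2] & DNS_FLAG_TC) != 0;
}
```

  A resolver that never speaks TCP can still use whatever answers fit
in the truncated packet, but it has to decide that explicitly when this
returns 1.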


> For asynchronous use, you call it from its own thread (or use the
> getaddrinfo_a extension, which we don't yet provide but which is easy
> to provide on your own and which I may add to musl since it's
> convenient and ultra-light).

  Making a new thread just to work around a lack of asynchronous
interfaces is ugly. (Remember Netscape Navigator's "dns_helper"
subprocesses?) That's the very reason getaddrinfo_a() exists.


> There is presently a hard-coded failure timeout of 5 seconds and a
> retry time of 1 second. It would be nice to honor settings from
> resolv.conf to tweak these.

  And the RES_TIMEOUT and RES_DFLRETRY environment variables, then, if
you're going for libresolv compatibility.
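
  As a sketch of what honoring such settings involves: resolv.conf's
"options" line takes space-separated tokens such as "timeout:2
attempts:3", and to my knowledge glibc accepts the same token syntax
from its RES_OPTIONS environment variable. The struct and function
names below are hypothetical, and the upper bounds (30 and 5) are
assumed sanity limits for this sketch, not values taken from musl.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for the two tunables discussed above. */
struct res_tuning {
    unsigned timeout;   /* seconds to wait per attempt */
    unsigned attempts;  /* number of tries before giving up */
};

/* If the token tok[0..toklen) is "<name><digits>", store the value
 * (clamped to max) in *out and return 1; otherwise return 0. */
static int match_opt(const char *tok, size_t toklen, const char *name,
                     unsigned max, unsigned *out)
{
    size_t n = strlen(name);
    if (toklen <= n || strncmp(tok, name, n) != 0)
        return 0;
    if (tok[n] < '0' || tok[n] > '9')
        return 0;                       /* reject signs and whitespace */
    char *end;
    unsigned long v = strtoul(tok + n, &end, 10);
    if (end != tok + toklen)
        return 0;                       /* trailing junk after the number */
    *out = v > max ? max : (unsigned)v;
    return 1;
}

/* Apply "timeout:N" and "attempts:N" tokens from s to *t.
 * Unknown or malformed tokens are skipped and leave *t unchanged. */
void parse_res_options(const char *s, struct res_tuning *t)
{
    assert(s != NULL && t != NULL);
    while (*s) {
        s += strspn(s, " \t");
        size_t len = strcspn(s, " \t");
        if (len == 0)
            break;
        if (!match_opt(s, len, "timeout:", 30, &t->timeout))
            match_opt(s, len, "attempts:", 5, &t->attempts);
        s += len;
    }
}
```

  The caller initializes the struct with its built-in defaults, applies
the options text from resolv.conf, and then applies any environment
override, so later sources win.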


> Using a full-fledged DNS library to provide getaddrinfo is akin to
> using GMP to provide printf...

  How so? All the complex machinery of parsing the DNS protocol,
parsing /etc/resolv.conf, talking to and from the network (asynchronously,
once "search" is implemented) with a retry policy, over both UDP and TCP,
and so on, has to be present already. I'm interested in learning the
ninja coding techniques that allow you to write getaddrinfo without all
that! :)

-- 
  Laurent
