musl - Re: TCP support in the stub resolver

Message-ID: <20200501220238.GP21576@brightrain.aerifal.cx>
Date: Fri, 1 May 2020 18:02:38 -0400
From: Rich Felker <dalias@...c.org>
To: Florian Weimer <fw@...eb.enyo.de>
Cc: musl@...ts.openwall.com
Subject: Re: TCP support in the stub resolver

On Tue, Apr 21, 2020 at 07:26:08PM +0200, Florian Weimer wrote:
> * Rich Felker:
> 
> >> I'm excited that Fedora plans to add a local caching resolver by
> >> default.  It will help with a lot of these issues.
> >
> > That's great news! Will it be DNSSEC-enforcing by default?
> 
> No.  It is currently not even DNSSEC-aware, in the sense that you
> can't get any DNSSEC data from it.  That's the sad part.

That's really disappointing. Why? Both systemd-resolved and dnsmasq,
the two reasonable (well, reasonable for distros using systemd already
in the systemd-resolved case :) options for this, support DNSSEC fully
as I understand it. Is it just being turned off by default because of
risk of breaking things, or is some other implementation that lacks
DNSSEC being used?

> >> > BTW, am I mistaken or can TCP fastopen make it so you can get a DNS
> >> > reply with no additional round-trips? (query in the payload with
> >> > fastopen, response sent immediately after SYN-ACK before receiving ACK
> >> > from client, and nobody has to wait for connection to be closed) Of
> >> > course there are problems with fastopen that lead to it often being
> >> > disabled so it's not a full substitute for UDP.
> >> 
> >> There's no handshake to enable it, so it would have to be an
> >> /etc/resolv.conf setting.  It's also not clear how you would perform
> >> auto-detection that works across arbitrary middleboxen.  I don't think
> >> it's useful for an in-process stub resolver.
> >
> > The kernel automatically does it,
> 
> Surely not, it causes too many interoperability issues for that.  It's
> also difficult to fit it into the BSD sockets API.  As far as I can
> see, you have to use sendmsg or sendto with MSG_FASTOPEN instead of a
> connect call to establish the connection.
> 
> (When the kernel says that it's enabled by default, it means that you
> can use MSG_FASTOPEN with sysctl tweaks.)

What I mean is that, if you use MSG_FASTOPEN on a kernel new enough to
understand it, I think it makes a normal TCP connection and sends the
data if fastopen is not enabled or not supported by the remote host,
but uses fastopen as long as it's enabled and supported. In this sense
it's automatic. But of course we'd have to fall back explicitly anyway
if it's not supported in order to maintain compatibility with older
kernels.
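
A rough sketch of what that explicit fallback might look like, for
illustration only and not code from musl. The helper names
(dns_tcp_frame, send_fastopen_or_connect) are made up here, and the set
of errno values treated as "fastopen not usable" is an assumption about
Linux behavior rather than something verified across kernel versions.
The 2-byte length prefix is the usual DNS-over-TCP framing.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

#ifndef MSG_FASTOPEN
#define MSG_FASTOPEN 0x20000000 /* Linux value; assumed for old headers */
#endif

/* Prefix a DNS message with the 2-byte big-endian length used on TCP.
 * Returns the framed length, or 0 if the message does not fit. */
static size_t dns_tcp_frame(unsigned char *out, size_t outsize,
                            const unsigned char *msg, size_t msglen)
{
	if (msglen > 0xffff || outsize < msglen + 2) return 0;
	out[0] = msglen >> 8;
	out[1] = msglen & 0xff;
	memcpy(out + 2, msg, msglen);
	return msglen + 2;
}

/* Hypothetical helper: ask the kernel to carry the query in the SYN,
 * and fall back to connect()+send() if the flag is rejected or unknown.
 * fd must be an unconnected, blocking TCP socket.
 * Returns the number of bytes queued (may be short), or -1 on error. */
static ssize_t send_fastopen_or_connect(int fd, const void *buf, size_t len,
                                        const struct sockaddr *sa,
                                        socklen_t salen)
{
	ssize_t r = sendto(fd, buf, len, MSG_FASTOPEN | MSG_NOSIGNAL, sa, salen);
	if (r >= 0) return r;
	/* Assumed "no fastopen here" errors: disabled by sysctl, or a
	 * kernel that treats this as a send on an unconnected socket. */
	if (errno != EOPNOTSUPP && errno != EPIPE && errno != ENOTCONN)
		return -1;
	if (connect(fd, sa, salen) < 0) return -1;
	return send(fd, buf, len, MSG_NOSIGNAL);
}
```

The caller would frame the query once and hand the same buffer to
either path, so the fallback costs one extra syscall and nothing else.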

> >> Above 4096 bytes, pretty much all recursive resolvers will send TC
> >> responses even if the client offers a larger buffer size.  This means
> >> for correctness, you cannot do away with TCP support.
> >
> > In that case doing EDNS at all seems a lot less useful. Fragmentation
> > is always a possibility above min MTU (essentially same limit as
> > original UDP DNS) and the large responses are almost surely things you
> > do want to avoid forgery on, which leads me back around to thinking
> > that if you want them you really really need to be running a local
> > DNSSEC validating nameserver and then can just use-vc...
> 
> Why use use-vc at all?  Some software *will* break because it assumes
> that certain libc calls do not keep open some random file descriptor.

Does use-vc do that (keep the fd open) in glibc? It doesn't seem to be
documented that way, just as forcing use of TCP, and my intent was not
to keep any fd open (since you need a separate fd per query anyway to
do them in parallel or in case the server closes the socket after one
reply).

> >> Some implementations have used a longer sequence of transports: DNS
> >> over UDP, EDNS over UDP, and finally TCP.  That avoids EDNS
> >> pseudo-negotiation until it is actually needed.  I'm not aware of any
> >> stub resolvers doing that, though.
> >
> > Yeah, each fallback is just going to increase total latency though,
> > very badly if they're all remote.
> >
> > Actually, the current musl approach adapted to this would be to just
> > do them all concurrently: DNS/UDP, EDNS/UDP, and DNS/TCP, and accept
> > the first answer that's not truncated or broken server
> > (servfail/formerr/notimp), basically same as we do now but with more
> > choices. But that's getting heavier on unwanted network traffic...
> 
> Aggressive parallel queries tend to break middleboxes.  Even A/AAAA is
> problematic.  Good interoperability and good performance are difficult
> to obtain, particularly from short-lived processes.

Yes, and currently we do them anyway and just don't care. It's
possible that there are users who are just working around this by not
configuring IPv6 and only using apps that call gethostbyname (IPv4
only) or use AI_ADDRCONFIG, but the latter was not supported at all in
musl until fairly recently, and it only takes effect if you _fully_
disable IPv6 (including on lo and link-local addrs), so I'd think
someone would complain if it were a real problem.
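
For reference, the acceptance rule from the quoted text above (take the
first answer that's not truncated and not servfail/formerr/notimp)
comes down to a check on the DNS header. This is only a sketch of that
rule, not the actual musl code; dns_answer_usable is a name made up for
the example, and it deliberately ignores everything past the 12-byte
header.

```c
#include <stddef.h>

/* RCODE values from the DNS header that mark a broken or unhelpful server */
enum { RCODE_FORMERR = 1, RCODE_SERVFAIL = 2, RCODE_NOTIMP = 4 };

/* Return 1 if the reply is worth accepting as "the" answer, 0 if the
 * caller should keep waiting for a reply from another transport. */
static int dns_answer_usable(const unsigned char *r, size_t len)
{
	if (len < 12) return 0;        /* shorter than a DNS header */
	if (!(r[2] & 0x80)) return 0;  /* QR clear: not a response */
	if (r[2] & 0x02) return 0;     /* TC set: truncated */
	switch (r[3] & 0x0f) {         /* low 4 bits of byte 3 are RCODE */
	case RCODE_FORMERR:
	case RCODE_SERVFAIL:
	case RCODE_NOTIMP:
		return 0;
	}
	return 1;                      /* includes NXDOMAIN, a valid answer */
}
```

With concurrent DNS/UDP, EDNS/UDP and DNS/TCP queries, the same check
would be applied to whichever reply arrives first.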

If sending queries with AD bit set is less of a compatibility issue
than parallel queries, I think we can probably just do it
unconditionally. And if anyone really _really_ wants to run in an
environment with broken nameservers, iptables should be able to
reject, redirect, or rewrite packets as needed to get something the
broken server can handle...
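
To make "AD bit set" concrete: it is one flag bit in the query header,
next to RD. A minimal sketch, with dns_query_header and the DNS_FLAG_*
names invented for this example (they are not musl identifiers):

```c
#include <stdint.h>
#include <string.h>

#define DNS_FLAG_RD 0x0100 /* recursion desired */
#define DNS_FLAG_AD 0x0020 /* authentic data */

/* Fill in a 12-byte header for a single-question query. When want_ad
 * is nonzero the AD bit is set, telling the server that the client
 * understands the AD bit in the response. */
static void dns_query_header(unsigned char h[12], uint16_t id, int want_ad)
{
	uint16_t flags = DNS_FLAG_RD | (want_ad ? DNS_FLAG_AD : 0);
	memset(h, 0, 12);
	h[0] = id >> 8;
	h[1] = id & 0xff;
	h[2] = flags >> 8;
	h[3] = flags & 0xff;
	h[5] = 1; /* QDCOUNT = 1 */
}
```

Doing it unconditionally would just mean always passing want_ad = 1, so
there is no extra query and no extra round trip involved.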

Rich