Date: Fri, 8 Mar 2024 15:44:58 -0800
From: David Schinazi <dschinazi.ietf@...il.com>
To: Rich Felker <dalias@...c.org>
Cc: musl@...ts.openwall.com
Subject: Re: mDNS in musl

On Fri, Mar 8, 2024 at 2:54 PM Rich Felker <dalias@...c.org> wrote:

> On Fri, Mar 08, 2024 at 01:55:18PM -0800, David Schinazi wrote:
> > On Fri, Mar 8, 2024 at 12:31 PM Rich Felker <dalias@...c.org> wrote:
> >
> > > On Fri, Mar 08, 2024 at 11:15:52AM -0800, David Schinazi wrote:
> > > > On Fri, Mar 8, 2024 at 5:30 AM Rich Felker <dalias@...c.org> wrote:
> > > >
> > > > > On Thu, Mar 07, 2024 at 08:47:20PM -0800, David Schinazi wrote:
> > > > > > > Thanks. How would you feel about the following potential
> > > > > > > configuration design?
> > > > > > > * Add a new configuration option "send_mdns_unicast"
> > > > > > > * When true, use the current behavior
> > > > > > > * When false, send the query on all non-loopback non-p2p
> > > > > > >   interfaces
> > > > > > > * Have send_mdns_unicast default to false
> > > > > > >
> > > > > > > I was thinking through how to pick interfaces, looked up what
> > > > > > > other mDNS libraries do, and pretty much all of them don't allow
> > > > > > > configuring interfaces, whereas Avahi exposes allow-interfaces
> > > > > > > and deny-interfaces. I'm leaning towards not making this
> > > > > > > configurable to reduce complexity. I think that anyone interested
> > > > > > > in that level of config is probably using Avahi anyway.
> > > > > > >
> > > > > > > Additionally this design has two nice properties: the default
> > > > > > > behavior is RFC-compliant, and it means that for my use-case I
> > > > > > > don't need to change the config file, which was a big part of my
> > > > > > > motivation for doing this inside of musl in the first place :-)
> > > > >
> > > > > > As discussed in this thread, I don't think so. The biggest
> > > > > > problems I initially brought up were increased information
> > > > > > leakage in the default configuration and inability to control
> > > > > > where the traffic goes when you do want it on. The above proposal
> > > > > > just reverts to the initial, except for providing a way to
> > > > > > opt-out.
> > > > > >
> > > > > > For the most part, mDNS is very much a "home user, personal device
> > > > > > on trusted network" thing. Not only do you not want it to default
> > > > > > on because a lot of systems will be network servers on networks
> > > > > > where it's not meaningful (and can be a weakness that aids
> > > > > > attackers in lateral movement), but you also don't want it on when
> > > > > > connected to public wifi. For example if you have an open browser
> > > > > > tab to http://mything.local, and migrate to an untrusted network
> > > > > > (with your laptop, tablet, phone, whatever), now your browser will
> > > > > > be leaking private data (likely at least session auth tokens,
> > > > > > maybe more) to whoever answers the mDNS query for mything.local.
> > > >
> > > > > That's not quite right. The security properties of mDNS and DNS
> > > > > are the same. DNS is inherently insecure, regardless of unicast vs
> > > > > multicast. If I'm on a coffee shop Wi-Fi, all my DNS queries are
> > > > > sent in the clear to whatever IP address the DHCP server gave me.
> > >
> > > That's not the case. Connections to non-mDNS hosts are authenticated
> > > by TLS with certificates issued on the basis of ownership of the
> > > domain name. That's not possible with mDNS hostnames, so they'll
> > > either be no-TLS or self-signed certs. That's why the above attack is
> > > possible. It was also possible with normal DNS in the bad old days of
> > > http://, but that time is long gone.
> >
> > Apologies for being pedantic, but that's not true. The ability to get TLS
> > certificates for a domain name that you own is a property of the WebPKI,
> > not a property of TLS. What you wrote is true, but only in the context of
> > a Web browser with an unmodified root certificate store. The features I
> > mentioned above don't use the WebPKI, they have a separate root of trust.
> > For example, some of those Apple features exchange TLS certificates via
> > an out-of-band mechanism such as Apple trusted servers. Another example
> > is the Apple Watch: when you first pair a new Apple Watch with an iPhone,
> > they exchange ed25519 public keys. Then any time the watch wants to
> > transfer a large file to/from the phone, it'll connect to Wi-Fi, use mDNS
> > to find the phone, and set up an IKEv2/IPsec tunnel that then protects
> > the exchange. It's resilient to any attacks at the mDNS level.
> >
> > You're absolutely right that the security of Web requests using local
> > connectivity is completely broken by the lack of WebPKI certificates for
> > those. But sending the DNS query over multicast as opposed to unencrypted
> > unicast to an untrusted DNS server doesn't change the security
> > properties. In your example above, the open tab to http://mything.local
> > will send that query to the recursive resolver - and if that's the one
> > received by DHCP then that server can reply with its own address and
> > receive your auth tokens. One potential fix here is to configure your
> > resolv.conf to localhost and then apply policy in that local resolver.
> > But in practice, application developers don't rely on security at that
> > layer, they assume that DNS is unsafe and implement encryption in
> > userspace with some out of band trust mechanism.
>
> My specific example was http://mything.local in a web browser, which
> is the way you access lots of mDNS-enabled things in the absence of a
> specific software ecosystem like Apple's. Since we're talking about
> musl which would be running on Linux or a Linux-syscall-compatible
> environment, without Apple apps, I think that's the main way anyone
> would be using hypothetical mDNS support. And indeed this is the way
> you access many printers, 3D printers, IP cameras, etc.
>

I have multiple services at home that I communicate with using HTTP and
mDNS. But they're built knowing that unencrypted HTTP is unsafe. For
example, one of my servers doesn't have any authentication - my browser
just uses unauthenticated GETs, POSTs and WebSockets. If I leave the tab
open and go to a coffee shop, my browser might send that GET to a server I
don't trust but that request won't carry any sensitive information. Another
of my servers uses TLS with self-signed certs, so every time I want to
communicate with it, I need to click through my browser's "this is unsafe"
interstitial to get to the page. If I switch networks, the browser will
send me the warning again and I'll know not to click through when I'm not
at home. In both of those cases, the security is handled (or not handled at
all) at the application layer.

> Maybe at some point we'll have a good framework for authenticating
> this kind of usage with certificates (probably certificate pinning on
> first use, with good UX, is the only easy solution),


Trust on first use works, or even better there are emerging solutions that
leverage codes printed on devices and PAKEs so that a device on the
untrusted network can't even hijack the first connection without having
access to that code. The leading one for home automation is Matter [1].
Coincidentally, it also leverages mDNS for discovery, and doesn't rely on
security at the DNS level.

[1] https://csa-iot.org/all-solutions/matter/

> but at present,
> mDNS devices on the .local zone get accessed with plain http:// all
> the time, and this means it's unsafe to do mDNS on
> public/untrusted/hostile networks.
>

The notion of something being "unsafe" (and security in general) is
predicated on the existence of a threat model. It's unsafe to use
unencrypted HTTP to your bank when your threat model includes someone on
the coffee shop Wi-Fi trying to steal your bank credentials. Conversely,
it's safe for me to print to this coffee shop printer if my threat model
assumes that I'm ok with the owner of the coffee shop seeing my document.
Another example is Chromecast which also uses mDNS: from Chrome on a Linux
laptop, I can cast YouTube videos to the TV in this coffee shop. That's
safe because I trust the network with the YouTube link I'm telling the TV
to play. mDNS is not in and of itself safe or unsafe. It converts
names into addresses, and what you do with those addresses can potentially
be unsafe.

That doesn't mean that every single use of mDNS on untrusted networks is
safe. If someone builds a web page that sends valuable secrets over
unencrypted HTTP to a .local name, then you have a security problem. But my
point is that this security problem needs to be solved at the application
layer and not at the DNS layer. That said, I agree that having a way to
disable mDNS on a machine is a good idea, because there probably are users
out there that are stuck with applications that for some reason decided to
rely on DNS being secure.

In terms of the tradeoff between usability and security, to me the right
default is to enable mDNS on all interfaces, as Apple and Avahi do.
But this tradeoff is between two metrics that can't be quantified one
against the other for all possible uses, so I totally understand if your
opinion for musl is that the tradeoff there is different than in other
situations. You know your users better than I do.

> > > > So the stack has to deal with
> > > > the fact that any DNS response can be spoofed.
> > >
> > > That's also not possible with DNSSEC, but only helps if you're
> > > validating it.
> > >
> > > > The most widely used solution is TLS: a successful DNS hijack can
> > > > prevent you from accessing a TLS service, but can't impersonate it.
> > > > That's true of both mDNS and regular unicast DNS. As an example, all
> > > > Apple devices have mDNS enabled on all interfaces, with no security
> > > > impact - the features that rely on it (AirDrop, AirPlay, contact
> > > > sharing, etc) all use mTLS to ensure they're talking to the right
> > > > device regardless of the correctness of DNS. (Printing remains
> > > > completely insecure, but that's also independent of DNS - your
> > > > coffee shop Wi-Fi access point can attack you at the IP layer too).
> > > > One might think that DNSSEC could save us here, but it doesn't.
> > > > DNSSEC was unfortunately built with a fundamental design flaw: it
> > > > requires you to trust all resolvers on the path, including recursive
> > > > resolvers. So even if you ask for DNSSEC validation of the DNS
> > > > records for www.example.com, your coffee shop DNS recursive resolver
> > > > can tell you "I checked, and example.com does not support DNSSEC,
> > > > here's the IP address for www.example.com though" and you have to
> > > > accept it.
> > >
> > > This is a completely false but somehow persistent myth about DNSSEC.
> > > You cannot lie that a zone does not support DNSSEC. The only way to
> > > claim a zone does not support DNSSEC is with a signature chain from
> > > the DNS root proving the nonexistence of the DS records for the
> > > delegation. Without that, the reply is BOGUS and will be ignored as if
> > > there was no reply at all.
> >
> > I was talking about the case where the recursive resolver does the
> > validation, which is what's deployed in practice today. What you wrote is
> > only true if the client does the DNSSEC validation itself. Most clients
> > don't do that today, because too many domains are just misconfigured and
> > broken. Eric Rescorla (the editor of the TLS RFCs) wrote a great blog
> > post about this:
>
> The consensus of folks in the stub resolver space (at least glibc+musl
> and I would assume the BSDs as well) is that the way you do DNSSEC
> validation is by having a validating caching proxy or full recursive
> resolver on localhost. Doing validation in the stub resolver is not
> viable because it may be static-linked, where it would not be able to
> be updated with new algorithms, root-of-trust, etc.


No disagreement there. By "client" I meant the client device as a whole,
and by "recursive resolver" I meant "the DNS server you got from DHCP".
Running a DNSSEC-validating recursive resolver on the client device falls
into what I meant by "if the client does the DNSSEC validation itself".
Sorry for being unclear.


> This is one of the
> reasons our go-to response for new functionality wanted in the stub
> resolver is "do it in a nameserver on localhost" -- because you
> already need that to do DNSSEC.
>

That makes sense. I wasn't working with the assumption that DNSSEC was a
requirement.

> It really did not sound like you were talking about trusting the
> recursive, though. You called it a "fundamental design flaw", which it
> is not, and said it requires you to "trust all resolvers on the path",
> which it does not. It only requires you to trust the immediate
> resolver you are interacting with (and not even that if you put the
> validation in the stub resolver, but there are good reasons not to do
> that, as above). A pure-proxying server that relies on upstream
> recursives can do full DNSSEC validation. Dnsmasq is a canonical
> example. I believe systemd-resolved also does it.
>

That's fair, and I apologize for overstating my point. I absolutely agree
that if you run a validating recursive resolver locally, then the attack I
described isn't possible. When DNSSEC was designed, it was intended to be
deployed in the model I described, where the validating recursive resolver
is not on-device. And that's how it is still mostly deployed today because
almost all general-purpose client devices do not validate locally. My
mental model is very focused around consumer devices where folks buy them
and use them without ever changing default settings. That might be a
portion of musl users, but you clearly also have advanced users that do
things differently.

> > > > > Regarding untrusted networks, one thing I hadn't considered yet is
> > > > > that a network configurator probably needs a way to setup
> > > > > resolv.conf such that .local queries temp-fail rather than
> > > > > perma-fail (as they would if you just sent the query to public dns)
> > > > > to use during certain race windows while switching networks. IOW
> > > > > "send .local queries to configured nameservers" and "treat .local
> > > > > specially but with an empty list of interfaces to send to" should
> > > > > be distinct configurations.
> > > >
> > > > Yeah, caching negative results in DNS has been a tricky thing from
> > > > the start. You probably could hack something by installing a fake
> > > > SOA record for .local. in your recursive resolver running on
> > > > localhost. But the RFC-compliant answer is for stub resolvers to
> > > > treat it specially and know that those often never get an answer
> > > > (musl doesn't cache DNS results so in a way we're avoiding this
> > > > problem altogether at the stub resolver).
> > >
> > > The problem here is not about caching, just about clients using a
> > > response. You want a task (like a browser with open tabs) trying to
> > > contact the site to get a tempfail rather than NxDomain which might
> > > make it stop trying. But you probably want NxDomain if mDNS has been
> > > disabled entirely, so that every .local lookup doesn't hang 5 seconds
> > > or whatever before saying "inconclusive".
> >
> > I'm assuming that by tempfail you mean EAI_AGAIN. The two browsers that
> > I've written code in don't use that (Chrome just treats it the same as a
> > resolution failure and will automatically refresh the tab on a network
> > change; Safari doesn't use getaddrinfo and instead relies on an
> > asynchronous DNS API that adds results as they come in - I wrote that
> > algorithm up in RFC 8305). All that said, synchronous blocking APIs like
> > getaddrinfo need to eventually return even if no one replies, so
> > EAI_AGAIN makes sense in that case - whereas if .local is blocked by
> > policy then immediately returning EAI_NONAME is best.
>
> Right. Even if applications don't currently distinguish them well,
> returning EAI_AGAIN vs EAI_NONAME is meaningful and enables them to do
> the right thing.
>

Agreed.
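
To make the distinction concrete, here is a minimal sketch of how a stub
resolver might map the three situations discussed above onto getaddrinfo
error codes. The enum names and local_lookup_status are hypothetical and
exist only for illustration; only EAI_AGAIN and EAI_NONAME come from
<netdb.h>.

```c
#include <netdb.h>   /* EAI_AGAIN, EAI_NONAME */

/* Hypothetical policy states for .local names; not part of musl. */
enum local_policy {
    LOCAL_DISABLED,       /* mDNS turned off entirely by policy */
    LOCAL_NO_INTERFACES,  /* .local is special, but nowhere to send right now */
    LOCAL_MULTICAST       /* .local is queried over mDNS */
};

/* Hypothetical result of one mDNS attempt. */
enum mdns_outcome {
    MDNS_ANSWERED,
    MDNS_TIMED_OUT
};

/* Returns 0 on success, otherwise the EAI_* code a caller should see. */
int local_lookup_status(enum local_policy policy, enum mdns_outcome outcome)
{
    if (policy == LOCAL_DISABLED)
        return EAI_NONAME;  /* fail immediately, no multi-second wait */
    if (policy == LOCAL_NO_INTERFACES)
        return EAI_AGAIN;   /* e.g. mid network switch; retry may succeed */
    if (outcome == MDNS_TIMED_OUT)
        return EAI_AGAIN;   /* nobody replied, which proves nothing */
    return 0;
}
```

The outcome argument is ignored for the first two policies because no query
would be sent in those cases.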

Thinking back to our discussion about whether to disable mDNS when the
resolver is on localhost: I still agree that from an ergonomics
perspective, using configs to mean multiple things isn't great. But
focusing just on the security properties for a second: if resolv.conf is
configured to an IP address that is routed over a given non-loopback
interface, the current status quo is to send the .local query unsecured
over that interface. So if we were to, in that specific scenario, instead
send the query over multicast, but only on that interface - then we
wouldn't measurably change the security properties of the system. In
practice there is a slight difference where now you can be attacked by any
device on the network as opposed to only by the router on that network, but
I'd argue that there's no meaningful threat model that distinguishes
between those two attacks. So that would be a safe default option. But
again, your points about least surprise are still valid, so if you object
to that on those grounds I can't disagree.
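
As an illustration of how little changes on the wire, the question sent
over multicast has the same format as a unicast DNS question; only the
destination differs (224.0.0.251 port 5353 for IPv4, per RFC 6762). Below
is a minimal sketch of encoding a one-question A query.
mdns_build_a_query is a hypothetical name, and the sketch does not cover
IPv6, AAAA records, or the unicast-response bit.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MDNS_GROUP_V4 "224.0.0.251"  /* IPv4 mDNS multicast group (RFC 6762) */
#define MDNS_PORT     5353

/* Hypothetical helper: encodes a single A-record question for `name`.
 * Returns the packet length, or 0 if the name is malformed or does not
 * fit in `cap` bytes. */
size_t mdns_build_a_query(const char *name, uint8_t *buf, size_t cap)
{
    size_t pos = 12;                  /* fixed DNS header size */
    if (cap < pos)
        return 0;
    memset(buf, 0, pos);              /* ID 0, flags 0: a standard query */
    buf[5] = 1;                       /* QDCOUNT = 1 */

    while (*name) {
        const char *dot = strchr(name, '.');
        size_t len = dot ? (size_t)(dot - name) : strlen(name);
        if (len == 0 || len > 63)
            return 0;                 /* empty or oversized label */
        if (pos - 12 + 1 + len + 1 > 255)
            return 0;                 /* encoded name limit, root included */
        if (pos + 1 + len > cap)
            return 0;
        buf[pos++] = (uint8_t)len;    /* length-prefixed label */
        memcpy(buf + pos, name, len);
        pos += len;
        name += len;
        if (*name == '.')
            name++;                   /* skip separator or trailing dot */
    }
    if (pos + 5 > cap)
        return 0;
    buf[pos++] = 0;                   /* root label ends the name */
    buf[pos++] = 0; buf[pos++] = 1;   /* QTYPE = A */
    buf[pos++] = 0; buf[pos++] = 1;   /* QCLASS = IN */
    return pos;
}
```

Restricting the query to one interface would then be a matter of how the
UDP socket is set up (for example with the IP_MULTICAST_IF socket option)
before sending the buffer to the group address.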

David
