Message-ID: <20240321120727.GI15722@brightrain.aerifal.cx>
Date: Thu, 21 Mar 2024 08:07:27 -0400
From: Rich Felker <dalias@...c.org>
To: David Schinazi <dschinazi.ietf@...il.com>
Cc: musl@...ts.openwall.com
Subject: Re: mDNS in musl

On Thu, Mar 21, 2024 at 07:21:05PM +1000, David Schinazi wrote:
> Hi,
> 
> Earlier today at IETF, I discussed this topic with Stuart Cheshire, the
> creator of mDNS. From his perspective, implementing a simpler option in
> musl makes a lot of sense. Even though querying mDNS on a single interface
> is not comprehensive, it'll work for the majority of uses while minimizing
> implementation complexity. Using the UDP connect() trick to find the
> interface corresponding to the configured resolver and then sending
> multicast only on that interface will work and provide reasonable security
> properties. He recommends using the IP_MULTICAST_IF / IPV6_MULTICAST_IF
> socket options to select an interface, as that's what mDNSResponder does on
> Linux. Additionally, he feels strongly that this should be enabled by
> default, since the whole point of zero-configuration networking was for
> things to work without requiring user configuration.

Again, most of these choices are *not workable*. They have already
been rejected.

If you want this to happen, let's please work on something that has
not been rejected.

1. Why on-by-default was rejected:

musl is not only or even mostly used in a desktop user configuration
where mDNS makes sense. It's used in lots of places where silently
starting (after upgrade) to query other devices on a network and
accepting answers from them is unexpected and hostile behavior. Yes,
"on by default" makes sense on an end-user desktop system connected to
a *private* network. This would be a default of the particular OS
using musl and its network configurator, not of musl itself. (And
AFAIK mDNS is not even "on by default" on Windows unless the connected
network is marked as private.)

2. Why deciding what network to query based on the interface of the
configured resolver is rejected:

In a proper DNSSEC-validating setup, the configured resolver is
127.0.0.1 or ::1. This would disable mDNS entirely, forcing you to
essentially *switch DNSSEC off if you want mDNS*. It was already
explained why this is very bad.
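
To make the failure mode concrete, here is a minimal sketch (in Python
for brevity, not musl code) of a policy that derives the mDNS interface
from the configured resolver. The function names are hypothetical and
exist only for illustration. With a validating resolver on localhost
the policy has nothing to infer, so mDNS ends up off.

```python
import ipaddress


def resolver_is_loopback(nameserver: str) -> bool:
    """Return True if a resolv.conf nameserver entry is a loopback address."""
    # Drop an IPv6 zone suffix such as "fe80::1%eth0" before parsing.
    host = nameserver.split("%", 1)[0]
    return ipaddress.ip_address(host).is_loopback


def interface_hint_from_resolver(nameserver: str):
    """Hypothetical policy: derive the mDNS interface from the resolver.

    Returns None when the resolver is on loopback, i.e. when there is no
    non-loopback interface to infer and mDNS would end up disabled.
    """
    if resolver_is_loopback(nameserver):
        return None
    return nameserver
```

Both 127.0.0.1 and ::1 produce None here, which is exactly the
DNSSEC-validating configuration described above.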

Single-interface query has not been rejected, but I don't see any
reason to limit mDNS to a single interface. Doing so doesn't make the
implementation meaningfully simpler. If you're able to select the
interface, it's just as easy to allow selecting a reasonable number of
interfaces.

The particular implementation mechanisms we've discussed, including
possibly identifying the interface(s) to send to via where a
particular address would be routed, seem overall good.
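
As a rough illustration of the routing-based mechanism (a sketch in
Python, not the musl implementation; the function names are made up
for this example): connect() on a UDP socket sends no packet, but it
makes the kernel pick a route and a source address, which
getsockname() then reports. For IPv4, that local address can be handed
to IP_MULTICAST_IF to pin outgoing multicast to the matching
interface. IPV6_MULTICAST_IF takes an interface index instead, which
this sketch does not cover.

```python
import socket


def local_address_for(dest: str, port: int = 53) -> str:
    """Return the local source address the kernel would use to reach dest."""
    family = socket.AF_INET6 if ":" in dest else socket.AF_INET
    with socket.socket(family, socket.SOCK_DGRAM) as s:
        # For UDP, connect() only records the peer and selects a route;
        # nothing is transmitted.
        s.connect((dest, port))
        return s.getsockname()[0]


def multicast_socket_on(local_addr: str) -> socket.socket:
    """Create an IPv4 UDP socket whose multicast leaves via local_addr."""
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    # IP_MULTICAST_IF accepts the IPv4 address of the outgoing interface.
    s.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_IF,
                 socket.inet_aton(local_addr))
    return s
```

The address passed to local_address_for() does not have to be the
resolver; it can be any address whose route identifies the interface
to use, and calling it once per configured address gives a list of
interfaces rather than a single one.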


> On Sat, Mar 9, 2024 at 9:44 AM David Schinazi <dschinazi.ietf@...il.com>
> wrote:
> 
> >
> >
> > On Fri, Mar 8, 2024 at 2:54 PM Rich Felker <dalias@...c.org> wrote:
> >
> >> On Fri, Mar 08, 2024 at 01:55:18PM -0800, David Schinazi wrote:
> >> > On Fri, Mar 8, 2024 at 12:31 PM Rich Felker <dalias@...c.org> wrote:
> >> >
> >> > > On Fri, Mar 08, 2024 at 11:15:52AM -0800, David Schinazi wrote:
> >> > > > On Fri, Mar 8, 2024 at 5:30 AM Rich Felker <dalias@...c.org> wrote:
> >> > > >
> >> > > > > On Thu, Mar 07, 2024 at 08:47:20PM -0800, David Schinazi wrote:
> >> > > > > > Thanks. How would you feel about the following potential
> >> > > configuration
> >> > > > > > design?
> >> > > > > > * Add a new configuration option "send_mdns_unicast"
> >> > > > > > * When true, use the current behavior
> >> > > > > > * When false, send the query on all non-loopback non-p2p
> >> interfaces
> >> > > > > > * Have send_mdns_unicast default to false
> >> > > > > >
> >> > > > > > I was thinking through how to pick interfaces, looked up what
> >> other
> >> > > mDNS
> >> > > > > > libraries do, and pretty much all of them don't allow
> >> configuring
> >> > > > > > interfaces, whereas Avahi exposes allow-interfaces and
> >> > > deny-interfaces.
> >> > > > > I'm
> >> > > > > > leaning towards not making this configurable to reduce
> >> complexity. I
> >> > > > > think
> >> > > > > > that anyone interested in that level of config is probably using
> >> > > Avahi
> >> > > > > > anyway.
> >> > > > > >
> >> > > > > > Additionally this design has two nice properties: the default
> >> > > behavior is
> >> > > > > > RFC-compliant, and it means that for my use-case I don't need to
> >> > > change
> >> > > > > the
> >> > > > > > config file, which was a big part of my motivation for doing
> >> this
> >> > > inside
> >> > > > > of
> >> > > > > > musl in the first place :-)
> >> > > > >
> >> > > > > As discussed in this thread, I don't think so. The biggest
> >> problems I
> >> > > > > initially brought up were increased information leakage in the
> >> default
> >> > > > > configuration and inability to control where the traffic goes
> >> when you
> >> > > > > do want it on. The above proposal just reverts to the initial,
> >> except
> >> > > > > for providing a way to opt-out.
> >> > > > >
> >> > > > > For the most part, mDNS is very much a "home user, personal
> >> device on
> >> > > > > trusted network" thing. Not only do you not want it to default on
> >> > > > > because a lot of systems will be network servers on networks where
> >> > > > > it's not meaningful (and can be a weakness that aids attackers in
> >> > > > > lateral movement), but you also don't want it on when connected to
> >> > > > > public wifi. For example if you have an open browser tab to
> >> > > > > http://mything.local, and migrate to an untrusted network (with
> >> your
> >> > > > > laptop, tablet, phone, whatever), now your browser will be leaking
> >> > > > > private data (likely at least session auth tokens, maybe more) to
> >> > > > > whoever answers the mDNS query for mything.local.
> >> > > >
> >> > > > That's not quite right. The security properties of mDNS and DNS are
> >> the
> >> > > > same. DNS is inherently insecure, regardless of unicast vs
> >> multicast. If
> >> > > > I'm on a coffee shop Wi-Fi, all my DNS queries are sent in the
> >> clear to
> >> > > > whatever IP address the DHCP server gave me.
> >> > >
> >> > > That's not the case. Connections to non-mDNS hosts are authenticated
> >> > > by TLS with certificates issued on the basis of ownership of the
> >> > > domain name. That's not possible with mDNS hostnames, so they'll
> >> > > either be no-TLS or self-signed certs. That's why the above attack is
> >> > > possible. It was also possible with normal DNS in the bad old days of
> >> > > http://, but that time is long gone.
> >> >
> >> > Apologies for being pedantic, but that's not true. The ability to get
> >> TLS
> >> > certificates for a domain name that you own is a property of the WebPKI,
> >> > not a property of TLS. What you wrote is true, but only in the context
> >> of a
> >> > Web browser with an unmodified root certificate store. The features I
> >> > mentioned above don't use the WebPKI, they have a separate root of
> >> trust.
> >> > For example, some of those Apple features exchange TLS certificates via
> >> an
> >> > out-of-band mechanism such as Apple trusted servers. Another example is
> >> the
> >> > Apple Watch: when you first pair a new Apple Watch with an iPhone, they
> >> > exchange ed25519 public keys. Then any time the watch wants to transfer
> >> a
> >> > large file to/from the phone, it'll connect to Wi-Fi, use mDNS to find
> >> the
> >> > phone, and set up an IKEv2/IPsec tunnel that then protects the exchange.
> >> > It's resilient to any attacks at the mDNS level.
> >> >
> >> > You're absolutely right that the security of Web requests using local
> >> > connectivity is completely broken by the lack of WebPKI certificates for
> >> > those. But sending the DNS query over multicast as opposed to
> >> unencrypted
> >> > unicast to an untrusted DNS server doesn't change the security
> >> properties.
> >> > In your example above, the open tab to http://mything.local will send
> >> that
> >> > query to the recursive resolver - and if that's the one received by DHCP
> >> > then that server can reply with its own address and receive your auth
> >> > tokens. One potential fix here is to configure your resolv.conf to
> >> > localhost and then apply policy in that local resolver. But in practice,
> >> > application developers don't rely on security at that layer, they assume
> >> > that DNS is unsafe and implement encryption in userspace with some out
> >> of
> >> > band trust mechanism.
> >>
> >> My specific example was http://mything.local in a web browser, which
> >> is the way you access lots of mDNS-enabled things in the absence of a
> >> specific software ecosystem like Apple's. Since we're talking about
> >> musl which would be running on Linux or a Linux-syscall-compatible
> >> environment, without Apple apps, I think that's the main way anyone
> >> would be using hypothetical mDNS support. And indeed this is the way
> >> you access many printers, 3D printers, IP cameras, etc.
> >>
> >
> > I have multiple services at home that use HTTP and mDNS to communicate
> > with. But they're built knowing that unencrypted HTTP is unsafe. For
> > example, one of my servers doesn't have any authentication - my browser
> > just uses unauthenticated GETs, POSTs and WebSockets. If I leave the tab
> > open and go to a coffee shop, my browser might send that GET to a server I
> > don't trust but that request won't carry any sensitive information. Another
> > of my servers uses TLS with self-signed certs, so every time I want to
> > communicate with it, I need to click through my browser's "this is unsafe"
> > interstitial to get to the page. If I switch networks, the browser will
> > send me the warning again and I'll know not to click through when I'm not
> > at home. In both of those cases, the security is handled (or not handled at
> > all) at the application layer.
> >
> > Maybe at some point we'll have a good framework for authenticating
> >> this kind of usage with certificates (probably certificate pinning on
> >> first use, with good UX, is the only easy solution),
> >
> >
> > Trust on first use works, or even better there are emerging solutions that
> > leverage codes printed on devices and PAKEs so that a device on the
> > untrusted network can't even hijack the first connection without having
> > access to that code. The leading one for home automation is Matter [1].
> > Coincidentally, it also leverages mDNS for discovery, and doesn't rely on
> > security at the DNS level.
> >
> > [1] https://csa-iot.org/all-solutions/matter/
> >
> > but at present,
> >> mDNS devices on the .local zone get accessed with plain http:// all
> >> the time, and this means it's unsafe to do mDNS on
> >> public/untrusted/hostile networks.
> >>
> >
> > The notion of something being "unsafe" (and security in general) is
> > predicated on the existence of a threat model. It's unsafe to use
> > unencrypted HTTP to your bank when your threat model includes someone on
> > the coffee shop Wi-Fi trying to steal your bank credentials. Conversely,
> > it's safe for me to print to this coffee shop printer if my threat model
> > assumes that I'm ok with the owner of the coffee shop seeing my document.
> > Another example is Chromecast which also uses mDNS: from Chrome on a Linux
> > laptop, I can cast YouTube videos to the TV in this coffee shop. That's
> > safe because I trust the network with the YouTube link I'm telling the TV
> > to play. mDNS is not in and of itself safe or unsafe. It converts
> > names into addresses, and what you do with those addresses can potentially
> > be unsafe.
> >
> > That doesn't mean that every single use of mDNS on untrusted networks is
> > safe. If someone builds a web page that sends valuable secrets over
> > unencrypted HTTP to a .local name, then you have a security problem. But my
> > point is that this security problem needs to be solved at the application
> > layer and not at the DNS layer. That said, I agree that having a way to
> > disable mDNS on a machine is a good idea, because there probably are users
> > out there that are stuck with applications that for some reason decided to
> > rely on DNS being secure.
> >
> > In terms of the tradeoff between usability and security, the default to me
> > lies with default-enabling mDNS on all interfaces as Apple and Avahi do.
> > But this tradeoff is between two metrics that can't be quantified one
> > against the other for all possible uses, so I totally understand if your
> > opinion for musl is that the tradeoff there is different than in other
> > situations. You know your users better than I do.
> >
> > > > So the stack has to deal with
> >> > > > the fact that any DNS response can be spoofed.
> >> > >
> >> > > That's also not possible with DNSSEC, but only helps if you're
> >> > > validating it.
> >> > >
> >> > > > The most widely used
> >> > > > solution is TLS: a successful DNS hijack can prevent you from
> >> accessing a
> >> > > > TLS service, but can't impersonate it. That's true of both mDNS and
> >> > > regular
> >> > > > unicast DNS. As an example, all Apple devices have mDNS enabled on
> >> all
> >> > > > interfaces, with no security impact - the features that rely on it
> >> > > > (AirDrop, AirPlay, contact sharing, etc) all use mTLS to ensure
> >> they're
> >> > > > talking to the right device regardless of the correctness of DNS.
> >> > > (Printing
> >> > > > remains completely insecure, but that's also independent of DNS -
> >> your
> >> > > > coffee shop Wi-Fi access point can attack you at the IP layer too).
> >> One
> >> > > > might think that DNSSEC could save us here, but it doesn't. DNSSEC
> >> was
> >> > > > unfortunately built with a fundamental design flaw: it requires you
> >> to
> >> > > > trust all resolvers on the path, including recursive resolvers. So
> >> even
> >> > > if
> >> > > > you ask for DNSSEC validation of the DNS records for
> >> www.example.com,
> >> > > your
> >> > > > coffee shop DNS recursive resolver can tell you "I checked, and
> >> > > example.com
> >> > > > does not support DNSSEC, here's the IP address for www.example.com
> >> > > though"
> >> > > > and you have to accept it.
> >> > >
> >> > > This is a completely false but somehow persistent myth about DNSSEC.
> >> > > You cannot lie that a zone does not support DNSSEC. The only way to
> >> > > claim a zone does not support DNSSEC is with a signature chain from
> >> > > the DNS root proving the nonexistence of the DS records for the
> >> > > delegation. Without that, the reply is BOGUS and will be ignored as if
> >> > > there was no reply at all.
> >> >
> >> > I was talking about the case where the recursive resolver does the
> >> > validation, which is what's deployed in practice today. What you wrote
> >> is
> >> > only true if the client does the DNSSEC validation itself. Most clients
> >> > don't do that today, because too many domains are just misconfigured and
> >> > broken. Eric Rescorla (the editor of the TLS RFCs) wrote a great blog
> >> post
> >> > about this:
> >>
> >> The consensus of folks in the stub resolver space (at least glibc+musl
> >> and I would assume the BSDs as well) is that the way you do DNSSEC
> >> validation is by having a validating caching proxy or full recursive
> >> resolver on localhost. Doing validation in the stub resolver is not
> >> viable because it may be static-linked, where it would not be able to
> >> be updated with new algorithms, root-of-trust, etc.
> >
> >
> > No disagreement there. By "client" I meant the client device as a whole,
> > and by "recursive resolver" I meant "the DNS server you got from DHCP".
> > Running a DNSSEC-validating recursive resolver on the client device falls
> > into what I meant by "if the client does the DNSSEC validation itself".
> > Sorry for being unclear.
> >
> >
> >> This is one of the
> >> reasons our go-to response for new functionality wanted in the stub
> >> resolver is "do it in a nameserver on localhost" -- because you
> >> already need that to do DNSSEC.
> >>
> >
> > That makes sense. I wasn't working with the assumption that DNSSEC was a
> > requirement.
> >
> > It really did not sound like you were talking about trusting the
> >> recursive, though. You called it a "fundamental design flaw", which it
> >> is not, and said it requires you to "trust all resolvers on the path",
> >> which it does not. It only requires you to trust the immediate
> >> resolver you are interacting with (and not even that if you put the
> >> validation in the stub resolver, but there are good reasons not to do
> >> that, as above). A pure-proxying server that relies on upstream
> >> recursives can do full DNSSEC validation. Dnsmasq is a canonical
> >> example. I believe systemd-resolved also does it.
> >>
> >
> > That's fair, and I apologize for overstating my point. I absolutely agree
> > that if you run a validating recursive resolver locally, then the attack I
> > described isn't possible. When DNSSEC was designed, it was intended to be
> > deployed in the model I described, where the validating recursive resolver
> > is not on-device. And that's how it is still mostly deployed today because
> > almost all general-purpose client devices do not validate locally. My
> > mental model is very focused around consumer devices where folks buy them
> > and use them without ever changing default settings. That might be a
> > portion of musl users, but you clearly also have advanced users that do
> > things differently.
> >
> > > > > Regarding untrusted networks, one thing I hadn't considered yet is
> >> > > > > that a network configurator probably needs a way to setup
> >> resolv.conf
> >> > > > > such that .local queries temp-fail rather than perma-fail (as they
> >> > > > > would if you just sent the query to public dns) to use during
> >> certain
> >> > > > > race windows while switching networks. IOW "send .local queries to
> >> > > > > configured nameservers" and "treat .local specially but with an
> >> empty
> >> > > > > list of interfaces to send to" should be distinct configurations.
> >> > > >
> >> > > > Yeah, caching negative results in DNS has been a tricky thing from
> >> the
> >> > > > start. You probably could hack something by installing a fake SOA
> >> record
> >> > > > for .local. in your recursive resolver running on localhost. But the
> >> > > > RFC-compliant answer is for stub resolvers to treat it specially
> >> and know
> >> > > > that those often never get an answer (musl doesn't cache DNS
> >> results so
> >> > > in
> >> > > > a way we're avoiding this problem altogether at the stub resolver).
> >> > >
> >> > > The problem here is not about caching, just about clients using a
> >> > > response. You want a task (like a browser with open tabs) trying to
> >> > > contact the site to get a tempfail rather than NxDomain which might
> >> > > make it stop trying. But you probably want NxDomain if mDNS has been
> >> > > disabled entirely, so that every .local lookup doesn't hang 5 seconds
> >> > > or whatever before saying "inconclusive".
> >> >
> >> > I'm assuming that by tempfail you mean EAI_AGAIN. The two browsers that
> >> > I've written code in don't use that (Chrome just treats it the same as a
> >> > resolution failure and will automatically refresh the tab on a network
> >> > change; Safari doesn't use getaddrinfo and instead relies on an
> >> > asynchronous DNS API that adds results as they come in - I wrote that
> >> > algorithm up in RFC 8305). All that said, synchronous blocking APIs like
> >> > getaddrinfo need to eventually return even if no one replies, so
> >> EAI_AGAIN
> >> > makes sense in that case - whereas if .local is blocked by policy then
> >> > immediately returning EAI_NONAME is best.
> >>
> >> Right. Even if applications don't currently distinguish them well,
> >> returning EAI_AGAIN vs EAI_NONAME is meaningful and enables them to do
> >> the right thing.
> >>
> >
> > Agreed.
> >
> > Thinking back to our discussion about whether to disable mDNS when the
> > resolver is on localhost. I still agree that from an ergonomics
> > perspective, using configs to mean multiple things isn't great. But
> > focusing just on the security properties for a second: if resolv.conf is
> > configured to an IP address that is routed over a given non-loopback
> > interface, the current status quo is to send the .local query unsecured
> > over that interface. So if we were to, in that specific scenario, instead
> > send the query over multicast, but only on that interface - then we
> > wouldn't measurably change the security properties of the system. In
> > practice there is a slight difference where now you can be attacked by any
> > device on the network as opposed to only by the router on that network, but
> > I'd argue that there's no meaningful threat model that distinguishes
> > between those two attacks. So that would be a safe default option. But
> > again, your points about least surprise are still valid, so if you object
> > to that on those grounds I can't disagree.
> >
> > David
> >
