Message-ID: <20141217070033.GP4574@brightrain.aerifal.cx>
Date: Wed, 17 Dec 2014 02:00:33 -0500
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: Re: mDNS and alternate hostname database backends

On Mon, Dec 15, 2014 at 02:39:46AM -0800, Brad Conroy wrote:
> I've been looking into a simplified DNS caching mechanism that uses the
> filesystem as the "database" and came across this from the wiki:
> 
> > The inability to use mDNS (a multicast-DNS-based zero config system) with musl
> > has been raised as an issue by users in the past. On glibc, using mDNS is
> > accomplished with NSS; obviously musl does not have (or want) NSS.
> >
> > In principle, however, musl is fully extensible to use alternate hostname
> > database backends in place of normal DNS. All that's needed is a daemon that
> > runs on localhost, speaks DNS, and translates the requests to whatever backend
> > is needed. However it's unclear whether there are any existing tools of this
> > form. Developing one, adapting an existing DNS proxy program, or documenting
> > how to setup an existing program that's already capable could be a nice future
> > project.
> 
> My idea is much simpler: store the data as files named by the hostname (in /tmp ?):

/tmp is most certainly the wrong place for anything like this. The
only thing that's valid to create in /tmp is random filenames, with
proper mechanisms (e.g. O_EXCL or mkdir) to avoid collisions. This is
because it's a shared namespace and anyone can create things there. A
malicious user could drop a file named /tmp/hosts before you mkdir it,
or mkdir their own directory with their own malicious entries in it.
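
The collision-safe creation described above can be sketched as follows. This
is a minimal illustration, not a recommendation to keep the cache in /tmp;
the helper names make_private_dir and create_exclusive_in are hypothetical,
while mkdtemp() and open() with O_EXCL are the standard POSIX calls.

```c
#define _XOPEN_SOURCE 700   /* for mkdtemp() */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Create a randomly named directory. tmpl must be a writable buffer
 * ending in "XXXXXX"; mkdtemp() fills in the suffix, creates the
 * directory with mode 0700, and fails instead of reusing an existing
 * name. Returns tmpl on success, NULL on error. */
static char *make_private_dir(char *tmpl)
{
	assert(tmpl != NULL);
	return mkdtemp(tmpl);
}

/* Create dir/name only if nothing exists at that path yet.
 * Returns an fd, or -1 with errno set (EEXIST if the name is taken). */
static int create_exclusive_in(const char *dir, const char *name)
{
	char path[4096];
	int n = snprintf(path, sizeof path, "%s/%s", dir, name);
	if (n < 0 || (size_t)n >= sizeof path) {
		errno = ENAMETOOLONG;
		return -1;
	}
	/* O_EXCL makes the existence check and the creation one atomic step */
	return open(path, O_CREAT | O_EXCL | O_WRONLY, 0600);
}
```

A caller would pass a template such as "/tmp/hostcache-XXXXXX" (an assumed
name) in a char array, then create entries inside the returned directory.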

Presumably you want something under /var/ in a directory owned by
whoever manages it, possibly with a symlink from /etc/.

> /tmp/hosts/a for ipv4  (limit to 15/host so they can be stored in the inode*)
> /tmp/hosts/aaaa for ipv6 (limit to 3/host *)
> * of the filesystems capable of inlining data, ext4 has the lowest at 60 bytes.
> This means we can just read/write an array of uint32_t for ipv4 and uint128_t
> for ipv6 with something like:
> 
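> 	#include <fcntl.h>
> 	#include <unistd.h>
>
> 	/* Read up to len bytes from path; returns the byte count or -1 on error. */
> 	static ssize_t get_value(const char *path, void *buf, size_t len){
> 		int fd = open(path, O_RDONLY);
> 		if (fd<0) return -1;
> 		ssize_t n=read(fd,buf,len);
> 		close(fd);
> 		return n;
> 	}
> 	/* Replace the contents of path with buf; returns the byte count or -1. */
> 	static ssize_t set_value(const char *path, const void *buf, size_t len){
> 		/* O_CREAT requires a mode argument; 0644 is an assumed choice */
> 		int fd = open(path, O_CREAT|O_WRONLY|O_TRUNC, 0644);
> 		if (fd<0) return -1;
> 		ssize_t n=write(fd,buf,len);
> 		close(fd);
> 		return n;
> 	}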
> 
> The existing /etc/hosts* files don't account for TTL, but with the filesystem
> we can hack this feature in pretty simply by adding the TTL to the
> modification time.
> 
> 	struct utimbuf ut={.actime=st.st_atime, .modtime=ttl+st.st_mtime};
> 	utime(path,&ut);
> 
> Note: I chose mod time for TTL since a file system may be mounted noatime
> 
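
The TTL-in-mtime idea quoted above can be sketched like this, assuming the
modification time is used to hold the expiry time (write time plus TTL) and
that an entry counts as expired once the current time reaches it. The
function names store_with_ttl and entry_is_fresh are hypothetical.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/stat.h>
#include <time.h>
#include <unistd.h>
#include <utime.h>

/* Write n IPv4 addresses to path, then set the file's mtime to now + ttl
 * so that the mtime doubles as the expiry time. Returns 0 or -1. */
static int store_with_ttl(const char *path, const uint32_t *addrs, size_t n,
                          time_t now, time_t ttl)
{
	assert(addrs != NULL && n > 0);
	int fd = open(path, O_CREAT | O_WRONLY | O_TRUNC, 0644);
	if (fd < 0) return -1;
	ssize_t want = (ssize_t)(n * sizeof *addrs);
	ssize_t got = write(fd, addrs, (size_t)want);
	close(fd);
	if (got != want) return -1;
	struct utimbuf ut = { .actime = now, .modtime = now + ttl };
	return utime(path, &ut);
}

/* Returns 1 if the entry has not expired yet, 0 if it has,
 * and -1 if the file cannot be examined (for example, no such entry). */
static int entry_is_fresh(const char *path, time_t now)
{
	struct stat st;
	if (stat(path, &st) < 0) return -1;
	return st.st_mtime > now;
}
```

A lookup would call entry_is_fresh(path, time(NULL)) before trusting the
cached addresses and fall back to a real query when it returns 0 or -1.
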
> initialization:
> if /tmp/hosts/a (or aaaa for ipv6) does not exist
>    1. mkdir
>    2. read in /etc/hosts to our format
>        a.) for 0.0.0.0 and 127.0.0.1 and their matching ipv6 counterparts :: and ::1,
>             create a hard link to NULL and localhost
>        b.) similarly create hard links for aliases. For example:
> 
> /etc/hosts|  74.125.225.134 www.google.com google.com www.bing.com bing.com
> 
> /tmp/hosts/a/www.google.com will contain a uint32_t representing 74.125.225.134
> with a modification time set to INT_MAX
>   /tmp/hosts/a/google.com  --hardlink--> /tmp/hosts/a/www.google.com
>   /tmp/hosts/a/www.bing.com --hardlink--> /tmp/hosts/a/www.google.com
>   /tmp/hosts/a/bing.com --hardlink--> /tmp/hosts/a/www.google.com
> 
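
The alias layout quoted above can be sketched with link(). This is an
illustration under stated assumptions: the record is taken to be 4 raw bytes
in network byte order (the proposal does not specify a byte order), and the
helper names write_record, add_alias and same_record are hypothetical.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/stat.h>
#include <unistd.h>

/* Write one IPv4 address as the canonical record. Returns 0 or -1. */
static int write_record(const char *path, const uint8_t addr[4])
{
	int fd = open(path, O_CREAT | O_WRONLY | O_TRUNC, 0644);
	if (fd < 0) return -1;
	ssize_t n = write(fd, addr, 4);
	close(fd);
	return n == 4 ? 0 : -1;
}

/* Make alias a second name for the inode behind canonical.
 * Fails with -1 (errno EEXIST) if alias already exists. */
static int add_alias(const char *canonical, const char *alias)
{
	assert(canonical != NULL && alias != NULL);
	return link(canonical, alias);
}

/* Returns 1 if both names refer to the same inode, 0 if they do not,
 * and -1 if either name cannot be examined. */
static int same_record(const char *a, const char *b)
{
	struct stat sa, sb;
	if (stat(a, &sa) < 0 || stat(b, &sb) < 0) return -1;
	return sa.st_dev == sb.st_dev && sa.st_ino == sb.st_ino;
}
```

Rewriting the record in place through any one name updates every alias,
since they share an inode. Replacing a file by writing a new one and
renaming it over the old name would instead detach that name from its
aliases.
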
> This seems pretty over-simplified, but it opens up some possibilities:
> 
>  1. the network functions could be much smaller and rely on a single binary to do
>      all of the hard work in a unix style.  Before anyone argues that starting an
>      external program takes too long, I must point out that this is typically
>      insignificant compared to DNS query/response time and that keeping this
>      functionality internal to the libc requires making certain tradeoffs to keep
>      the overall code size and complexity down.  Other functions already call
>      /bin/sh IIRC, so this isn't a huge leap. ... though all code _could_ stay in
>      the libc if there is a good argument for it.
> 2.  sharing caches between clients now becomes as easy as using rsync or
>      even tar or cpio.
> 3.  A cron task can replace a running daemon to periodically clean up the cache.
>      If disk space is low, it can purge, or it could even systematically recheck
>      the DNS, update the TTL and even ping all the entries to get a response time
>      so it can sort them from fastest to slowest entries
> 4.  Ad-blocking can be as simple as:
>        cd /tmp/hosts/a;
>        ln NULL pagead2.googlesyndication.com
> 5.  Filtering can also be accomplished using standard users/groups.
>      blacklist filtering by making them hardlinks to NULL
>      whitelist filtering by making /tmp/hosts read only
> 6.  Because the cache is so simple, integrating it to work with other caching
>      methods like nss/nscd, libresolvconf, dnsmasq, djbdns and bind _should_
>      be fairly straightforward.
> 
> I've been working on my own rudimentary implementation to include with my
> own libc.h headers, (only for small, single file static apps) but I primarily use
> musl, so I'd be interested in hearing any feedback, especially if there is a
> possibility that it could become a standard practice.  My guess is that it has
> probably already been done by Bell Labs for Plan.

While this may be a good system, musl isn't really in the business of
imposing policy unless there's strong existing precedent. On the other
hand, the beauty of the "just run a daemon speaking DNS protocol" approach is
that you can write an utterly trivial daemon that serves records
stored in the above form over DNS protocol, and have musl's resolver
(or any other resolver) return these records with no code changes
whatsoever.
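
As a rough sketch of the core of such a daemon, the function below turns a
single-question A query plus the stored addresses into a DNS response,
following the RFC 1035 message layout. Socket handling and the file lookup
are left out. The names question_end and build_a_response are hypothetical,
and the addresses are assumed to be stored as 4 bytes each in network byte
order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Returns the offset just past the single question (name, QTYPE, QCLASS),
 * or 0 if the query is malformed or does not carry exactly one question. */
static size_t question_end(const uint8_t *q, size_t qlen)
{
	size_t i = 12;                          /* fixed-size DNS header */
	if (qlen < 12 || q[4] != 0 || q[5] != 1) return 0;
	while (i < qlen && q[i] != 0) {
		if (q[i] > 63) return 0;        /* not a plain label */
		i += 1 + (size_t)q[i];
	}
	if (i + 5 > qlen) return 0;             /* root label + QTYPE + QCLASS */
	return i + 5;
}

/* Build a response carrying n A records (addrs holds 4*n bytes).
 * Returns the response length, or 0 if the query is not a usable
 * A/IN query or the output buffer is too small. */
static size_t build_a_response(const uint8_t *q, size_t qlen,
                               const uint8_t *addrs, size_t n, uint32_t ttl,
                               uint8_t *out, size_t outlen)
{
	static const uint8_t a_in[4] = { 0, 1, 0, 1 };  /* TYPE A, CLASS IN */
	assert(q != NULL && out != NULL);
	size_t end = question_end(q, qlen);
	if (!end || n > 15 || end + 16 * n > outlen) return 0;
	if (memcmp(q + end - 4, a_in, 4) != 0) return 0;

	memcpy(out, q, end);                    /* echo ID, header, question */
	out[2] = (uint8_t)(0x80 | (q[2] & 0x01)); /* QR=1, keep RD */
	out[3] = 0x80;                          /* RA=1, RCODE=0 */
	out[6] = 0; out[7] = (uint8_t)n;        /* ANCOUNT */
	memset(out + 8, 0, 4);                  /* NSCOUNT, ARCOUNT */

	uint8_t *p = out + end;
	for (size_t i = 0; i < n; i++) {
		*p++ = 0xC0; *p++ = 0x0C;       /* pointer to name at offset 12 */
		memcpy(p, a_in, 4); p += 4;
		*p++ = (uint8_t)(ttl >> 24); *p++ = (uint8_t)(ttl >> 16);
		*p++ = (uint8_t)(ttl >> 8);  *p++ = (uint8_t)ttl;
		*p++ = 0; *p++ = 4;             /* RDLENGTH */
		memcpy(p, addrs + 4 * i, 4); p += 4;
	}
	return (size_t)(p - out);
}
```

A daemon built around this would recvfrom() a query on a localhost UDP
socket, extract the name to locate the matching file, and sendto() the
buffer this function fills in.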

Rich
