Message-ID: <20140408111147.5f79729f@ncopa-desktop.alpinelinux.org>
Date: Tue, 8 Apr 2014 11:11:47 +0200
From: Natanael Copa <ncopa@...inelinux.org>
To: musl@...ts.openwall.com
Subject: if_nameindex/getifaddrs and dhcpcd issue

Hi,

While testing a migration from uclibc to musl libc in a virtual machine,
I ended up with no working network. I found that dhcpcd[1] simply
exited with an error and logged this to syslog:

 daemon.err dhcpcd[2000]: eth0: interface not found or invalid

This doesn't happen with uclibc (or glibc, I assume), so I tried to dig
into what happens.

In this case, busybox ifup calls: 'dhcpcd eth0'.

You can also run dhcpcd in a catch-all mode where it will autoconfigure
all network interfaces, including new, hotplugged ones, but in this case
it was called with a specified interface.

So I grepped dhcpcd source for the "interface not found or invalid" and
found it in dhcpcd.c[2]


        /* Start any dev listening plugin which may want to
         * change the interface name provided by the kernel */
        if ((ctx.options & (DHCPCD_MASTER | DHCPCD_DEV)) ==
            (DHCPCD_MASTER | DHCPCD_DEV))
                dev_start(&ctx);

        ctx.ifaces = discover_interfaces(&ctx, ctx.ifc, ctx.ifv);
        for (i = 0; i < ctx.ifc; i++) {
                if (find_interface(&ctx, ctx.ifv[i]) == NULL)
                        syslog(LOG_ERR, "%s: interface not found or invalid",
                            ctx.ifv[i]);
        }

After some debugging it turns out that eth0 is not detected by
discover_interfaces, and in fact, dhcpcd will not discover any
interfaces at all. 

So I took a look at discover_interfaces, found in net.c[3].

struct if_head *
discover_interfaces(struct dhcpcd_ctx *ctx, int argc, char * const *argv)
{
	struct ifaddrs *ifaddrs, *ifa;
[SNIP]...
#elif AF_PACKET
	const struct sockaddr_ll *sll;
#endif

	if (getifaddrs(&ifaddrs) == -1)
		return NULL;
	ifs = malloc(sizeof(*ifs));
	if (ifs == NULL)
		return NULL;
	TAILQ_INIT(ifs);

[SNIP]...

		/* Ensure that the interface name has settled */
		if (!dev_initialized(ctx, ifa->ifa_name))
			continue;


So, dhcpcd uses getifaddrs to detect available interfaces. If
ifa_addr is set (non-null), it skips every entry whose address family
is not AF_PACKET. (Why? I don't know.)

At this point I got curious about the getifaddrs implementation and
wrote a small testcase[4]:
/*-----8<-------------------------------------------------*/
#include <stdio.h>
#include <ifaddrs.h>

#include <sys/types.h>

int main(void)
{
        struct ifaddrs *ifap, *ifa;
        if (getifaddrs(&ifap) == -1) {
                perror("getifaddrs");
                return 1;
        }
        for (ifa = ifap; ifa != NULL; ifa = ifa->ifa_next) {
                const char *afstr = "none";
                if (ifa->ifa_addr != NULL) {
                        switch (ifa->ifa_addr->sa_family) {
                        case AF_PACKET:
                                afstr = "AF_PACKET";
                                break;
                        case AF_INET:
                                afstr = "AF_INET";
                                break;
                        case AF_INET6:
                                afstr = "AF_INET6";
                                break;
                        default:
                                afstr = "unknown";
                        }
                }
                printf("%s: %s\n", ifa->ifa_name, afstr);
        }
        return 0;
}
/*-----8<-------------------------------------------------*/

I noticed that it only prints interfaces that have an AF_INET
address configured:
lo: AF_INET
eth0: AF_INET

Try 'modprobe dummy' to create a dummy0. It will not get listed. This
is kinda okish, since getifaddrs is about getting the configured
addresses. I checked the manpage[5]:
CONFORMING TO

       Not in POSIX.1-2001.  This function first appeared in BSDi and is
       present on the BSD systems, but with slightly different semantics
       documented—returning one entry per interface, not per address.  This
       means ifa_addr and other fields can actually be NULL if the interface
       has no address, and no link-level address is returned if the
       interface has an IP address assigned.  Also, the way of choosing
       either ifa_broadaddr or ifa_dstaddr differs on various systems.

NOTES

       The addresses returned on Linux will usually be the IPv4 and IPv6
       addresses assigned to the interface, but also one AF_PACKET address
       per interface containing lower-level details about the interface and
       its physical layer.  In this case, the ifa_data field may contain a
       pointer to a struct rtnl_link_stats, defined in <linux/if_link.h> (in
       Linux 2.4 and earlier, struct net_device_stats, defined in
       <linux/netdevice.h>), which contains various interface attributes and
       statistics.

So this is what application programmers will expect: the existence of
an AF_PACKET entry on Linux. This is also what Roy Marples (the dhcpcd
author) expected.

So I got curious why musl doesn't show AF_PACKET.

http://git.musl-libc.org/cgit/musl/tree/src/network/getifaddrs.c#n116

The first thing it does is call if_nameindex, and now I discovered that
if_nameindex does not do what you would think it does. I would
have expected if_nameindex to return *all* interfaces, as POSIX says
it does, but it actually only returns interfaces with a configured
AF_INET address.

Testcase copied from if_nameindex manpage[6]:
/*-----8<-------------------------------------------------*/
#include <net/if.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int
main(int argc, char *argv[])
{
    struct if_nameindex *if_ni, *i;

    if_ni = if_nameindex();
    if (if_ni == NULL) {
        perror("if_nameindex");
        exit(EXIT_FAILURE);
    }

    for (i = if_ni; ! (i->if_index == 0 && i->if_name == NULL); i++)
        printf("%u: %s\n", i->if_index, i->if_name);

    if_freenameindex(if_ni);

    exit(EXIT_SUCCESS);
}
/*-----8<-------------------------------------------------*/

You will notice that it does not print the dummy0 interface (modprobe
dummy first) unless you assign it an ipv4 address. It will not list it
if only an ipv6 addr is assigned.

At this point I concluded that we do have a real bug in musl, and
wondered what would happen with dhcpcd if we fixed the if_nameindex bug.

I think what will happen is that getifaddrs will list the interface
without any address, but with ifa_addr set to null. The dhcpcd code
looked like:

	for (ifa = ifaddrs; ifa; ifa = ifa->ifa_next) {
		if (ifa->ifa_addr != NULL) {
#ifdef AF_LINK
			if (ifa->ifa_addr->sa_family != AF_LINK)
				continue;
#elif AF_PACKET
			if (ifa->ifa_addr->sa_family != AF_PACKET)
				continue;
#endif
		}


Since ifa_addr is NULL for unconfigured interfaces, it would not go to
the AF_PACKET check and the interface will not be skipped - unless it
already has a configured addr, either ipv4 or ipv6.
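
To make that reasoning concrete, here is the same filter restated as a
small predicate. This is only an illustrative sketch: the name
passes_family_filter is hypothetical and does not exist in dhcpcd, and
it covers only the Linux (AF_PACKET) branch of the #ifdef.

```c
#include <stddef.h>
#include <sys/socket.h>

/* Hypothetical helper, not part of dhcpcd: mirrors the address-family
 * filter from discover_interfaces() for the AF_PACKET case.
 * Returns 1 if the entry passes the filter, 0 if it would be skipped. */
static int passes_family_filter(const struct sockaddr *addr)
{
	if (addr == NULL)
		return 1;	/* no address: the family check is never reached */
	return addr->sa_family == AF_PACKET;
}
```

An entry with a NULL ifa_addr passes, an AF_PACKET entry passes, and an
AF_INET or AF_INET6 entry is skipped.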

So I implemented an if_nameindex which parses /proc/net/dev, and the
getifaddrs testcase[4] started to show "none" for every interface that
has no ipv4 addr.

Since I am fairly eager to switch Alpine Linux development branch to
musl, I thought this would be an acceptable (temp?) solution to move
forward: Fix what I believe is a bug in if_nameindex and dhcpcd users
will be able to reboot their machines without completely losing network.

I don't like the idea of parsing /proc/net/dev for various reasons. It
will not work without a mounted /proc, it will pull in stdio (I believe
malloc is already needed) and it is ugly. But apparently, this is the
only way to get a list of interfaces in Linux without using netlink.
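
For illustration, the /proc/net/dev approach could look roughly like
the sketch below. This is not the actual patch; the function names
parse_dev_line and list_proc_net_dev are made up for this example, and
it assumes the usual layout where the two header lines contain no ':'
and each interface line is "<spaces><name>: <counters>".

```c
#include <stdio.h>
#include <string.h>
#include <net/if.h>	/* IF_NAMESIZE */

/* Hypothetical helper: extract the interface name from one line of
 * /proc/net/dev. Returns 0 on success, -1 for header or malformed lines. */
static int parse_dev_line(const char *line, char name[IF_NAMESIZE])
{
	const char *colon = strchr(line, ':');
	size_t len;

	if (colon == NULL)
		return -1;		/* header lines contain no ':' */
	line += strspn(line, " \t");	/* names are right-aligned with spaces */
	len = (size_t)(colon - line);
	if (len == 0 || len >= IF_NAMESIZE)
		return -1;
	memcpy(name, line, len);
	name[len] = '\0';
	return 0;
}

/* Print every interface listed in /proc/net/dev. Returns the number of
 * interfaces found, or -1 if the file cannot be opened (for example
 * when /proc is not mounted). */
static int list_proc_net_dev(void)
{
	char line[512], name[IF_NAMESIZE];
	int count = 0;
	FILE *f = fopen("/proc/net/dev", "r");

	if (f == NULL)
		return -1;
	while (fgets(line, sizeof line, f) != NULL) {
		if (parse_dev_line(line, name) == 0) {
			printf("%s\n", name);
			count++;
		}
	}
	fclose(f);
	return count;
}
```

The sketch shows both drawbacks mentioned above: it depends on stdio and
it fails outright when /proc is not mounted.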

In my humble opinion, netlink would be a saner way to solve this. (yeah
netlink has its own set of problems...)

I also believe that the only sane way to grab network address
information on Linux (getifaddrs) nowadays is via netlink. I find the
current way of dealing with ipv6 via /proc ugly too.
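
As a rough sketch of what the netlink route involves (this is not musl
or uclibc code, and list_links_netlink is a hypothetical name), an
RTM_GETLINK dump over a NETLINK_ROUTE socket returns one RTM_NEWLINK
message per interface, whether or not it has any address configured:

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>

/* Hypothetical sketch: print "index: name" for every link the kernel
 * reports. Returns the number of links, or -1 on error. */
static int list_links_netlink(void)
{
	struct {
		struct nlmsghdr hdr;
		struct ifinfomsg ifi;
	} req;
	/* An array of nlmsghdr keeps the receive buffer suitably aligned. */
	struct nlmsghdr buf[16384 / sizeof(struct nlmsghdr)];
	int fd, count = 0;

	fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE);
	if (fd < 0)
		return -1;

	/* Ask the kernel to dump all links. */
	memset(&req, 0, sizeof req);
	req.hdr.nlmsg_len = NLMSG_LENGTH(sizeof req.ifi);
	req.hdr.nlmsg_type = RTM_GETLINK;
	req.hdr.nlmsg_flags = NLM_F_REQUEST | NLM_F_DUMP;
	req.ifi.ifi_family = AF_UNSPEC;
	if (send(fd, &req, req.hdr.nlmsg_len, 0) < 0)
		goto fail;

	for (;;) {
		struct nlmsghdr *h = buf;
		int len = (int)recv(fd, buf, sizeof buf, 0);

		if (len <= 0)
			goto fail;
		for (; NLMSG_OK(h, len); h = NLMSG_NEXT(h, len)) {
			struct ifinfomsg *ifi;
			struct rtattr *rta;
			int alen;

			if (h->nlmsg_type == NLMSG_DONE) {
				close(fd);
				return count;	/* end of the dump */
			}
			if (h->nlmsg_type == NLMSG_ERROR)
				goto fail;
			if (h->nlmsg_type != RTM_NEWLINK)
				continue;
			ifi = NLMSG_DATA(h);
			alen = (int)IFLA_PAYLOAD(h);
			/* Walk the attributes looking for the interface name. */
			for (rta = IFLA_RTA(ifi); RTA_OK(rta, alen);
			     rta = RTA_NEXT(rta, alen)) {
				if (rta->rta_type == IFLA_IFNAME) {
					printf("%d: %s\n", ifi->ifi_index,
					    (char *)RTA_DATA(rta));
					count++;
				}
			}
		}
	}
fail:
	close(fd);
	return -1;
}
```

A real getifaddrs would also need an RTM_GETADDR dump for the addresses
and more careful error and buffer-size handling; the sketch only covers
interface enumeration.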

I can post a patch for an if_nameindex implementation that
parses /proc/net/dev if you want, but I consider it more of a temp hack
than a real solution.

I wonder how many other applications will break due to unexpected
getifaddrs behavior...

-nc

PS. I had similar issues with uclibc some time ago and for uclibc the
solution was getifaddrs with netlink.
https://www.mail-archive.com/uclibc@uclibc.org/msg00933.html

--
[1] http://roy.marples.name/projects/dhcpcd/index
[2] http://roy.marples.name/projects/dhcpcd/artifact/9e50f49288d544dd
[3] http://roy.marples.name/projects/dhcpcd/artifact/bab4c52c9c23d06c
[4] http://dev.alpinelinux.org/~ncopa/musl/testcase/ifaddr.c
[5] http://man7.org/linux/man-pages/man3/getifaddrs.3.html#CONFORMING_TO
[6] http://man7.org/linux/man-pages/man3/if_nameindex.3.html#EXAMPLE
