|
Message-Id: <51064AE3-C5CC-4B6D-8098-6088EBE813D2@oracle.com> Date: Tue, 18 Aug 2015 19:13:32 -0700 From: Chuck Lever <chuck.lever@...cle.com> To: Rich Felker <dalias@...c.org> Cc: musl@...ts.openwall.com Subject: Re: Re: nfs-utils broken with musl: "select: Bad file descriptor" On Aug 18, 2015, at 6:51 PM, Rich Felker <dalias@...c.org> wrote: > On Tue, Aug 18, 2015 at 06:44:46PM -0700, Chuck Lever wrote: >> >> On Aug 18, 2015, at 6:24 PM, Rich Felker <dalias@...c.org> wrote: >> >>> On Tue, Aug 18, 2015 at 06:05:01PM -0700, Chuck Lever wrote: >>>>>> i think this call goes wrong: >>>>>> >>>>>> >>>>> http://git.linux-nfs.org/?p=steved/nfs-utils.git;a=blob;f=utils/statd/rmtcall.c;hb=HEAD#l56 >>>>> >>>>>> >>>>>> it loops for 100 iterations and if all ports are used >>>>>> according to getservbyport then it FD_SET(sockfd, &SVC_FDSET); >>>>>> with some random high sockfd (eg. 105) that is closed. >>>>>> >>>>>> ...so should getservbyport fail there? >>>>>> >>>>>> (according to strace it tries ports 883 to 982) >>>>> >>>>> I think the application's expectation is that it fail rather than >>>>> returning a decimal-string-only service entity. However it looks like >>>>> the code is written to handle the case where all 100 iterations fail >>>>> to get an anonymous port. The problem seems to be that, when the loop >>>>> stops due to hitting the iteration count rather than exiting with >>>>> break, i has already been incremented past the last tmp_socket slot, >>>>> so the close loop closes the fd that they actually want to use, later >>>>> causing EBADF. This is purely an application bug, but it happens not >>>>> to get noticed if getservbyport fails anywhere along the way, which >>>>> they expect to happen in the usual case. >>>> >>>> statd_get_socket() is hunting for a privileged source port that >>>> is not just unused at the moment, but that is also not going to be >>>> used by some other well-known service. This is a long-lived socket >>>> that statd uses to communicate with the kernel. It must use a >>>> privileged port. >>>> >>>> if getservbyport(3) is returning something for every port that >>>> is tried, then statd_get_socket() will fail to find a usable >>>> port. >>>> >>>> If it's returning 105, that suggests it has run out of retries. >>>> It should return -1 in this case. That is a logic bug. >>>> >>>> But is it true that every port returned by bindresvport(3) is >>>> actually defined in /etc/services? Surely there is one open >>>> port that can be used. What port does bindresvport(3) start >>>> with? >> >>> The logic bug is the count-down loop that closes all the temp sockets. >>> In the case where the loop terminates via break, it leaves the last >>> one open and only closes the extras. But in the case where where the >>> loop terminates via the end condition in the for statement, the close >>> loop closes all the sockets including the one it intends to use. >> >> OK. Do you have a patch? > > Something like this: > > + if (i == loopcnt) i--; > while (--i >= 0) > close(tmp_sockets[i]); > > I don't have a checkout of the source to make a proper diff, and I've > got too many things I'm trying to do at the moment already. But it > should be easy to make a real patch from what I wrote above. I'll see if I can put together a proper patch for Steve, then. But we're both ooo through at least next week. >> Still not clear why it would take 100 tries exactly. > > Because getservbyport never fails, and the code path with 100 failures > to find a port where getservbyport fails was never tested. We can (and > probably should) make it fail in musl when there's no named service to > go with the port, Ah ha. So musl's getservbyport(3) never returns NULL? http://pubs.opengroup.org/onlinepubs/9699919799/functions/endservent.html suggests that's incorrect behavior. Why wasn't this a problem before now? The port hunting logic hasn't changed since January 2013 (commit eb8229338f). > but that doesn't change that there's a bug in this > previously-untested codepath of nfs-utils that's the source of the > EBADF error from select. Agreed. Just trying to get a full picture. -- Chuck Lever
Powered by blists - more mailing lists
Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.