Message-Id: <0165A67A-873B-4662-A1AF-F5B133204F50@oracle.com>
Date: Tue, 18 Aug 2015 18:44:46 -0700
From: Chuck Lever <chuck.lever@...cle.com>
To: Rich Felker <dalias@...c.org>
Cc: musl@...ts.openwall.com
Subject: Re: Re: nfs-utils broken with musl: "select: Bad file descriptor"


On Aug 18, 2015, at 6:24 PM, Rich Felker <dalias@...c.org> wrote:

> On Tue, Aug 18, 2015 at 06:05:01PM -0700, Chuck Lever wrote:
>>>> i think this call goes wrong:
>>>> 
>>>> 
>>> http://git.linux-nfs.org/?p=steved/nfs-utils.git;a=blob;f=utils/statd/rmtcall.c;hb=HEAD#l56
>>> 
>>>> 
>>>> it loops for 100 iterations and if all ports are used
>>>> according to getservbyport then it FD_SET(sockfd, &SVC_FDSET);
>>>> with some random high sockfd (eg. 105) that is closed.
>>>> 
>>>> ...so should getservbyport fail there?
>>>> 
>>>> (according to strace it tries ports 883 to 982)
>>> 
>>> I think the application's expectation is that it fail rather than
>>> returning a decimal-string-only service entity. However it looks like
>>> the code is written to handle the case where all 100 iterations fail
>>> to get an anonymous port. The problem seems to be that, when the loop
>>> stops due to hitting the iteration count rather than exiting with
>>> break, i has already been incremented past the last tmp_socket slot,
>>> so the close loop closes the fd that they actually want to use, later
>>> causing EBADF. This is purely an application bug, but it happens not
>>> to get noticed if getservbyport fails anywhere along the way, which
>>> they expect to happen in the usual case.
>> 
>> statd_get_socket() is hunting for a privileged source port that
>> is not just unused at the moment, but that is also not going to be
>> used by some other well-known service. This is a long-lived socket
>> that statd uses to communicate with the kernel. It must use a
>> privileged port.
>> 
>> If getservbyport(3) is returning something for every port that
>> is tried, then statd_get_socket() will fail to find a usable
>> port.
>> 
>> If it's returning 105, that suggests it has run out of retries.
>> It should return -1 in this case. That is a logic bug.
>> 
>> But is it true that every port returned by bindresvport(3) is
>> actually defined in /etc/services? Surely there is one open
>> port that can be used. What port does bindresvport(3) start
>> with?

> The logic bug is the count-down loop that closes all the temp sockets.
> In the case where the loop terminates via break, it leaves the last
> one open and only closes the extras. But in the case where the
> loop terminates via the end condition in the for statement, the close
> loop closes all the sockets including the one it intends to use.

OK. Do you have a patch?

Still not clear why it would take 100 tries exactly.
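
For reference, here is a minimal sketch of the close-loop pattern Rich
describes above, as I understand it. It is not the statd code: the
descriptors are simulated with a small table, and every name in it
(fake_socket, fake_close, fd_is_open, pick_socket, usable_at, fixed) is
made up for illustration. The "fixed" path simply returns -1 when the
retries run out, which is one possible behavior and not a tested patch.

```c
#include <assert.h>

#define MAX_TRIES 100   /* retry limit, matching the 100 iterations above */
#define MAX_FDS   1024  /* size of the simulated descriptor table */

/* Hypothetical stand-ins for socket(2)/close(2): a "descriptor" is
 * just an index into is_open[]. */
static int is_open[MAX_FDS];
static int next_fd;

static int fake_socket(void)
{
    assert(next_fd < MAX_FDS);
    is_open[next_fd] = 1;
    return next_fd++;
}

static void fake_close(int fd)
{
    is_open[fd] = 0;
}

static int fd_is_open(int fd)
{
    return fd >= 0 && fd < MAX_FDS && is_open[fd];
}

/*
 * usable_at: iteration at which a port is accepted (assumed to model
 *            getservbyport() returning NULL); -1 means never.
 * fixed:     0 reproduces the reported behavior, 1 returns -1 when the
 *            loop ends without finding a port.
 */
static int pick_socket(int usable_at, int fixed)
{
    int tmp[MAX_TRIES];
    int fd = -1;
    int exhausted;
    int i;

    for (i = 0; i < MAX_TRIES; i++) {
        fd = fake_socket();
        if (i == usable_at)
            break;          /* keep fd; tmp[0..i-1] are the extras */
        tmp[i] = fd;        /* rejected port: park it, try again */
    }

    /* After a break, i indexes the kept socket, so only the extras
     * are closed. After the loop runs out, i == MAX_TRIES and
     * tmp[MAX_TRIES - 1] is the same descriptor as fd, so fd is
     * closed as well. */
    exhausted = (i == MAX_TRIES);
    while (--i >= 0)
        fake_close(tmp[i]);

    if (fixed && exhausted)
        return -1;          /* nothing usable is left open */
    return fd;
}
```

With fixed set to 0 and usable_at set to -1, the descriptor that comes
back has already been closed, which would match the EBADF from select.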


--
Chuck Lever


