Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20180905030932.GG1878@brightrain.aerifal.cx>
Date: Tue, 4 Sep 2018 23:09:32 -0400
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: Re: Planned robust mutex internals changes

On Fri, Aug 31, 2018 at 12:45:12PM -0400, Rich Felker wrote:
> Recent debugging with Adelie Linux & POSIX conformance tests gave me a
> bit of a scare about the safety of the current robust mutex
> implementation. It turned out the be completely unrelated (kernel
> problem) but it did remind me that there's some mildly sketchy stuff
> going on right now.
> 
> In order to implement ENOTRECOVERABLE, the current implementation
> changes a bit of the mutex type field to indicate that it's recovered
> after EOWNERDEAD and will go into ENOTRECOVERABLE state if
> pthread_mutex_consistent is not called before unlocking. While it's
> only the thread that holds the lock that needs access to this
> information (except possibly for the same of pthread_mutex_consistent
> choosing between EINVAL and EPERM for erroneous calls), the change to
> the type field is formally a data race with all other threads that
> perform any operation on the mutex. No individual bits race, and no
> write races are possible, so things are "okay" in some sense, but it's
> still not good.
> 
> What I plan to do is move this state to the mutex owner/lock field,
> which should be the only field of the object (aside from waiter count)
> that's mutable.
> 
> Currently, the lock field looks like this:
> 
> bits 0-29:  owner; 0 if unlocked or if owned died; -1 if unrecoverable
> bit 30:     owner died flag, also set if unrecoverable; otherwise 0
> bit 31:     new waiters flag
> 
> Ignoring bit 31, this means the possible values are 0 (unlocked), a
> tid for the owner, 0x40000000 after owner died, and 0x7fffffff when
> unrecoverable.
> 
> What I'd like too do is use 0x40000000|tid as the value for when the
> lock is held but pthread_mutex_consistent hasn't yet been called. Note
> that at the kernel level, bit 29 (0x20000000) is reserved not to be
> used as part of a tid, so this does not overlap with the special value
> 0x7fffffff. And there's still some room for extensibility, since we
> could use bit 31 along with values that don't indicate a mutex state
> that can be waited on.
> 
> With these changes, no fields of mutex object will be mutable except
> for the lock state (owner, futex) and the waiter count.
> 
> Anyway I'm posting this now as notes/reminder and for comments in case
> others see a problem with the proposal or ideas for improvement. I'll
> probably hold off on doing any of this until after release.

And here's the proposed patch.

Rich

View attachment "enotrecoverable2.diff" of type "text/plain" (3897 bytes)

Powered by blists - more mailing lists

Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.