Message-ID: <CANk+eUU+-dDso_OnmkVQO74S1ekX50UKFznihNmKhAcJ9hMSwQ@mail.gmail.com>
Date: Sat, 8 Feb 2025 10:20:45 +0100
From: Daniele Dario <d.dario76@...il.com>
To: Rich Felker <dalias@...c.org>
Cc: Florian Weimer <fweimer@...hat.com>, musl@...ts.openwall.com
Subject: Re: pthread_mutex_t shared between processes with different pid namespaces
But wouldn't this mean that robust mutex functionality is totally
incompatible with pid namespaces?
If the kernel relies on the tid stored in memory by the process, that value
always lacks the information about which pid namespace the tid belongs to.
Daniele.
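
To make the concern above concrete, here is a minimal C model of it. It is
not kernel or musl code: `struct task`, `lock_word_for` and
`looks_like_owner` are hypothetical names invented for the illustration, and
the only value taken from Linux is the FUTEX_TID_MASK constant. The model
assumes the lock word stores the owner's tid as seen in the owner's own
namespace, and that the owner check on task death compares that value with
the dying task's tid in its own namespace.

```c
#include <stdbool.h>
#include <stdint.h>

/* Same value as Linux's FUTEX_TID_MASK: low 30 bits hold the owner tid. */
#define TID_MASK 0x3fffffffu

/* Hypothetical model of a task: a pid namespace id plus the tid the task
 * sees for itself inside that namespace. */
struct task {
    uint32_t ns_id;
    uint32_t tid;
};

/* The lock word has room for the tid only; the namespace id is dropped. */
static uint32_t lock_word_for(struct task owner)
{
    return owner.tid & TID_MASK;
}

/* Owner check as modeled here: compare the stored tid with the dying
 * task's tid. The namespace cannot take part in the comparison because
 * it was never stored. */
static bool looks_like_owner(uint32_t lock_word, struct task dying)
{
    return (lock_word & TID_MASK) == (dying.tid & TID_MASK);
}
```

With two tasks that both have tid 7 in different namespaces, the word
written by one also matches the other, which is the ambiguity described
above.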
On Fri, 7 Feb 2025 at 17:19, Rich Felker <dalias@...c.org> wrote:
> On Thu, Feb 06, 2025 at 08:45:14AM +0100, Daniele Personal wrote:
> > On Wed, 2025-02-05 at 11:32 +0100, Florian Weimer wrote:
> > > * Daniele Personal:
> > >
> > > > On Tue, 2025-02-04 at 13:53 -0500, Rich Felker wrote:
> > > > > On Mon, Feb 03, 2025 at 06:25:41PM +0100, Florian Weimer wrote:
> > > > > > * Daniele Personal:
> > > > > >
> > > > > > > On Sat, 2025-02-01 at 17:03 +0100, Florian Weimer wrote:
> > > > > > > > * Daniele Personal:
> > > > > > > >
> > > > > > > > > > > Is this required for implementing the
> > > > > > > > > > > unlock-if-not-owner error code on mutex unlock?
> > > > > > > > >
> > > > > > > > > No, I don't see problems related to EOWNERDEAD.
> > > > > > > >
> > > > > > > > > Sorry, what I meant is that the TID is needed for efficient
> > > > > > > > > reporting of usage errors. It's not imposed by the robust
> > > > > > > > > list protocol as such.
> > > > > > > > > There could be a PID-namespace-compatible robust mutex type
> > > > > > > > > that does not have this problem (but with less error
> > > > > > > > > checking).
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > > Florian
> > > > > > > >
> > > > > > >
> > > > > > > > Are you saying that there are pthread mutexes which can be
> > > > > > > > shared across processes running in different pid namespaces?
> > > > > > > > If yes, I'm definitely interested in this. Can you tell me
> > > > > > > > something more?
> > > > > >
> > > > > > > You would have to add a new mutex type that is a mix of
> > > > > > > PTHREAD_MUTEX_NORMAL and PTHREAD_MUTEX_ROBUST. Closer to the
> > > > > > > latter, but without the ownership checks.
> > > > >
> > > > > > This is inaccurate. Robust mutexes fundamentally depend on having
> > > > > > the owner's tid in the owner field, and on this value not matching
> > > > > > the tid of any other task that might hold the mutex. If these
> > > > > > properties don't hold, the mutex may fail to unlock when the owner
> > > > > > dies, or incorrectly unlock when another task mimicking the owner
> > > > > > dies.
> > > > >
> > > > > > The Linux robust mutex protocol fundamentally does not work
> > > > > > across pid namespaces.
> > >
> > > Thank you, Rich, for the correction.
> > >
> > > > > Looking at the code for musl 1.2.4, a pthread_mutex_t which has
> > > > > been initialized as shared and robust but not PI capable leaves
> > > > > uncovered only the case of pthread_mutex_unlock().
> > >
> > > > > As mentioned by Rich, since TIDs are not unique across different
> > > > > namespaces, a task might unlock a mutex held by another one if they
> > > > > have the same TID.
> > > >
> > > > I don't see other possible errors, am I missing something?
> > >
> > > The kernel code uses the owner TID to handle some special cases:
> > >
> > > /*
> > > * Special case for regular (non PI) futexes. The unlock
> > > path in
> > > * user space has two race scenarios:
> > > *
> > > * 1. The unlock path releases the user space futex value
> > > and
> > > * before it can execute the futex() syscall to wake up
> > > * waiters it is killed.
> > > *
> > > * 2. A woken up waiter is killed before it can acquire the
> > > * futex in user space.
> > > *
> > > * In the second case, the wake up notification could be
> > > generated
> > > * by the unlock path in user space after setting the futex
> > > value
> > > * to zero or by the kernel after setting the OWNER_DIED bit
> > > below.
> > > *
> > > * In both cases the TID validation below prevents a wakeup
> > > of
> > > * potential waiters which can cause these waiters to block
> > > * forever.
> > > *
> > > * In both cases the following conditions are met:
> > > *
> > > * 1) task->robust_list->list_op_pending != NULL
> > > * @pending_op == true
> > > * 2) The owner part of user space futex value == 0
> > > * 3) Regular futex: @pi == false
> > > *
> > > * If these conditions are met, it is safe to attempt waking
> > > up a
> > > * potential waiter without touching the user space futex
> > > value and
> > > * trying to set the OWNER_DIED bit. If the futex value is
> > > zero,
> > > * the rest of the user space mutex state is consistent, so
> > > a woken
> > > * waiter will just take over the uncontended futex. Setting
> > > the
> > > * OWNER_DIED bit would create inconsistent state and
> > > malfunction
> > > * of the user space owner died handling. Otherwise, the
> > > OWNER_DIED
> > > * bit is already set, and the woken waiter is expected to
> > > deal with
> > > * this.
> > > */
> > > owner = uval & FUTEX_TID_MASK;
> > >
> > > if (pending_op && !pi && !owner) {
> > > futex_wake(uaddr, FLAGS_SIZE_32 | FLAGS_SHARED, 1,
> > > FUTEX_BITSET_MATCH_ANY);
> > > return 0;
> > > }
> > >
> > > > As a result, it's definitely just a userspace-only change if you need
> > > > to use the robust mutex list across PID namespaces.
> > >
> >
> > > I tried to understand what you mean here but can't: can you please
> > > explain to me which userspace-only change is needed?
>
> No such change is possible. Robust futexes inherently rely on the
> kernel being able to evaluate, on async process death, whether the
> dying task was the owner of a mutex in the robust list. This depends
> on the tid stored in memory being an accurate and unique identifier
> for the task. If you violate this, you can hack things to make the
> userspace side work, but the whole robust functionality you want will
> fail to work.
>
> Rich
>
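
For reference, the kind of mutex discussed in this thread (process-shared
and robust, not PI) can be set up roughly as in the sketch below. It uses
only standard POSIX calls; the helper names `shared_robust_mutex_create` and
`robust_lock` are hypothetical, and the anonymous shared mapping is an
assumption standing in for whatever shared memory the processes really use.
Per the discussion above, the EOWNERDEAD recovery should only be expected to
work when all users of the mutex live in the same pid namespace.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <pthread.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper: place a process-shared, robust mutex in an
 * anonymous shared mapping (inherited by children across fork()).
 * Returns NULL on failure. */
static pthread_mutex_t *shared_robust_mutex_create(void)
{
    pthread_mutexattr_t attr;
    pthread_mutex_t *m;
    int err;

    m = mmap(NULL, sizeof *m, PROT_READ | PROT_WRITE,
             MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (m == MAP_FAILED)
        return NULL;

    err = pthread_mutexattr_init(&attr);
    if (!err) {
        err = pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
        if (!err)
            err = pthread_mutexattr_setrobust(&attr, PTHREAD_MUTEX_ROBUST);
        if (!err)
            err = pthread_mutex_init(m, &attr);
        pthread_mutexattr_destroy(&attr);
    }
    if (err) {
        munmap(m, sizeof *m);
        return NULL;
    }
    return m;
}

/* Hypothetical helper: lock the mutex and, if the previous owner died
 * while holding it, mark the mutex consistent again. *recovered is set
 * to 1 in that case so the caller knows it must repair the protected
 * data before relying on it. Returns 0 or an errno-style error code. */
static int robust_lock(pthread_mutex_t *m, int *recovered)
{
    int err = pthread_mutex_lock(m);

    *recovered = 0;
    if (err == EOWNERDEAD) {
        *recovered = 1;
        err = pthread_mutex_consistent(m);
    }
    return err;
}
```

A caller would create the mutex once before forking, use robust_lock() in
every process, and release it with pthread_mutex_unlock() as usual.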