Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <87wme4hebo.fsf@oldenburg.str.redhat.com>
Date: Wed, 05 Feb 2025 11:32:27 +0100
From: Florian Weimer <fweimer@...hat.com>
To: Daniele Personal <d.dario76@...il.com>
Cc: Rich Felker <dalias@...c.org>,  musl@...ts.openwall.com
Subject: Re: pthread_mutex_t shared between processes with different
 pid namespaces

* Daniele Personal:

> On Tue, 2025-02-04 at 13:53 -0500, Rich Felker wrote:
>> On Mon, Feb 03, 2025 at 06:25:41PM +0100, Florian Weimer wrote:
>> > * Daniele Personal:
>> > 
>> > > On Sat, 2025-02-01 at 17:03 +0100, Florian Weimer wrote:
>> > > > * Daniele Personal:
>> > > > 
>> > > > > > Is this required for implementing the unlock-if-not-owner
>> > > > > > error
>> > > > > > code
>> > > > > > on mutex unlock?
>> > > > > 
>> > > > > No, I don't see problems related to EOWNERDEAD.
>> > > > 
>> > > > Sorry, what I meant is that the TID is needed for efficient
>> > > > reporting
>> > > > of
>> > > > usage errors.  It's not imposed by the robust list protocol as
>> > > > such..
>> > > > There could be a PID-namespace-compatible robust mutex type
>> > > > that does
>> > > > not have this problem (but with less error checking).
>> > > > 
>> > > > Thanks,
>> > > > Florian
>> > > > 
>> > > 
>> > > Are you saying that there are pthread_mutexes which can be shared
>> > > across processes run on different pid namespaces? If yes I'm
>> > > definitely
>> > > interested on this. Can you tell me something more?
>> > 
>> > You would have to add a new mutex type that is a mix of
>> > PTHREAD_MUTEX_NORMAL amd PTHREAD_MUTEX_ROBUST.  Closer to the
>> > latter,
>> > but without the ownership checks.
>> 
>> This is inaccurate. Robust mutexes fundamentally depend on having the
>> owner's tid in the owner field, and on this value not matching the
>> tid of any other task that might hold the mutex. If these properties
>> don't hold, the mutex may fail to unlock when the owner dies, or
>> incorrectly unlock when another task mimicking the owner dies.
>> 
>> The Linux robust mutex protocol fundamentally does not work across
>> pid namespaces.

Thank you, Rich, for the correction.

> Looking at the code for musl 1.2.4, a pthread_mutex_t which has been
> initialized as shared and robust but not PI capable leaves uncovered
> only the case of pthread_mutex_unlock().

> As mentioned by Rich, since TIDs are not unique across different
> namespaces, a task might unlock a mutex hold by another one if they
> have the same TID.
>
> I don't see other possible errors, am I missing something?

The kernel code uses the owner TID to handle some special cases:

	/*
	 * Special case for regular (non PI) futexes. The unlock path in
	 * user space has two race scenarios:
	 *
	 * 1. The unlock path releases the user space futex value and
	 *    before it can execute the futex() syscall to wake up
	 *    waiters it is killed.
	 *
	 * 2. A woken up waiter is killed before it can acquire the
	 *    futex in user space.
	 *
	 * In the second case, the wake up notification could be generated
	 * by the unlock path in user space after setting the futex value
	 * to zero or by the kernel after setting the OWNER_DIED bit below.
	 *
	 * In both cases the TID validation below prevents a wakeup of
	 * potential waiters which can cause these waiters to block
	 * forever.
	 *
	 * In both cases the following conditions are met:
	 *
	 *	1) task->robust_list->list_op_pending != NULL
	 *	   @pending_op == true
	 *	2) The owner part of user space futex value == 0
	 *	3) Regular futex: @pi == false
	 *
	 * If these conditions are met, it is safe to attempt waking up a
	 * potential waiter without touching the user space futex value and
	 * trying to set the OWNER_DIED bit. If the futex value is zero,
	 * the rest of the user space mutex state is consistent, so a woken
	 * waiter will just take over the uncontended futex. Setting the
	 * OWNER_DIED bit would create inconsistent state and malfunction
	 * of the user space owner died handling. Otherwise, the OWNER_DIED
	 * bit is already set, and the woken waiter is expected to deal with
	 * this.
	 */
	owner = uval & FUTEX_TID_MASK;

	if (pending_op && !pi && !owner) {
		futex_wake(uaddr, FLAGS_SIZE_32 | FLAGS_SHARED, 1,
			   FUTEX_BITSET_MATCH_ANY);
		return 0;
	}

As a result, it's definitely just a userspace-only change if you need to
use the robust mutex list across PID namespaces.

Thanks,
Florian

Powered by blists - more mailing lists

Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.