musl - Re: pthread_mutex_t shared between processes with different pid namespaces

Message-ID: <20250210181402.GA10433@brightrain.aerifal.cx>
Date: Mon, 10 Feb 2025 13:14:03 -0500
From: Rich Felker <dalias@...c.org>
To: Daniele Personal <d.dario76@...il.com>
Cc: Florian Weimer <fweimer@...hat.com>, musl@...ts.openwall.com
Subject: Re: pthread_mutex_t shared between processes with different
 pid namespaces

On Mon, Feb 10, 2025 at 05:12:52PM +0100, Daniele Personal wrote:
> On Sat, 2025-02-08 at 09:52 -0500, Rich Felker wrote:
> > On Sat, Feb 08, 2025 at 03:40:18PM +0100, Daniele Dario wrote:
> > > Il sab 8 feb 2025, 13:39 Rich Felker <dalias@...c.org> ha scritto:
> > > 
> > > > On Sat, Feb 08, 2025 at 10:20:45AM +0100, Daniele Dario wrote:
> > > > > But wouldn't this mean that robust mutexes functionality is totally
> > > > > incompatible with pid namespaces?
> > > > 
> > > > No, only with trying to synchronize *across* different pid
> > > > namespaces.
> > > > 
> > > > > If the kernel relies on tid stored in memory by the process this always
> > > > > lacks the information about the pid namespace the tid belongs to.
> > > > 
> > > > It's necessarily within the same pid namespace as the process
> > > > itself.
> > > > 
> > > > Functionally, you should consider different pid namespaces as
> > > > different systems that happen to be capable of sharing some
> > > > resources.
> > > > 
> > > > Rich
> > > > 
> > > 
> > > Yes, I'm just saying that sharing pthread_mutex_t instances across
> > > processes within the same pid namespace, but on a system with more
> > > than one pid namespace, could lead to issues anyway if the kernel
> > > uses the stored tid value to decide whom to contact without knowing
> > > which pid namespace it belongs to.
> > > 
> > > I'm not saying this is true; I'm trying to understand and, if possible,
> > > improve things.
> > 
> > That's not a problem. The stored tid is used only in the context of a
> > process exiting, where the kernel code knows the relevant pid
> > namespace (the one the exiting process is in) and uses the tid
> > relative to that. If it didn't work this way, it would be a fatal bug
> > in the pid namespace implementation, which is supposed to allow
> > essentially transparent containerization (which includes processes in
> > the ns being able to use their tids as they could if they were outside
> > of any container/in global ns).
> > 
> > Rich
> > 
> 
> So, IIUC, the problem of sharing robust pthread_mutex_t instances
> across different pid namespaces is on the user space side which is not
> able to distinguish clashes on TIDs. In particular, problems could
> arise when:

No, it is not "on the user side". The user side can be modified
arbitrarily, and, modulo some cost, could surely be made to work for
non-robust process-shared mutexes. The problem is that the kernel --
the part which makes them robust -- has to honor the protocol, and the
protocol does not admit distinguishing "pid N in ns X" from "pid N in
ns Y".
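
To illustrate the point, here is a minimal sketch, assuming the Linux
robust futex ABI in which the 32-bit lock word carries the owner's tid in
its low bits plus two flag bits. The constants come from <linux/futex.h>;
the helper names are hypothetical and are not musl internals.

```c
#include <stdbool.h>
#include <stdint.h>
#include <linux/futex.h> /* FUTEX_TID_MASK, FUTEX_OWNER_DIED, FUTEX_WAITERS */

/* Hypothetical helper: extract the owner tid from a robust futex word. */
static inline uint32_t owner_tid(uint32_t word)
{
    return word & FUTEX_TID_MASK;
}

/* Hypothetical helper: did the kernel mark the previous owner as dead? */
static inline bool owner_died(uint32_t word)
{
    return (word & FUTEX_OWNER_DIED) != 0;
}

/*
 * Hypothetical helper: the only ownership test the word allows is a bare
 * tid comparison. Nothing in the word says which pid namespace the tid
 * was allocated in, so tid 42 in ns X and tid 42 in ns Y compare equal.
 */
static inline bool same_owner(uint32_t word, uint32_t my_tid)
{
    return owner_tid(word) == my_tid;
}
```

With these helpers, a word written by tid 42 in one namespace makes
same_owner(word, 42) true for an unrelated thread that happens to be
tid 42 in another namespace; there are no spare bits to tell them apart.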

>  * an application tries to unlock a mutex owned by another one with its
>    same TID but on a different pid namespace (but this is an
>    application design problem and libc can't help because TIDs are not
>    unique across different pid namespaces)
>  * an application tries to lock a mutex owned by another one with its
>    same TID but on a different pid namespace: this is a real issue
>    because it could happen
> 
> I know that pid namespace isolation usually comes also with ipc
> namespace isolation but it is not a violation to have one without the
> other. Wouldn't it be a good idea to figure out a way to have a safe
> way to use robust mutexes shared across different pid namespaces?

I do not consider this a reasonable expenditure of complexity
whatsoever. It would require at least having a new robust list
protocol, with userspace having to support both the old and new ones
adapting at runtime, and may even require larger-than-wordsize
atomics, which are not something you can assume exists. All of this
for the explicit purpose of *violating* the whole intended purpose of
namespaces: the isolation.

For cases where you really need cross-ns locking, you could use sysv
semaphores if the sysvipc namespace is shared. If it's not, you could
use fcntl OFD locks on a shared file descriptor, which should have
your needed robustness properties.
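
A minimal sketch of the OFD lock approach, assuming Linux and assuming
each process open()s the shared lock file itself, so that each one holds
its own open file description. The helper names are hypothetical. The
kernel drops an OFD lock when the last descriptor referring to that open
file description is closed, which includes process death, and no tid is
involved anywhere.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* asks the libc headers to expose the F_OFD_* commands */
#endif
#include <fcntl.h>
#include <unistd.h>

/* Linux values, used only if the headers did not expose them. */
#ifndef F_OFD_SETLK
#define F_OFD_SETLK 37
#endif
#ifndef F_OFD_SETLKW
#define F_OFD_SETLKW 38
#endif

/* Hypothetical helper: apply one whole-file OFD lock operation. */
static inline int ofd_op(int fd, int cmd, short type)
{
    struct flock fl = {
        .l_type = type,
        .l_whence = SEEK_SET,
        .l_start = 0,
        .l_len = 0, /* 0 means "to end of file", i.e. the whole file */
        .l_pid = 0, /* must be 0 for the F_OFD_* commands */
    };
    return fcntl(fd, cmd, &fl);
}

/* Block until the exclusive lock is acquired. */
static inline int ofd_lock(int fd)
{
    return ofd_op(fd, F_OFD_SETLKW, F_WRLCK);
}

/* Try to acquire without blocking; returns -1 if someone else holds it. */
static inline int ofd_trylock(int fd)
{
    return ofd_op(fd, F_OFD_SETLK, F_WRLCK);
}

/* Release the lock. */
static inline int ofd_unlock(int fd)
{
    return ofd_op(fd, F_OFD_SETLK, F_UNLCK);
}
```

Each participant would call ofd_lock() and ofd_unlock() on its own
descriptor for the same file. Unlike a robust mutex, there is no
EOWNERDEAD notification: if the holder dies, the next locker simply gets
the lock and has to validate the protected data by other means.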

Rich