![]() |
|
Message-ID: <20250209132826.GB5321@brightrain.aerifal.cx> Date: Sun, 9 Feb 2025 08:28:27 -0500 From: Rich Felker <dalias@...c.org> To: libc-coord@...ts.openwall.com Cc: "Zhu, Yifan" <yzhu104@...Rochester.edu>, Matthew Parkinson <mattpark@...rosoft.com> Subject: Re: Re: [EXT] Re: pthread_atfork concurrent semantics On Sat, Feb 08, 2025 at 09:20:18AM +0000, Matthew Parkinson wrote: > Thanks so much for the replies. > > > Can you clarify what kind of synchronization you have in mind whereby > > these events are distinguishably concurrent? > > The synchronisation I am after is for malloc and free from snmalloc > to be fork safe. I am attempting to do this with just > pthread_atfork, and not assuming specific calls like glibc's malloc > uses, or that FreeBSD provides. I also don't want to assume there is > an initialisation function called before any threads are created. > > There are three challenging cases > 1. One thread calls fork, while a parallel thread does the first call on snmalloc. > 2. Same as (1), but there are multiple threads calling into snmalloc > 3. There are other pthread_atfork handlers created that call into > snmalloc, these may be order before or after snmalloc's > pthread_atfork handler. > > My current solution is part of an on going PR: > https://github.com/microsoft/snmalloc/pull/735 > I have attached the file that actually performs the synchronisation. > > It provides an RAII class that can be used to protect any access to > sensitive structures, and will cause the prefork handler to wait > until no threads are inside a sensitive part of snmalloc. > > It uses thread_local state to detect re-entrancy of the RAII > structure, and also to allow other prefork/postfork handlers to > execute operations on snmalloc. > > There is a second part of thread_local state to detect is the > pthread_atfork handlers have been installed multiple times, and only > performing the waiting logic in the first call to the prefork > handler, and the last call to the postfork handler. > > I believe the attached code would function correctly on Musl and > older glibc. > > The issue in my earlier message is extracted from this code based on > scenario (1) above (a single thread calling snmalloc for the first > time in a parallel with a fork). In this instance, it is possible > for snmalloc to assume the pthread_atfork handler will be executed > by any fork, enter my fork protected region, and then acquire some > internal lock. While the parallel fork complete ignores the > pthread_atfork handler, and then the child's version of snmalloc is > deadlocked. > > As Yifan Zhu said snmalloc implements its own locking primitives > with a modified version of the MCS queue lock, and calls into > futex/waitonaddress. OK, my take on this so far: glibc is buggy and has a non-usable pthread_atfork by virtue of pthread_atfork allowing forward progress past its return (thus letting the application take locks that should be precluded while forking) when there is another thread in the process of forking without seeing the newly added atfork handler (and which could not run the new handler even if it did see it, due to ordering requirements). I do think glibc was right to fix it so that application code in atfork handlers doesn't run with a lock held. While POSIX is unclear on the matter, it doesn't state any restriction on calling pthread_atfork from such a context, and generally it's wrong for an implementation to add gratuitous restrictions of this sort. Further analysis on what the right fix should be is probably needed, and I'd like to see this coordinated with the standard. For example, if an atfork handler is added from a prefork handler, since the new prefork handler did not run in the parent, certainly the corresponding postfork handlers should not run either during the same fork, but this isn't specified at present. I suspect the right behavior pthread_atfork will be to defer return until any in-process fork in another thread is finished, so that the caller can't take locks that the new prefork handler would have taken, and that each fork should effectively take a snapshot of the state of atfork handlers before starting, rather than repeatedly locking, examining, and unlocking the list, so that it doesn't inadvertently run a postfork handler for which the corresponding prefork handler didn't run. Rich
Powered by blists - more mailing lists
Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.