Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <a3b12a6a-a167-43b4-b99f-b9b8210d7c6f@gmail.com>
Date: Sat, 27 Jul 2024 00:40:42 +0200
From: Gabriel Ravier <gabravier@...il.com>
To: libc-coord@...ts.openwall.com, Carlos O'Donell <carlos@...hat.com>,
 musl@...ts.openwall.com,
 Adhemerval Zanella Netto <adhemerval.zanella@...aro.org>,
 enh <enh@...gle.com>
Cc: libc-alpha@...rceware.org
Subject: Re: [libc-coord] Re: Re: [libc-coord] Making exit actually
 thread-safe

On 7/26/24 11:00 PM, Carlos O'Donell wrote:
> On 7/25/24 8:48 AM, Adhemerval Zanella Netto wrote:
>>
>>
>> On 25/07/24 09:39, enh wrote:
>>> On Wed, Jul 24, 2024 at 8:19 PM Rich Felker <dalias@...c.org> wrote:
>>>>
>>>> On Wed, Jul 24, 2024 at 05:21:00PM -0400, enh wrote:
>>>>> you didn't want to go with the recursive mutex variant mentioned? i'm
>>>>> convinced by this change for Android too, but was leaning towards the
>>>>> recursive mutex myself...
>>>>
>>>> The change I'm advocating for first is a minimal one, just making
>>>> calls from other threads well-defined by blocking until the process
>>>> terminates. This is a trivial change that any implementation can adopt
>>>> without breaking anything else, and doesn't have any potential
>>>> far-reaching consequences.
>>>>
>>>> While some implementations may want to allow (or feel they already
>>>> allow, often by accident)
>>>
>>> yeah, i think that was what made me lean toward the recursive mutex
>>> --- the assumption that _that's_ the option least likely to break
>>> anyone. (i actually thought that was the point of even mentioning it
>>> in the proposal --- the assumption that someone somewhere has an
>>> atexit() handler that calls exit(). normally at this point i'd say "if
>>> i've learned one thing in a decade+ of dealing with Android's libc and
>>> the third-party binary app ecosystem, it's that no matter how crazy a
>>> thing, if you can imagine it, someone's relying on it already", but
>>> since "exiting" isn't really a thing on Android -- you're either
>>> backgrounded or kill -9'ed, and don't typically have any kind of
>>> "quit" functionality yourself -- this is one place where it seems
>>> relatively unlikely.)
>>>
>>>> recursive calls to exit, imposing a
>>>> requirement to do this without a deep dive into where that might lead
>>>> seems like a bad idea to me. Even if it is desirable, it's something
>>>> that could be considered separately without having the thread-safety
>>>> issue blocked on it.
>>>>
>>>> By leaving the recursive case undefined as it was before, any
>>>> implementations that want to do that or keep doing that are free to do
>>>> so.
>>>
>>> aye, but a program that calls exit() from an atexit() handler is
>>> working for me right now on Android, glibc, and macOS. so there's a
>>> user-visible behavior change here for any of those libcs that goes
>>> with a non-recursive mutex. (i think the same is true for musl too,
>>> but don't have a musl-based system to test on.)
>>
>> I think it is reasonable to not add the constraint to allow recursive
>> exit, although making this implementation defined will most likely
>> pressure to eventually have the resolution on the most used behavior
>> (unless it is broken by design).
>>
>> At least for glibc, my plan is to keep current support of allowing it
>> so mostly likely we will use a recursive mutex.
> 
> I agree, it's the most conservative thing. I just reviewed your patch on libc-alpha.
> Thanks for working on that.
> 
> I'm *tempted* to try adding an abort() in the case of atexit() calling exit() and
> running a Fedora mass-rebuild to see what triggers :}
> 

An example I can give, though I have no idea whether it would trigger 
during a Fedora mass-rebuild, is that GNU findutils's xargs 
implementation will, upon seeing several processes exit with status 255 
while running in parallel mode (with `-P n` where `n > 0`), call exit() 
from an atexit() handler.

That is, I'm pretty sure that on Fedora (or any other system with GNU 
findutils), with your proposed addition of abort(), running the command:


echo $(seq 1 100) | xargs -P5 -n1 sh -c 'sleep "$0"; echo "$0"; exit 255'


would result in abort() being called. (note: 
https://savannah.gnu.org/bugs/?func=detailitem&item_id=64451 touches on 
this issue, with the exact behavior on this having changed from GNU 
findutils 4.9.0 to its latest git, but it still calls exit() twice)

Powered by blists - more mailing lists

Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.