Message-ID: <20240816155140.GR10433@brightrain.aerifal.cx>
Date: Fri, 16 Aug 2024 11:51:41 -0400
From: Rich Felker <dalias@...c.org>
To: Markus Wichmann <nullplan@....net>
Cc: musl@...ts.openwall.com, Zibin Liu <ghostfly23333@...il.com>
Subject: Re: ptc in pthread

On Fri, Aug 16, 2024 at 04:38:39PM +0200, Markus Wichmann wrote:
> On Fri, Aug 16, 2024 at 10:51:53AM +0800, Zibin Liu wrote:
> > Despite this, I’m still unclear on why dlopen needs to ensure that the
> > thread count does not increase. Could someone provide more details on
> > this?
> 
> This is in case a library is opened that contains TLS. In that case, the
> thread calling dlopen() must allocate a new TLS block for the library
> for every thread that currently exists, as well as a new DTV to contain
> the pointers. If a thread could be created during this, obviously there
> could be a thread created without that TLS block.
> 
> musl doesn't use the lazy TLS initialization scheme glibc uses, because
> that one admits no failure. In that scheme, memory for the new TLS is
> allocated in __tls_get_addr(), but if allocation fails, there is no
> choice but to abort. In musl's implementation, the memory is allocated
> in dlopen(), and if it cannot be allocated, the dlopen() fails.
> 
> The lock cannot be reduced in scope to the TLS installation, since each
> library can pull in dependencies that can also have TLS.

I should probably go into a little bit more detail on this.

Since dynamic loading involves dynamic TLS, pthread_create and dlopen
need a contract between them for who is responsible for allocating
memory for dynamically loaded modules' TLS.

The way we do this is by making the operations ordered with respect to
each other, via a lock.

When pthread_create happens, it is responsible for allocation of TLS
storage for all modules that existed (as a result of initial program
load or dlopen) prior to the pthread_create call (prior to it taking
the __acquire_ptc lock).

When dlopen happens, it is responsible for allocation of TLS storage
for all threads that existed prior to the dlopen call (prior to it
taking the __inhibit_ptc lock).

One might think dlopen could release the ptc lock earlier, once it
finishes loading libraries and before it does the more time-costly
relocation part. However, at that point success of the dlopen isn't
committed, so pthread_create could see a wrong, speculative value of
the needed TLS size, one that ends up getting reverted before dlopen
returns.

Rich
