musl - Re: pthread_getcpuclockid and POSIX

Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20241009173123.GJ10433@brightrain.aerifal.cx>
Date: Wed, 9 Oct 2024 13:31:23 -0400
From: Rich Felker <dalias@...c.org>
To: Markus Wichmann <nullplan@....net>
Cc: musl@...ts.openwall.com
Subject: Re: pthread_getcpuclockid and POSIX

On Tue, Oct 08, 2024 at 05:26:22PM +0200, Markus Wichmann wrote:
> Hi all,
> 
> I am already sorry in advance that this is hardly going to contain
> constructive criticism, because at this point, I see a few problems and
> no real solutions.
> 
> pthread_getcpuclockid() is a function that is supposed to return a
> clockid that can be passed to the clock_* and timer_* functions and
> references the CPU time of the thread identified by its handle.
> Currently, we simply return an encoding of the raw kernel TID. This
> means the clockid becomes invalid as soon as the thread terminates.
> 
> Minor gripe: Isn't reading the TID from another thread's descriptor
> without holding the killlock a potential data race?

Yes, at least the way we currently clear the tid on exit.

> Major gripe: If called on a zombie thread, this successfully returns a
> clockid that is equivalent to CLOCK_THREAD_CPUTIME_ID. glibc returns
> ESRCH in that case, which POSIX neither condones nor condemns. It says
> in an informative section that that should be returned for thread
> handles used after the end of life of a thread, but the definitions
> section in XSH explicitly includes the zombie phase in the thread
> lifetime.

The specification includes the text "if the thread specified by
thread_id exists", which does not exactly match wording about
pthread_t lifetime, so arguably there may be an allowance not to do so
for threads that have exited. But indeed the bigger problem is that:

> POSIX doesn't really say how long such a clockid should be valid, but I
> should think the life of the thread at least, and again, that includes
> the zombie phase. But our current implementation doesn't deliver that.
> If the clockid is retrieved after a thread becomes a zombie, then we
> return a valid clockid that doesn't do what the caller thinks it does,
> and if called before that, we return a clockid that might not work when
> used (if the thread exited between the calls to pthread_getcpuclockid()
> and clock_gettime()) or, worse, might refer to a completely different
> thread (if a new thread with the same TID was created).
> 
> What can be done?

I think this should be raised with the Austin Group, probably
requesting an allowance that use after the thread has returned from
its start function or called pthread_exit is undefined.

> If we could somehow encode the thread descriptor into
> the clock ID, then clock_gettime() could recognize those CPU times and
> do some special handling. For example, it could take the killlock on the
> thread and then perform the syscall only if the thread is still alive.
> pthread_exit() could store the final CPU time in the thread descriptor
> at the same time as it is clearing out the TID, and clock_gettime()
> could return that if the thread is no longer alive. Of course, all the
> other clock_* and timer_* functions would have to change as well.

This also incurs significant cost for functionality that is likely
never to be used.

> Alas, this can probably not be done, as clockid_t is defined to be an
> int, and we cannot define it to be larger without breaking ABI.
> Shoveling 64 pounds of stuff into a 32-pound bag has never worked
> before, so I don't think we can represent a pthread_t in a clockid_t.
> Let alone doing so in a way that preserves the static clocks already in
> use (and compiled into binary programs), and the dynamic process CPU
> time clocks clock_getcpuclockid() returns.

The other option, which would also greatly simplify our recursive and
errorchecking mutexes not to need to track ownership in the robust
list just to make a fake-owner in the case where a thread exits with a
lock held, would be to defer exit of the kernel thread until join for
joinable threads. That's also fairly high cost, but at least it
recovers some cost we spent elsewhere, so maybe it could be considered
if Austin Group is uncooperative about making this less costly.

> I also struggle to find any users of this interface that aren't always
> calling it with the first argument set to pthread_self(). I did find
> OpenJDK, which calls it on its main thread. But otherwise, nothing.

Yes, it's an extraordinarily useless source of implementation costs...

Rich
Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.