musl - pthread_getcpuclockid and POSIX

Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <ZwVPHvxufLMVs0WR@voyager>
Date: Tue, 8 Oct 2024 17:26:22 +0200
From: Markus Wichmann <nullplan@....net>
To: musl@...ts.openwall.com
Subject: pthread_getcpuclockid and POSIX

Hi all,

I am already sorry in advance that this is hardly going to contain
constructive criticism, because at this point, I see a few problems and
no real solutions.

pthread_getcpuclockid() is a function that is supposed to return a
clockid that can be passed to the clock_* and timer_* functions and
references the CPU time of the thread identified by its handle.
Currently, we simply return an encoding of the raw kernel TID. This
means the clockid becomes invalid as soon as the thread terminates.

Minor gripe: Isn't reading the TID from another thread's descriptor
without holding the killlock a potential data race?

Major gripe: If called on a zombie thread, this successfully returns a
clockid that is equivalent to CLOCK_THREAD_CPUTIME_ID. glibc returns
ESRCH in that case, which POSIX neither condones nor condemns. It says
in an informative section that that should be returned for thread
handles used after the end of life of a thread, but the definitions
section in XSH explicitly includes the zombie phase in the thread
lifetime.

POSIX doesn't really say how long such a clockid should be valid, but I
should think the life of the thread at least, and again, that includes
the zombie phase. But our current implementation doesn't deliver that.
If the clockid is retrieved after a thread becomes a zombie, then we
return a valid clockid that doesn't do what the caller thinks it does,
and if called before that, we return a clockid that might not work when
used (if the thread exited between the calls to pthread_getcpuclockid()
and clock_gettime()) or, worse, might refer to a completely different
thread (if a new thread with the same TID was created).

What can be done? If we could somehow encode the thread descriptor into
the clock ID, then clock_gettime() could recognize those CPU times and
do some special handling. For example, it could take the killlock on the
thread and then perform the syscall only if the thread is still alive.
pthread_exit() could store the final CPU time in the thread descriptor
at the same time as it is clearing out the TID, and clock_gettime()
could return that if the thread is no longer alive. Of course, all the
other clock_* and timer_* functions would have to change as well.

Alas, this can probably not be done, as clockid_t is defined to be an
int, and we cannot define it to be larger without breaking ABI.
Shoveling 64 pounds of stuff into a 32-pound bag has never worked
before, so I don't think we can represent a pthread_t in a clockid_t.
Let alone doing so in a way that preserves the static clocks already in
use (and compiled into binary programs), and the dynamic process CPU
time clocks clock_getcpuclockid() returns.

I also struggle to find any users of this interface that aren't always
calling it with the first argument set to pthread_self(). I did find
OpenJDK, which calls it on its main thread. But otherwise, nothing.

Ciao,
Markus
Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.