Message-ID: <20200219141335.GF1663@brightrain.aerifal.cx>
Date: Wed, 19 Feb 2020 09:13:35 -0500
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: Re: race condition in sem_wait

On Wed, Feb 19, 2020 at 09:26:30AM +0100, Sebastian Gottschall wrote:
> 
> On 19.02.2020 at 04:39, Rich Felker wrote:
> >On Wed, Feb 19, 2020 at 01:46:34AM +0100, Sebastian Gottschall wrote:
> >>Hello
> >>
> >>I recently discovered a race condition while playing with threads
> >>and sem_wait/sem_post.
> >>sem_wait may fail with errno set to EAGAIN, which is not valid since
> >>only sem_trywait is allowed to set that error code.
> >>This was causing a bug with a later select() and accept(), which
> >>failed since accept does not work if errno is set to EAGAIN.
> >>From my point of view the bug is in sem_timedwait.c:
> >>
> >>         if (!sem_trywait(sem)) return 0;
> >>
> >>         int spins = 100;
> >>         while (spins-- && sem->__val[0] <= 0 && !sem->__val[1]) a_spin();
> >>
> >>         while (sem_trywait(sem)) {
> >>
> >>
> >>The first sem_trywait fails with -1 and sets errno to EAGAIN, but the
> >>second sem_trywait does not fail and returns 0. The problem now
> >>is that errno is still set and has not been reset.
> >>This can happen if sem_post is called from a second thread on the
> >>same semaphore.
> >>Of course the same bug affects sem_timedwait itself.
> >>So I assume sem_wait is not thread-safe, which is bad and does not
> >>follow the POSIX specification.
> >>
> >>Or am I wrong here?
> >errno is only meaningful on failure; unless specified otherwise (a few
> >functions are special because you can't [easily] distinguish success
> >from failure for them without examining errno), any standard function
> >may have changed the value of errno when it returns with success. The
> >only thing it's not allowed to do is clear it (set it to 0).
> The problem is that the POSIX manual explicitly specifies that EAGAIN
> cannot be returned by sem_wait, and in my code sample
> 
> the following happens:
> 
> sem_wait(semaphore)
> select(....)
> socket = accept(....)  -> fails
> 
> accept fails because sem_wait set errno to EAGAIN, and accept
> will fail if errno is set to EAGAIN.
> I use sem_wait to limit the number of threads in my web server. In
> the thread itself I call sem_post.
> But to make it work correctly I have to set errno=0 before calling
> accept, since accept will not work if errno is set to EAGAIN.
> If you read the POSIX man page for accept, you will find that accept
> reads errno unconditionally, and this is also the case for the musl
> implementation.

accept does not use errno as input. Unless I'm forgetting something,
no interfaces in libc except perror, syslog (%m), and *printf (%m
extension) use errno as input. If accept is failing (returning -1)
with errno==EAGAIN it's not because errno was EAGAIN before you called
it but because your listening socket is in non-blocking mode and there
is no pending connection to accept.
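
A rough way to check this is sketched below. It is not code from musl
or from the original report, and the helper names (make_listener,
accept_on_idle_listener, accept_with_stale_eagain) are made up for the
illustration. It assumes a loopback TCP listener is available. One
function calls accept on an idle non-blocking listener with errno
cleared first; the other calls accept with errno deliberately set to
EAGAIN while a connection is pending.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <poll.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: non-blocking TCP listener on 127.0.0.1 with a
 * kernel-chosen port. Stores the bound address in *addr.
 * Returns the listening fd, or -1 on any setup failure. */
static int make_listener(struct sockaddr_in *addr)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) return -1;

    memset(addr, 0, sizeof *addr);
    addr->sin_family = AF_INET;
    addr->sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr->sin_port = 0;
    socklen_t len = sizeof *addr;

    int flags = fcntl(fd, F_GETFL);
    if (flags < 0
        || bind(fd, (struct sockaddr *)addr, len) < 0
        || getsockname(fd, (struct sockaddr *)addr, &len) < 0
        || listen(fd, 1) < 0
        || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Case 1: errno is cleared beforehand, but no connection is pending.
 * Returns the errno left by the failed accept(), or -1 if setup failed
 * or accept() unexpectedly succeeded. */
int accept_on_idle_listener(void)
{
    struct sockaddr_in addr;
    int lfd = make_listener(&addr);
    if (lfd < 0) return -1;

    errno = 0;
    int cfd = accept(lfd, NULL, NULL);
    int err = errno;            /* save before close() can touch errno */
    if (cfd >= 0) { close(cfd); err = -1; }
    close(lfd);
    return err;
}

/* Case 2: a connection is pending and errno holds a stale EAGAIN, such
 * as a successful sem_wait() might leave behind.
 * Returns 0 if accept() succeeded, -1 otherwise. */
int accept_with_stale_eagain(void)
{
    struct sockaddr_in addr;
    int lfd = make_listener(&addr);
    if (lfd < 0) return -1;

    int ok = -1;
    int client = socket(AF_INET, SOCK_STREAM, 0);
    if (client >= 0
        && connect(client, (struct sockaddr *)&addr, sizeof addr) == 0) {
        /* Wait until the listener is readable, as select() would. */
        struct pollfd p = { .fd = lfd, .events = POLLIN };
        if (poll(&p, 1, 1000) == 1) {
            errno = EAGAIN;     /* stale value; accept() does not read it */
            int cfd = accept(lfd, NULL, NULL);
            if (cfd >= 0) { ok = 0; close(cfd); }
        }
    }
    if (client >= 0) close(client);
    close(lfd);
    return ok;
}
```

Called from a small main(), accept_on_idle_listener() should report
EAGAIN (or EWOULDBLOCK) even though errno was 0 going in, and
accept_with_stale_eagain() should return 0 even though errno was EAGAIN
going in. In both cases the only reliable signal is the return value of
accept; errno is worth reading only after it has returned -1.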

Rich
