|
Message-ID: <20220903170846.GA9709@brightrain.aerifal.cx> Date: Sat, 3 Sep 2022 13:08:46 -0400 From: Rich Felker <dalias@...c.org> To: Alexey Izbyshev <izbyshev@...ras.ru> Cc: musl@...ts.openwall.com Subject: Re: [PATCH] fix potential ref count overflow in sem_open() On Sat, Sep 03, 2022 at 05:28:56PM +0300, Alexey Izbyshev wrote: > On 2022-09-03 07:48, Rich Felker wrote: > >On Sat, Sep 03, 2022 at 03:19:40AM +0300, Alexey Izbyshev wrote: > >>sem_open() attempts to avoid overflow on the future ref count > >>increment > >>by computing the sum of all ref counts in semtab and checking > >>that it's > >>not INT_MAX when semtab is locked for slot reservation. This approach > >>suffers from a TOCTTOU: by the time semtab is re-locked after > >>opening a > >>semaphore, the sum could have already been increased by a concurrent > >>sem_open(), so it will overflow on the increment. An individual ref > >>count can be overflowed as well if the call happened to open a > >>duplicate > >>of the only other semaphore. > >> > >>Moreover, since the overflow avoidance check admits a negative (i.e. > >>overflowed) sum, after this state is reached, an individual ref count > >>can be incremented until it reaches 1 again, and then sem_close() will > >>incorrectly free the semaphore, promoting what could be just a > >>signed-overflow UB to a use-after-free. > >> > >>Fix the overflow check by accounting for the maximum possible > >>number of > >>concurrent sem_open() calls that have already reserved a slot, but > >>haven't incremented the ref count yet. > >>--- > >>Alternatively, we could use the number of currently reserved slots > >>instead of SEM_NSEMS_MAX, preserving the ability of individual ref > >>counts to reach INT_MAX in non-concurrent scenarios. I'm not sure > >>whether it matters, so I'm sending a smaller of the two fixes. > >> > >>Alexey > >>--- > >> src/thread/sem_open.c | 2 +- > >> 1 file changed, 1 insertion(+), 1 deletion(-) > >> > >>diff --git a/src/thread/sem_open.c b/src/thread/sem_open.c > >>index 0ad29de9..611a3f64 100644 > >>--- a/src/thread/sem_open.c > >>+++ b/src/thread/sem_open.c > >>@@ -61,7 +61,7 @@ sem_t *sem_open(const char *name, int flags, ...) > >> if (!semtab[i].sem && slot < 0) slot = i; > >> } > >> /* Avoid possibility of overflow later */ > >>- if (cnt == INT_MAX || slot < 0) { > >>+ if (cnt > INT_MAX - SEM_NSEMS_MAX || slot < 0) { > >> errno = EMFILE; > >> UNLOCK(lock); > >> return SEM_FAILED; > >>-- > >>2.37.2 > > > >Thanks! This is probably acceptable at least relative to the current > >behavior, but thinking about it, the current behavior (this whole > >logic) doesn't really make sense. If flags are such that a new > >semaphore can't be created, the claim that there's no way to handle > >failure after opening is false; the operation is side-effect free and > >we can just back out. The only case where we can't back out is when > >we're creating a new semaphore, and in that case, it will never be > >incrementing the refcnt on an existing slot; it will always be a new > >slot. And even in this case, I think we could back out if we needed > >to, since the file is created initially with a temp name. > > Indeed, my patch tried to fix the issue while staying within the > wider logic of the current implementation. Since it makes the newly > created semaphore visible via link() *before* re-locking semtab, the > following scenario could happen: > > * thread 1 calls sem_open("/sem", O_CREAT) and proceeds until after > link() succeeds > * other threads successfully call sem_open("/sem", 0) INT_MAX times > * thread 1 re-locks semtab, discovers another slot with the i_no it > created, but can't increment the refcnt > > So in this case even though thread 1 did create a new semaphore, it > still has to use an existing slot and hence can't back out (if it > does, it'd look like semaphore creation didn't succeed, but opening > the semaphore did). > > If link() and the refcnt increment occur under the same lock, this > issue can be avoided, as in the attached draft patch. > > There is still a separate spurious failure case when after opening > SEM_NSEMS_MAX semaphores an application can't open even an already > opened one, but I don't think it's possible to completely fix in > case O_CREAT is specified because we can't know whether we need to > allocate a new slot or not before link() returns. > > Thanks, > Alexey > diff --git a/src/thread/sem_open.c b/src/thread/sem_open.c > index 0ad29de9..7f648c19 100644 > --- a/src/thread/sem_open.c > +++ b/src/thread/sem_open.c > @@ -34,7 +34,7 @@ sem_t *sem_open(const char *name, int flags, ...) > va_list ap; > mode_t mode; > unsigned value; > - int fd, i, e, slot, first=1, cnt, cs; > + int fd, i, e, slot, first=1, cs, created=0; > sem_t newsem; > void *map; > char tmp[64]; > @@ -55,13 +55,8 @@ sem_t *sem_open(const char *name, int flags, ...) > /* Reserve a slot in case this semaphore is not mapped yet; > * this is necessary because there is no way to handle > * failures after creation of the file. */ > - slot = -1; > - for (cnt=i=0; i<SEM_NSEMS_MAX; i++) { > - cnt += semtab[i].refcnt; > - if (!semtab[i].sem && slot < 0) slot = i; > - } > - /* Avoid possibility of overflow later */ > - if (cnt == INT_MAX || slot < 0) { > + for (slot=0; slot<SEM_NSEMS_MAX && semtab[slot].sem; slot++); > + if (slot == SEM_NSEMS_MAX) { > errno = EMFILE; > UNLOCK(lock); > return SEM_FAILED; > @@ -93,6 +88,7 @@ sem_t *sem_open(const char *name, int flags, ...) > goto fail; > } > close(fd); > + LOCK(lock); > break; > } > if (errno != ENOENT) > @@ -128,9 +124,16 @@ sem_t *sem_open(const char *name, int flags, ...) > goto fail; > } > close(fd); > + > + LOCK(lock); > e = link(tmp, name) ? errno : 0; > unlink(tmp); > - if (!e) break; > + if (!e) { > + created = 1; > + break; > + } > + UNLOCK(lock); > + > munmap(map, sizeof(sem_t)); > /* Failure is only fatal when doing an exclusive open; > * otherwise, next iteration will try to open the > @@ -142,13 +145,19 @@ sem_t *sem_open(const char *name, int flags, ...) > /* See if the newly mapped semaphore is already mapped. If > * so, unmap the new mapping and use the existing one. Otherwise, > * add it to the table of mapped semaphores. */ > - LOCK(lock); > - for (i=0; i<SEM_NSEMS_MAX && semtab[i].ino != st.st_ino; i++); > - if (i<SEM_NSEMS_MAX) { > - munmap(map, sizeof(sem_t)); > - semtab[slot].sem = 0; > - slot = i; > - map = semtab[i].sem; > + if (!created) { > + for (i=0; i<SEM_NSEMS_MAX && semtab[i].ino != st.st_ino; i++); > + if (i<SEM_NSEMS_MAX) { > + munmap(map, sizeof(sem_t)); > + if (semtab[i].refcnt == INT_MAX) { > + UNLOCK(lock); > + errno = EMFILE; > + goto fail; > + } > + semtab[slot].sem = 0; > + slot = i; > + map = semtab[i].sem; > + } > } > semtab[slot].refcnt++; > semtab[slot].sem = map; One thought: in light of the complexity we're fighting with here, does unlocking and relocking just to save some contention from a long critical section containing syscalls like open, etc. really make sense? If we just held the lock the whole time, none of this would matter. It is quite a big critical section, but it's not *blocking*, and aside from silly intentional create-and-unlink dances to force repeated retries, it's one where everything makes forward progress. Not sure if this is the right thing to do but I think it's worth considering. Rich
Powered by blists - more mailing lists
Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.