musl - Re: [PATCH] fix potential ref count overflow in sem

Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20220903170846.GA9709@brightrain.aerifal.cx>
Date: Sat, 3 Sep 2022 13:08:46 -0400
From: Rich Felker <dalias@...c.org>
To: Alexey Izbyshev <izbyshev@...ras.ru>
Cc: musl@...ts.openwall.com
Subject: Re: [PATCH] fix potential ref count overflow in sem_open()

On Sat, Sep 03, 2022 at 05:28:56PM +0300, Alexey Izbyshev wrote:
> On 2022-09-03 07:48, Rich Felker wrote:
> >On Sat, Sep 03, 2022 at 03:19:40AM +0300, Alexey Izbyshev wrote:
> >>sem_open() attempts to avoid overflow on the future ref count
> >>increment
> >>by computing the sum of all ref counts in semtab and checking
> >>that it's
> >>not INT_MAX when semtab is locked for slot reservation.  This approach
> >>suffers from a TOCTTOU: by the time semtab is re-locked after
> >>opening a
> >>semaphore, the sum could have already been increased by a concurrent
> >>sem_open(), so it will overflow on the increment. An individual ref
> >>count can be overflowed as well if the call happened to open a
> >>duplicate
> >>of the only other semaphore.
> >>
> >>Moreover, since the overflow avoidance check admits a negative (i.e.
> >>overflowed) sum, after this state is reached, an individual ref count
> >>can be incremented until it reaches 1 again, and then sem_close() will
> >>incorrectly free the semaphore, promoting what could be just a
> >>signed-overflow UB to a use-after-free.
> >>
> >>Fix the overflow check by accounting for the maximum possible
> >>number of
> >>concurrent sem_open() calls that have already reserved a slot, but
> >>haven't incremented the ref count yet.
> >>---
> >>Alternatively, we could use the number of currently reserved slots
> >>instead of SEM_NSEMS_MAX, preserving the ability of individual ref
> >>counts to reach INT_MAX in non-concurrent scenarios. I'm not sure
> >>whether it matters, so I'm sending a smaller of the two fixes.
> >>
> >>Alexey
> >>---
> >> src/thread/sem_open.c | 2 +-
> >> 1 file changed, 1 insertion(+), 1 deletion(-)
> >>
> >>diff --git a/src/thread/sem_open.c b/src/thread/sem_open.c
> >>index 0ad29de9..611a3f64 100644
> >>--- a/src/thread/sem_open.c
> >>+++ b/src/thread/sem_open.c
> >>@@ -61,7 +61,7 @@ sem_t *sem_open(const char *name, int flags, ...)
> >> 		if (!semtab[i].sem && slot < 0) slot = i;
> >> 	}
> >> 	/* Avoid possibility of overflow later */
> >>-	if (cnt == INT_MAX || slot < 0) {
> >>+	if (cnt > INT_MAX - SEM_NSEMS_MAX || slot < 0) {
> >> 		errno = EMFILE;
> >> 		UNLOCK(lock);
> >> 		return SEM_FAILED;
> >>--
> >>2.37.2
> >
> >Thanks! This is probably acceptable at least relative to the current
> >behavior, but thinking about it, the current behavior (this whole
> >logic) doesn't really make sense. If flags are such that a new
> >semaphore can't be created, the claim that there's no way to handle
> >failure after opening is false; the operation is side-effect free and
> >we can just back out. The only case where we can't back out is when
> >we're creating a new semaphore, and in that case, it will never be
> >incrementing the refcnt on an existing slot; it will always be a new
> >slot. And even in this case, I think we could back out if we needed
> >to, since the file is created initially with a temp name.
> 
> Indeed, my patch tried to fix the issue while staying within the
> wider logic of the current implementation. Since it makes the newly
> created semaphore visible via link() *before* re-locking semtab, the
> following scenario could happen:
> 
> * thread 1 calls sem_open("/sem", O_CREAT) and proceeds until after
> link() succeeds
> * other threads successfully call sem_open("/sem", 0) INT_MAX times
> * thread 1 re-locks semtab, discovers another slot with the i_no it
> created, but can't increment the refcnt
> 
> So in this case even though thread 1 did create a new semaphore, it
> still has to use an existing slot and hence can't back out (if it
> does, it'd look like semaphore creation didn't succeed, but opening
> the semaphore did).
> 
> If link() and the refcnt increment occur under the same lock, this
> issue can be avoided, as in the attached draft patch.
> 
> There is still a separate spurious failure case when after opening
> SEM_NSEMS_MAX semaphores an application can't open even an already
> opened one, but I don't think it's possible to completely fix in
> case O_CREAT is specified because we can't know whether we need to
> allocate a new slot or not before link() returns.
> 
> Thanks,
> Alexey

> diff --git a/src/thread/sem_open.c b/src/thread/sem_open.c
> index 0ad29de9..7f648c19 100644
> --- a/src/thread/sem_open.c
> +++ b/src/thread/sem_open.c
> @@ -34,7 +34,7 @@ sem_t *sem_open(const char *name, int flags, ...)
>  	va_list ap;
>  	mode_t mode;
>  	unsigned value;
> -	int fd, i, e, slot, first=1, cnt, cs;
> +	int fd, i, e, slot, first=1, cs, created=0;
>  	sem_t newsem;
>  	void *map;
>  	char tmp[64];
> @@ -55,13 +55,8 @@ sem_t *sem_open(const char *name, int flags, ...)
>  	/* Reserve a slot in case this semaphore is not mapped yet;
>  	 * this is necessary because there is no way to handle
>  	 * failures after creation of the file. */
> -	slot = -1;
> -	for (cnt=i=0; i<SEM_NSEMS_MAX; i++) {
> -		cnt += semtab[i].refcnt;
> -		if (!semtab[i].sem && slot < 0) slot = i;
> -	}
> -	/* Avoid possibility of overflow later */
> -	if (cnt == INT_MAX || slot < 0) {
> +	for (slot=0; slot<SEM_NSEMS_MAX && semtab[slot].sem; slot++);
> +	if (slot == SEM_NSEMS_MAX) {
>  		errno = EMFILE;
>  		UNLOCK(lock);
>  		return SEM_FAILED;
> @@ -93,6 +88,7 @@ sem_t *sem_open(const char *name, int flags, ...)
>  					goto fail;
>  				}
>  				close(fd);
> +				LOCK(lock);
>  				break;
>  			}
>  			if (errno != ENOENT)
> @@ -128,9 +124,16 @@ sem_t *sem_open(const char *name, int flags, ...)
>  			goto fail;
>  		}
>  		close(fd);
> +
> +		LOCK(lock);
>  		e = link(tmp, name) ? errno : 0;
>  		unlink(tmp);
> -		if (!e) break;
> +		if (!e) {
> +			created = 1;
> +			break;
> +		}
> +		UNLOCK(lock);
> +
>  		munmap(map, sizeof(sem_t));
>  		/* Failure is only fatal when doing an exclusive open;
>  		 * otherwise, next iteration will try to open the
> @@ -142,13 +145,19 @@ sem_t *sem_open(const char *name, int flags, ...)
>  	/* See if the newly mapped semaphore is already mapped. If
>  	 * so, unmap the new mapping and use the existing one. Otherwise,
>  	 * add it to the table of mapped semaphores. */
> -	LOCK(lock);
> -	for (i=0; i<SEM_NSEMS_MAX && semtab[i].ino != st.st_ino; i++);
> -	if (i<SEM_NSEMS_MAX) {
> -		munmap(map, sizeof(sem_t));
> -		semtab[slot].sem = 0;
> -		slot = i;
> -		map = semtab[i].sem;
> +	if (!created) {
> +		for (i=0; i<SEM_NSEMS_MAX && semtab[i].ino != st.st_ino; i++);
> +		if (i<SEM_NSEMS_MAX) {
> +			munmap(map, sizeof(sem_t));
> +			if (semtab[i].refcnt == INT_MAX) {
> +				UNLOCK(lock);
> +				errno = EMFILE;
> +				goto fail;
> +			}
> +			semtab[slot].sem = 0;
> +			slot = i;
> +			map = semtab[i].sem;
> +		}
>  	}
>  	semtab[slot].refcnt++;
>  	semtab[slot].sem = map;

One thought: in light of the complexity we're fighting with here, does
unlocking and relocking just to save some contention from a long
critical section containing syscalls like open, etc. really make
sense? If we just held the lock the whole time, none of this would
matter. It is quite a big critical section, but it's not *blocking*,
and aside from silly intentional create-and-unlink dances to force
repeated retries, it's one where everything makes forward progress.

Not sure if this is the right thing to do but I think it's worth
considering.

Rich
Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.