Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20190925122110.GU9017@brightrain.aerifal.cx>
Date: Wed, 25 Sep 2019 08:21:10 -0400
From: Rich Felker <dalias@...c.org>
To: d.dorau@....de
Cc: musl@...ts.openwall.com
Subject: Re: Bug report: Memory corrupion due to stale
 robust_list.head pointer

On Wed, Sep 25, 2019 at 12:05:18PM +0200, d.dorau@....de wrote:
> Hello,
> 
> I recently came across a memory corruption in the member "tsd" of struct 
> pthread
> in a scenario where a pthread mutex is intentionally held during fork().
> I experienced this using the lastest release 1.1.23.
> 
> I found that during fork musl resets self->robust_list.pending and 
> self->robust_list.off,
> but not the robust_list.head. The stale pointer to the previously held and 
> reset
> mutex turned out to be the cause for the following corruption.
> 
> I therefore suggest to also reset the list head on fork as such:
> 
> --- a/src/process/fork.c.orig   2019-09-23 11:41:01.381626360 +0200
> +++ b/src/process/fork.c        2019-09-23 11:41:26.657819473 +0200
> @@ -27,6 +27,7 @@
>                 self->tid = __syscall(SYS_gettid);
>                 self->robust_list.off = 0;
>                 self->robust_list.pending = 0;
> +               self->robust_list.head = &self->robust_list.head;
>                 self->next = self->prev = self;
>                 __thread_list_lock = 0;
>                 libc.threads_minus_1 = 0;
> 
> 
> This resolves the issue.
> 
> I am very well aware of the fact that aquiring a mutex during fork and 
> re-initializing 
> in the child appears to result in undefined behaviour (as of 
> pthread_mutex_init(3posix))
> or to be controversial at least.
> 
> However I believe that it should't result in a memory corruption as a 
> result.

This is definitely undefined behavior, for exactly the types of reason
you encountered. I agree it's useful to limit the scope of memory
corruption that might occur on such invalid usage though.

Your patch is probably ok but before merging I'd like to take some
time to consider whether there are any valid usage cases it breaks (so
far, I don't think so) and whether any other easily-caught cases are
missed. For example, it is possible that the child attempts to use the
mutex in non-UB ways like locking or unlocking it, but those should
produce errors before even touching the robust list. In theory there's
a tid-reuse problem that could allow child to unlock a mutex that used
to be held by another thread in the parent, but (1) this can only be
hit if the child is doing non-AS-safe things after a MT fork, and (2)
musl already avoids this problem by using the robust list even for
non-robust mutexes. I don't think (2) is 100% complete in the case
where a MT program forks with non-pshared mutexes and the child makes
new threads, but this is very very far into UB territory.

> To reproduce I wrote a small example which triggers and shows the 
> curruption.
> It also contains a description of the program flow and memory corruption.
> 
> Please find it attached to this mail.
> 
> Please note that the routine to print the robust_list is hacked using 
> hardcoded
> offsets which are aimed at my 32-Bit platform.

Thanks for the detailed report.

Rich

Powered by blists - more mailing lists

Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.