Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Date: Fri, 27 Sep 2019 10:52:25 -0400
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: Re: Bug report: Memory corrupion due to stale
 robust_list.head pointer

On Wed, Sep 25, 2019 at 08:21:10AM -0400, Rich Felker wrote:
> On Wed, Sep 25, 2019 at 12:05:18PM +0200, d.dorau@....de wrote:
> > Hello,
> > 
> > I recently came across a memory corruption in the member "tsd" of struct 
> > pthread
> > in a scenario where a pthread mutex is intentionally held during fork().
> > I experienced this using the lastest release 1.1.23.
> > 
> > I found that during fork musl resets self->robust_list.pending and 
> > self->robust_list.off,
> > but not the robust_list.head. The stale pointer to the previously held and 
> > reset
> > mutex turned out to be the cause for the following corruption.
> > 
> > I therefore suggest to also reset the list head on fork as such:
> > 
> > --- a/src/process/fork.c.orig   2019-09-23 11:41:01.381626360 +0200
> > +++ b/src/process/fork.c        2019-09-23 11:41:26.657819473 +0200
> > @@ -27,6 +27,7 @@
> >                 self->tid = __syscall(SYS_gettid);
> >                 self->robust_list.off = 0;
> >                 self->robust_list.pending = 0;
> > +               self->robust_list.head = &self->robust_list.head;
> >                 self->next = self->prev = self;
> >                 __thread_list_lock = 0;
> >                 libc.threads_minus_1 = 0;
> > 
> > 
> > This resolves the issue.
> > 
> > I am very well aware of the fact that aquiring a mutex during fork and 
> > re-initializing 
> > in the child appears to result in undefined behaviour (as of 
> > pthread_mutex_init(3posix))
> > or to be controversial at least.
> > 
> > However I believe that it should't result in a memory corruption as a 
> > result.
> 
> This is definitely undefined behavior, for exactly the types of reason
> you encountered. I agree it's useful to limit the scope of memory
> corruption that might occur on such invalid usage though.
> 
> Your patch is probably ok but before merging I'd like to take some
> time to consider whether there are any valid usage cases it breaks (so
> far, I don't think so) and whether any other easily-caught cases are
> missed. For example, it is possible that the child attempts to use the
> mutex in non-UB ways like locking or unlocking it, but those should
> produce errors before even touching the robust list. In theory there's
> a tid-reuse problem that could allow child to unlock a mutex that used
> to be held by another thread in the parent, but (1) this can only be
> hit if the child is doing non-AS-safe things after a MT fork, and (2)
> musl already avoids this problem by using the robust list even for
> non-robust mutexes. I don't think (2) is 100% complete in the case
> where a MT program forks with non-pshared mutexes and the child makes
> new threads, but this is very very far into UB territory.

I was thinking it might be valid for fork to walk the robust list and
put all locked non-pshared mutexes from the parent in a
permanently-unusable state in the child, before clearing the robust
list pointer. However, on further consideration it doesn't seem valid
to even access the robust list at all in the child, since it might
contain entries which exist in shared memory and which could be
unlocked by the parent (thereby invalidating the prev/next pointers)
before the child can access them. So I think we should just be content
that the cases tid-reuse can affect are UB anyway.

Note that all of this is an aside, and does not seem like a
counterindication for the safety patch proposed above.

Rich

Powered by blists - more mailing lists

Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.