Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20180528221521.GI1392@brightrain.aerifal.cx>
Date: Mon, 28 May 2018 18:15:21 -0400
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: Re: TLS issue on aarch64

On Mon, May 28, 2018 at 10:47:31PM +0200, Szabolcs Nagy wrote:
> * Rich Felker <dalias@...c.org> [2018-05-26 20:34:30 -0400]:
> > On Sat, May 26, 2018 at 02:54:16AM +0200, Szabolcs Nagy wrote:
> > > (on mips/ppc i expect it not to change anything: tp is
> > > at a page aligned offset from the end of struct pthread,
> > > so one alignment is enough there, but on aarch64/arm/sh4
> > > this makes a difference, and seems to pass my simple tests)
> > > 
> > > diff --git a/src/env/__init_tls.c b/src/env/__init_tls.c
> > > index 1c5d98a0..8e70024d 100644
> > > --- a/src/env/__init_tls.c
> > > +++ b/src/env/__init_tls.c
> > > @@ -41,9 +41,12 @@ void *__copy_tls(unsigned char *mem)
> > >  #ifdef TLS_ABOVE_TP
> > >  	dtv = (void **)(mem + libc.tls_size) - (libc.tls_cnt + 1);
> > >  
> > > -	mem += -((uintptr_t)mem + sizeof(struct pthread)) & (libc.tls_align-1);
> > > +	/* Ensure TP is aligned.  */
> > > +	mem += -(uintptr_t)TP_ADJ(mem) & (libc.tls_align-1);
> > >  	td = (pthread_t)mem;
> > >  	mem += sizeof(struct pthread);
> > > +	/* Ensure TLS is aligned after struct pthread.  */
> > > +	mem += -(uintptr_t)mem & (libc.tls_align-1);
> > >  
> > >  	for (i=1, p=libc.tls_head; p; i++, p=p->next) {
> > >  		dtv[i] = mem + p->offset;
> > 
> > As written this (or anything using libc.tls_align to adjust offset of
> > the TLS from the TP) is not valid. The value of libc.tls_align is
> > runtime-variable and will increase upon dlopen, and even without
> > dlopen, will be non-deterministic dependent on shared libraries from
> > DT_NEEDED in dynamic-linked programs. The offset between TP and TLS is
> > a property of the linker's handling of local-exec TLS in the main
> > program only, and thus probably should be using libc.tls_head.align.
> > 
> 
> ok, makes sense.
> 
> > However, care needs to be taken that libc.tls_head may initially be
> > null if the main program has no TLS, but could later become non-null
> > due to dlopen. If the offset between TP and TLS changed due to this,
> > any initial-exec-model TLS access would be wrong. Fortunately such a
> > program cannot have initial-exec-model accesses (initial-exec is only
> > valid for TLS that existed at program start), so we can probably just
> > ignore the issue and always use libc.tls_head?libc.tls_head.align:1;
> > this will cause gratuitous padding for threads created after dlopen of
> > a library with larger alignment, but should otherwise not hurt
> > anything.
> > 
> 
> yes i think we only need to consider the tls alignment requirements
> of the main executable, if libc.tls_head can only be changed by
> loading libs with initial-exec tls that should be fine.

The only transition possible for libc.tls_head is from null to
non-null, and it's only initially null if there is initially no TLS,
in which case the initial-exec (and local-exec) models are invalid.

> another issue with the patch is that if tp is aligned then pthread_t
> may not get aligned:
> 
> tp == self + sizeof(pthread_t) - reserved

This expression does not look correct; it would have tp point
somewhere inside of the pthread structure rather than just past the
end of it.

Maybe your code is doing this to avoid wasted padding, but if so I
think that's a bad idea. It violates the invariant that the pthread
structure is at a constant offset from tp, which is needed for
efficient pthread_self and for access to fixed slots at the end of it
(like canary or dtv copy).

> so sizeof(pthread_t) - reserved must be divisible with
> gcd(alignment of tp, alignof(pthread_t)) to be able to make both
> self and tp aligned.
> 
> this is not an issue on current targets with current pthread_t,
> but we may want to decouple internal struct pthread alignment
> details and the abi reserved tls size, i.e. tp_adj could be like
> 
> tp == alignup(self + sizeof(pthread_t) - reserved, alignof(pthread_t))
> 
> or we add a static assert that reserved and alignof(pthread_t)
> are not conflicting.

Maybe I'm misunderstanding what "reserved" is, since you're talking
about a static assert...?

Rich

Powered by blists - more mailing lists

Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.