musl - Re: What's left for 1.1.11 release?


Message-ID: <20150728173141.GV16376@brightrain.aerifal.cx>
Date: Tue, 28 Jul 2015 13:31:41 -0400
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: Re: What's left for 1.1.11 release?

On Tue, Jul 28, 2015 at 05:33:18PM +0300, Alexander Monakov wrote:
> > > and stdio locks too, but it's only been observed in malloc.
> > > Since there don't seem to be any performance-relevant uses of a_store
> > > that don't actually need the proper barrier, I think we have to just
> > > put an explicit barrier (lock orl $0,(%esp) or mfence) after the store
> > > and live with the loss of performance.
> > 
> > How about using an xchg instruction? This would perhaps "waste" a
> > register, but that sort of optimization should not be critical in the
> > vicinity of code that needs memory synchronization, anyhow.
> 
> xchg is what compilers use in lieu of mfence, but Rich's preference for 'lock
> orl' on the top of the stack stems from the idea that locking on the store
> destination is not desired here (you might not even have the corresponding
> line in the cache), so it might be better to have the store land in the store
> buffers, and do a serializing 'lock orl' on the cache line you have anyhow.

I did a quick run of my old malloc stress test with both approaches.
The results are too noisy to draw firm conclusions from, but on my
machine there seems to be no loss in performance with the stack
approach and a 1-5% loss from using xchg to do the store. I'd like to
have a better measurement to confirm this, but since my measurements
so far agree with the theoretical prediction, I think I'll just go
with the stack approach for now.
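
For concreteness, here is a minimal sketch of the two variants being
compared, assuming GCC-style inline asm. The function names are made up
for illustration and this is not necessarily the code that ends up in
musl; on x86_64 the stack pointer is %rsp rather than %esp, and the
non-x86 branches are only a portable fallback so the sketch compiles
elsewhere.

```c
/* Sketch only: hypothetical names, not musl's actual a_store. */

/* Plain store, then a locked no-op RMW on the top of the stack.
 * The store can sit in the store buffer; the locked orl acts as a
 * full barrier using a cache line we almost certainly already own. */
static inline void store_then_stack_barrier(volatile int *p, int x)
{
#if defined(__x86_64__)
	__asm__ __volatile__("movl %1, %0 ; lock ; orl $0,(%%rsp)"
		: "=m"(*p) : "r"(x) : "memory");
#elif defined(__i386__)
	__asm__ __volatile__("movl %1, %0 ; lock ; orl $0,(%%esp)"
		: "=m"(*p) : "r"(x) : "memory");
#else
	/* Assumption: fall back to a compiler builtin full barrier. */
	*p = x;
	__sync_synchronize();
#endif
}

/* Store via xchg. xchg with a memory operand is implicitly locked,
 * so the lock is taken on the destination's cache line itself. */
static inline void store_with_xchg(volatile int *p, int x)
{
#if defined(__x86_64__) || defined(__i386__)
	__asm__ __volatile__("xchgl %0, %1"
		: "+r"(x), "+m"(*p) : : "memory");
#else
	/* Assumption: builtin atomic exchange plus a full barrier. */
	(void)__sync_lock_test_and_set(p, x);
	__sync_synchronize();
#endif
}
```

Both functions leave the new value in *p and act as a full barrier;
the difference is only which cache line the locked operation touches.
Since the orl ORs in zero, it does not change whatever is at the top
of the stack.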

Rich
