Message-ID: <alpine.LNX.2.20.1507301228230.11825@monopod.intra.ispras.ru>
Date: Thu, 30 Jul 2015 12:36:19 +0300 (MSK)
From: Alexander Monakov <amonakov@...ras.ru>
To: musl@...ts.openwall.com
Subject: Re: New optimized normal-type mutex?

On Thu, 30 Jul 2015, Jens Gustedt wrote:
> Am Donnerstag, den 30.07.2015, 10:07 +0200 schrieb Jens Gustedt:
> > Am Mittwoch, den 29.07.2015, 20:10 -0400 schrieb Rich Felker:
> > > On Thu, Jul 30, 2015 at 01:49:20AM +0200, Jens Gustedt wrote:
> > > > Hm, could you be more specific about where this hurts?
> > > >
> > > > In the code I have there is
> > > >
> > > >   for (;val & lockbit;) {
> > > >     __syscall(SYS_futex, loc, FUTEX_WAIT, val, 0);
> > > >     val = atomic_load_explicit(loc, memory_order_consume);
> > > >   }
> > > >
> > > > so this should be robust against spurious wakeups, no?
> > >
> > > The problem is that futex_wait returns immediately with EAGAIN if
> > > *loc!=val, which happens very often if *loc is incremented or
> > > otherwise changed on each arriving waiter.
> >
> > Yes, sure, it may change. Whether or not this happens often may
> > depend; I don't think we can easily make a quantitative statement
> > here.
> >
> > In the case of atomics the critical section is extremely short, and
> > the count, if it varies so much, should have stabilized a bit during
> > the spinlock phase before coming to the futex part. That futex part
> > is really only a last resort for the rare case that the thread that
> > is holding the lock has been descheduled in the middle.
> >
> > My current test case is having X threads hammer on one single
> > location, X being up to some hundred. On my 2x2 hyperthreaded CPU,
> > for a reasonable number of threads (X = 16 or 32) I have an overall
> > performance improvement of about 30% when using my version of the
> > lock instead of the original musl one. The point of inversion where
> > the original musl lock is better is at about 200 threads.
> >
> > I'll see how I can get hold of occurrence statistics for the
> > different phases without being too intrusive (which would change the
> > scheduling).
>
> So I tested briefly, varying the number of threads from 2 up to 2048.
>
> Out of the loop iterations on the slow path, less than 0.1% try to go
> into futex wait, and out of these about 20% come back with EAGAIN.

That sounds like your testcase simulates a load where you'd be better
off with a spinlock in the first place, no?  Have you tried simulating
a load that does some non-trivial work between lock/unlock, making a
spinlock a poor fit?

Alexander
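For readers following the EAGAIN discussion above, here is a minimal
standalone sketch of a wait loop in the style of the quoted fragment.
It is not musl code: it uses the raw Linux futex syscall via syscall(2)
rather than musl's internal __syscall, and the names wait_on_lock,
LOCKBIT and the eagain_count instrumentation are illustrative
assumptions, not the layout actually used in the patch under
discussion.

  #include <errno.h>
  #include <linux/futex.h>
  #include <stdatomic.h>
  #include <sys/syscall.h>
  #include <unistd.h>

  /* Illustrative lock-bit position; the real word layout is up to the
   * lock implementation. */
  #define LOCKBIT 0x80000000u

  /* Rough counter of futex calls that came back early because *loc had
   * already changed (the ~20% figure measured in the thread). */
  static _Atomic unsigned long eagain_count;

  static void wait_on_lock(_Atomic unsigned *loc)
  {
      unsigned val = atomic_load_explicit(loc, memory_order_consume);
      while (val & LOCKBIT) {
          /* FUTEX_WAIT sleeps only while *loc still equals val; if
           * another waiter has already changed the word, the kernel
           * returns immediately with EAGAIN and we simply reload and
           * re-check, so spurious wakeups are harmless. */
          if (syscall(SYS_futex, loc, FUTEX_WAIT, val, NULL, NULL, 0) == -1
              && errno == EAGAIN)
              eagain_count++;
          val = atomic_load_explicit(loc, memory_order_consume);
      }
  }

The correctness argument is unaffected by the early EAGAIN returns; the
cost Rich points out is that when the non-lock bits of *loc carry a
count that changes on every arriving waiter, the *loc == val check
fails often and the loop degenerates into repeated syscalls instead of
sleeping.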