|
Message-ID: <1409087756.4476.2.camel@eris.loria.fr>
Date: Tue, 26 Aug 2014 23:15:56 +0200
From: Jens Gustedt <jens.gustedt@...ia.fr>
To: Rich Felker <dalias@...c.org>
Cc: musl@...ts.openwall.com
Subject: Re: Multi-threaded performance progress
Am Dienstag, den 26.08.2014, 16:26 -0400 schrieb Rich Felker:
> On Tue, Aug 26, 2014 at 09:34:13PM +0200, Jens Gustedt wrote:
> > Am Dienstag, den 26.08.2014, 15:05 -0400 schrieb Rich Felker:
> > > On Tue, Aug 26, 2014 at 08:30:39PM +0200, Jens Gustedt wrote:
> > > > Or do you mean that I should use an atomic store at the other end?
> > >
> > > Yes. With an atomic store at the other end, I think it could be
> > > correct, but I'd need to review it further to be sure.
> >
> > ok, it shouldn't be difficult to use atomic ops, then.
>
> Based on what you've said below, though, I think there's still a big
> problem..
I am not sure that I follow.
> > > > > Note that the performance of the code in which you're trying to avoid
> > > > > the lock does not matter in the slightest except when a race happens
> > > > > between a thread acting on cancellation or timeout and a signaler
> > > > > (since that's the only time it runs). I expect this is extremely rare,
> > > > > so unless we determine otherwise, I'd rather not add complexity here.
> > > >
> > > > If we have a broadcaster working a long list of waiters, this might
> > > > still happen sufficiently often. And the "complexity" is hidden in the
> > > > execution pattern of the current version, where control and handling
> > > > of the list alternates between different threads, potentially as many
> > > > times as there are waiters in the list.
> > >
> > > Does your code eliminate that all the time? If so it's more
> > > interesting. I'll look at it more.
> >
> > Yes, it will be the signaling or broadcasting thread that will be
> > working on the integrity of the list while it is holding the lock. At
> > the end those that it detected to be leaving will be isolated list
> > items, those that are to be signaled form one consecutive list that is
> > detached from the cv, and the ones that remain (in the signaling case)
> > form a valid cv-with-list.
> >
> > The only small gap that remains (and that annoys me) is the leaving
> > thread that sneaks in
> >
> > - marks itself as leaving before the end of the the CS
> > - only asks for _c_lock *after* the signaling thread has left its CS
> >
> > This is all our problem of late access to the cv variable revisited,
> > but here it is condensed in a very narrow time frame. Both threads
> > must be active for this to happen, so my hope is that when both are
> > spinning for some time on the c_lock for the waiter and on ref for
> > the signaler, none of them will "ever" be forced into a futex wait.
>
> That's a bug thast needs to be fixed to go forward with this, since
> it's essentially a use-after-free. Now that you mention it, avoiding
> use-after-free was one of my main motivations for having such waiters
> synchronize with the signaling thread. That should have been
> documented in a comment somewhere, but the point seems to have slipped
> my mind sometime between the design phase and writing the code and
> comments.
here I don't follow either.
What I describe is an inconvenience that forces all these mechanics
with the signaller still waiting for the ref count to fall to 0 before
he can leave and abandon the cv. As far as I can see that should still
be working with both versions.
> Do you see any solution whereby a waiter that's waking up can know
> reliably, without accessing the cv, whether a signaling thread is
> there to take responsibility for removing it from the list?
No. My intent with the description above was to illustrate where that
problem resides in the version that I proposed.
> I'm not seeing any solution to that problem.
You mean in the code that I proposed? No there is no other solution,
than what you have in your original version: have the signalling
thread wait for all the leaving waiters to come in. In your version
that write to node->notify is protected by _c_lock, so this is not
subject to races.
In my version this is the same, unless the lockRace variant gives the
leaving thread some additional information. If this information about
ref arrives in time before the _c_lock is taken it is used, and the
leaving thread never touches the cv, again. If it doesn't arrive in
time the _c_lock is waited for, then taken, the notifier is
re-evaluated, and all is fine. (__wait should be a memory barrier,
shouldn't it?)
But you are right I should add atomicity, there.
> I'm also still skeptical that there's a problem to be solved here; for
> it to matter, the incidence of such races needs to be pretty high, I
> think.
"matter" can mean different things. I don't think that it my version
penalizes performance (but sure a proof would be better), it could
also give some small benefits concerning execution time, it could be
mildly better by avoiding cache-ping-pong and by avoiding some futex
calls.
For me, it matters because the handling is "simpler" (a proof would be
much, much better, here :) in that there is one single thread
manipulating the list. So I should work more on prooving that point by
keeping it much closer to your original version.
Ok, I'll try to switch to one-step-at-a-time-mode.
> Perhaps if you think it's going to matter you could work on a
> test case that shows performance problems under load with lots of
> timedwait expirations (or cancellations, but I think worry about
> cancellation performance is somewhat silly to begin with).
Yes, since we have eliminated spurious wakeups ;) expirations should
be thing to worry about.
> Or, if you don't have time to spend on side projects like test
> cases, maybe someone else could test it?
Not at the moment. I would prefer to have a C thread implementation by
the end of the week that we could agree upon.
Jens
--
:: INRIA Nancy Grand Est ::: AlGorille ::: ICube/ICPS :::
:: ::::::::::::::: office Strasbourg : +33 368854536 ::
:: :::::::::::::::::::::: gsm France : +33 651400183 ::
:: ::::::::::::::: gsm international : +49 15737185122 ::
:: http://icube-icps.unistra.fr/index.php/Jens_Gustedt ::
Download attachment "signature.asc" of type "application/pgp-signature" (199 bytes)
Powered by blists - more mailing lists
Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.