|
Message-ID: <20221004083120.GI2158779@port70.net> Date: Tue, 4 Oct 2022 10:31:20 +0200 From: Szabolcs Nagy <nsz@...t70.net> To: musl@...ts.openwall.com Subject: Re: Illegal killlock skipping when transitioning to single-threaded state * Alexey Izbyshev <izbyshev@...ras.ru> [2022-10-04 08:16:16 +0300]: > On 2022-10-04 00:27, Szabolcs Nagy wrote: > > * Szabolcs Nagy <nsz@...t70.net> [2022-10-03 15:26:15 +0200]: > > i think i was wrong and you are right. > > > > so with your suggested swap of UNLOCK(killlock) and need_locks=-1 and > > starting with 'something == 0' the exiting E and remaining R threads: > > > > E:something=1 // protected by killlock > > E:UNLOCK(killlock) > > E:need_locks=-1 > > > > R:LOCK(unrelated) // reads need_locks == -1 > > R:need_locks=0 > > R:UNLOCK(unrelated) > > R:LOCK(killlock) // does not lock > > R:read something // can it be 0 ? > > > > and here something can be 0 (ie. not protected by killlock) on aarch64 > > because > > > > T1 > > something=1 > > ldaxr ... killlock > > stlxr ... killlock > > need_locks=-1 > > > > T2 > > x=need_locks > > ldaxr ... unrelated > > stlxr ... unrelated > > y=something > > > > can end with x==-1 and y==0. > > > Yes, overall this matches my understanding. However, your UNLOCK expansion > (in T1/T2) omits the branch instruction between stlxr and the following > store, and, as I mentioned, I'm not sufficiently knowledgeable to understand > the effects of this branch on the order of visibility of "stlxr killlock" > (and preceding stores) and "need_locks=-1". i don't know the answer, but i think in musl we don't want to rely on control dependcies in the architectural memory model anyway (in some cases the compiler can break it and it's hard to reason about). > > > and to fix it, both a_fetch_add and a_cas need an a_barrier. > > > > i need to think how to support such lock usage on aarch64 > > without adding too many dmb. > > > > I'd also like to point out that our discussion so far has been focused on > reordering of "something" protected by the critical section and > "need_locks=-1", but my original bug report also mentioned the possibility > of lock corruption, and for the latter we're also interested in reordering > of "stlxr" and "need_locks=-1". It's possible that some UNLOCK > implementations can provide the first guarantee, but not the second. i skipped that because i did not fully understand it and thought it's fixed once we make the atomics primitives full barriers. thanks. > > > > > > > > > > > memory operations in the critical section cannot visibly happen > > > after the > > > final stlxr. > > > > > > memory operations after the critical section cannot visibly happen > > > before > > > the ldaxr of UNLOCK. > > > > > > the only issue can be inside the ll/sc loop in UNLOCK if there are > > > independent > > > memory operations there, but there arent. > > > > You've already said this is wrong, but I want to spell out why it's so for > any interested parties to hopefully clear some confusion. Both premises > (about stlxr and ldaxr) are correct, but the third point doesn't follow and > is wrong. Because ldaxr/stlxr act as one-way barriers, the following > pseudo-code > > r1 = X > ldaxr LOCK > stlxr LOCK > r2 = Y > > can be observed, in my understanding of AArch64 memory model, as > > ldaxr LOCK > r1 = X > r2 = Y > stlxr LOCK > > and even as > > ldaxr LOCK > r2 = Y > r1 = X > stlxr LOCK > > The same is true for stores as well. > > So effectively the loop in UNLOCK can absorb memory operations from both > directions, which then can be reordered with each other (with a possible > exception of stores that follow UNLOCK, as I said above). > > Thanks, > Alexey
Powered by blists - more mailing lists
Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.