Message-ID: <CAHk-=wiOS4Fi2tsXQrvLOiW69g4HiJYsqL6RPeTd14b4+2-Ykg@mail.gmail.com>
Date: Thu, 2 Apr 2020 11:06:02 -0700
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: "Eric W. Biederman" <ebiederm@...ssion.com>
Cc: Jann Horn <jannh@...gle.com>, Alan Stern <stern@...land.harvard.edu>,
	Andrea Parri <parri.andrea@...il.com>, Will Deacon <will@...nel.org>,
	Peter Zijlstra <peterz@...radead.org>, Boqun Feng <boqun.feng@...il.com>,
	Nicholas Piggin <npiggin@...il.com>, David Howells <dhowells@...hat.com>,
	Jade Alglave <j.alglave@....ac.uk>, Luc Maranget <luc.maranget@...ia.fr>,
	"Paul E. McKenney" <paulmck@...nel.org>, Akira Yokosawa <akiyks@...il.com>,
	Daniel Lustig <dlustig@...dia.com>, Adam Zabrocki <pi3@....com.pl>,
	kernel list <linux-kernel@...r.kernel.org>,
	Kernel Hardening <kernel-hardening@...ts.openwall.com>,
	Oleg Nesterov <oleg@...hat.com>, Andy Lutomirski <luto@...capital.net>,
	Bernd Edlinger <bernd.edlinger@...mail.de>, Kees Cook <keescook@...omium.org>,
	Andrew Morton <akpm@...ux-foundation.org>, stable <stable@...r.kernel.org>,
	Marco Elver <elver@...gle.com>, Dmitry Vyukov <dvyukov@...gle.com>,
	kasan-dev <kasan-dev@...glegroups.com>
Subject: Re: [PATCH] signal: Extend exec_id to 64bits

On Thu, Apr 2, 2020 at 6:14 AM Eric W. Biederman <ebiederm@...ssion.com> wrote:
>
> Linus Torvalds <torvalds@...ux-foundation.org> writes:
>
> > tasklist_lock is about the hottest lock there is in all of the kernel.
>
> Do you know code paths you see tasklist_lock being hot?

It's generally not bad enough to show up on single-socket machines.

But the problem with tasklist_lock is that it's one of our remaining
completely global locks. So it scales like sh*t in some circumstances.

On single-socket machines, most of the truly nasty hot paths aren't a
huge problem, because they tend to be mostly readers. So you get the
cacheline bounce, but you don't (usually) get much busy looping. The
cacheline bounce is "almost free" on a single socket.

But because it's one of those completely global locks, on big
multi-socket machines people have reported it as a problem forever.
Even just readers can cause problems (because of the cacheline
bouncing even when you just do the reader increment), but you also end
up having more issues with writers scaling badly.

Don't get me wrong - you can get bad scaling on other locks too, even
when they aren't really global - we had that with just the reference
counter increment for the user signal accounting, after all. Neither
of the reference counts was actually global, but they were just
effectively single counters under that particular load (ie the count
was per-user, but the load ran as a single user).

The reason tasklist_lock probably doesn't come up very much is that
it's _always_ been expensive. It has also caused some fundamental
issues (I think it's the main reason we have that rule that
reader-writer locks are unfair to readers, because we have readers
from interrupt context too, but can't afford to make normal readers
disable interrupts).

A lot of the tasklist lock readers end up looping quite a bit inside
the lock (looping over threads etc), which is why it can then be a big
deal when the rare reader shows up.

We've improved a _lot_ of those loops. That has definitely helped for
the common cases. But we've never been able to really fix the lock
itself.

                 Linus
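
The remark that even pure readers bounce the cacheline "when you just
do the reader increment" can be sketched with a toy lock. This is a
hypothetical user-space illustration using C11 atomics, not the
kernel's rwlock implementation: struct toy_rwlock, TOY_WRITER and all
of the toy_* helpers are made-up names, and the lock only offers
trylock semantics to keep the example short.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical toy lock for illustration; NOT the kernel's rwlock_t. */
#define TOY_WRITER 0x80000000u      /* high bit: a writer holds the lock */

struct toy_rwlock {
    atomic_uint state;              /* low bits: number of active readers */
};

static void toy_init(struct toy_rwlock *l)
{
    atomic_init(&l->state, 0);
}

/*
 * Reader fast path. Even when no writer exists anywhere, this is an
 * atomic read-modify-write on the shared lock word, so the acquiring
 * CPU has to pull the cacheline in exclusively. That store is the
 * "reader increment" that bounces the line between CPUs.
 */
static bool toy_read_trylock(struct toy_rwlock *l)
{
    unsigned int old = atomic_fetch_add_explicit(&l->state, 1,
                                                 memory_order_acquire);
    if (old & TOY_WRITER) {
        /* A writer holds the lock: undo our increment and back off. */
        atomic_fetch_sub_explicit(&l->state, 1, memory_order_relaxed);
        return false;
    }
    return true;
}

/* Dropping the read lock is a second write to the same cacheline. */
static void toy_read_unlock(struct toy_rwlock *l)
{
    unsigned int old = atomic_fetch_sub_explicit(&l->state, 1,
                                                 memory_order_release);
    assert((old & ~TOY_WRITER) > 0);   /* must have held a read lock */
}

/* A writer only gets in when there are no readers and no writer. */
static bool toy_write_trylock(struct toy_rwlock *l)
{
    unsigned int expected = 0;
    return atomic_compare_exchange_strong_explicit(&l->state, &expected,
                                                   TOY_WRITER,
                                                   memory_order_acquire,
                                                   memory_order_relaxed);
}

/* Clear only the writer bit so transient reader increments survive. */
static void toy_write_unlock(struct toy_rwlock *l)
{
    atomic_fetch_and_explicit(&l->state, ~TOY_WRITER, memory_order_release);
}

/* Snapshot of the reader count, for inspection only. */
static unsigned int toy_readers(struct toy_rwlock *l)
{
    return atomic_load_explicit(&l->state, memory_order_relaxed) & ~TOY_WRITER;
}
```

In this sketch every toy_read_trylock()/toy_read_unlock() pair writes
to `state` twice, so N CPUs that only ever read-lock still hand one
cacheline back and forth. On a single socket that transfer is cheap;
across sockets it is the part that stops scaling, before any writer is
involved at all.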