Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <87pnejf6fz.fsf@x220.int.ebiederm.org>
Date: Wed, 12 Feb 2020 22:37:52 -0600
From: ebiederm@...ssion.com (Eric W. Biederman)
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Al Viro <viro@...iv.linux.org.uk>,  LKML <linux-kernel@...r.kernel.org>,  Kernel Hardening <kernel-hardening@...ts.openwall.com>,  Linux API <linux-api@...r.kernel.org>,  Linux FS Devel <linux-fsdevel@...r.kernel.org>,  Linux Security Module <linux-security-module@...r.kernel.org>,  Akinobu Mita <akinobu.mita@...il.com>,  Alexey Dobriyan <adobriyan@...il.com>,  Andrew Morton <akpm@...ux-foundation.org>,  Andy Lutomirski <luto@...nel.org>,  Daniel Micay <danielmicay@...il.com>,  Djalal Harouni <tixxdz@...il.com>,  "Dmitry V . Levin" <ldv@...linux.org>,  Greg Kroah-Hartman <gregkh@...uxfoundation.org>,  Ingo Molnar <mingo@...nel.org>,  "J . Bruce Fields" <bfields@...ldses.org>,  Jeff Layton <jlayton@...chiereds.net>,  Jonathan Corbet <corbet@....net>,  Kees Cook <keescook@...omium.org>,  Oleg Nesterov <oleg@...hat.com>,  Solar Designer <solar@...nwall.com>
Subject: Re: [PATCH v8 07/11] proc: flush task dcache entries from all procfs instances

Linus Torvalds <torvalds@...ux-foundation.org> writes:

> On Wed, Feb 12, 2020 at 1:48 PM Eric W. Biederman <ebiederm@...ssion.com> wrote:
>>
>> The good news is proc_flush_task isn't exactly called from process exit.
>> proc_flush_task is called during zombie clean up. AKA release_task.
>
> Yeah, that at least avoids some of the nasty locking while dying debug problems.
>
> But the one I was more worried about was actually the lock contention
> issue with lots of processes. The lock is basically a single global
> lock in many situations - yes, it's technically per-ns, but in a lot
> of cases you really only have one namespace anyway.
>
> And we've had problems with global locks in this area before, notably
> the one you call out:
>
>> Further after proc_flush_task does it's thing the code goes
>> and does "write_lock_irq(&task_list_lock);"
>
> Yeah, so it's not introducing a new issue, but it is potentially
> making something we already know is bad even worse.
>
>> What would be downside of having a mutex for a list of proc superblocks?
>> A mutex that is taken for both reading and writing the list.
>
> That's what the original patch actually was, and I was hoping we could
> avoid that thing.
>
> An rwsem would be possibly better, since most cases by far are likely
> about reading.
>
> And yes, I'm very aware of the task_list_lock, but it's literally why
> I don't want to make a new one.
>
> I'm _hoping_ we can some day come up with something better than
> task_list_lock.

Yes.  I understand that.

I occassionally play with ideas, and converted all of proc to rcu
to help with situation but I haven't come up with anything clearly
better.


All of this is why I was really hoping we could have a change in
strategy and see if we can make the shrinker be able to better prune
proc inodes.



I think I have an alternate idea that could work.  Add some extra code
into proc_task_readdir, that would look for dentries that no longer
point to tasks and d_invalidate them.  With the same logic probably
being called from a few more places as well like proc_pid_readdir,
proc_task_lookup, and proc_pid_lookup.

We could even optimize it and have a process died flag we set in the
superblock.

That would would batch up the freeing work until the next time someone
reads from proc in a way that would create more dentries.  So it would
prevent dentries from reaped zombies from growing without bound.

Hmm.  Given the existence of proc_fill_cache it would really be a good
idea if readdir and lookup performed some of the freeing work as well.
As on readdir we always populate the dcache for all of the directory
entries.

I am booked solid for the next little while but if no one beats me to it
I will try and code something like that up where at least readdir
looks for and invalidates stale dentries.

Eric

Powered by blists - more mailing lists

Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.