Message-ID: <CA+55aFx480bxx7VAmFqdsVGHjoSav4eCvVpcx5ZSpBQuq+=1Mw@mail.gmail.com>
Date: Wed, 22 Jun 2016 23:02:21 -0700
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: Andy Lutomirski <luto@...capital.net>, Oleg Nesterov <oleg@...hat.com>
Cc: Andy Lutomirski <luto@...nel.org>,
	"the arch/x86 maintainers" <x86@...nel.org>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	"linux-arch@...r.kernel.org" <linux-arch@...r.kernel.org>,
	Borislav Petkov <bp@...en8.de>, Nadav Amit <nadav.amit@...il.com>,
	Kees Cook <keescook@...omium.org>, Brian Gerst <brgerst@...il.com>,
	"kernel-hardening@...ts.openwall.com" <kernel-hardening@...ts.openwall.com>,
	Josh Poimboeuf <jpoimboe@...hat.com>, Jann Horn <jann@...jh.net>,
	Heiko Carstens <heiko.carstens@...ibm.com>
Subject: Re: [PATCH v3 00/13] Virtually mapped stacks with guard pages (x86, core)

On Wed, Jun 22, 2016 at 6:22 PM, Andy Lutomirski <luto@...capital.net> wrote:
>
> I implemented a percpu cache, and it's useless.
>
> When a task goes away, one reference is held until the next RCU grace
> period so that task_struct can be used under RCU (look for
> delayed_put_task_struct).

Yeah, that RCU batching will screw the cache idea.

But isn't it only the "task_struct" that needs that? That's a separate
allocation from the stack, which contains the "thread_info".

I think that what we *could* do is re-use the thread-info within the
RCU grace period, as long as we delay freeing the task_struct.

Yes, yes, we currently tie the task_struct and thread_info lifetimes
together very tightly, but that's a historical thing rather than a
requirement. We do the

        account_kernel_stack(tsk->stack, -1);
        arch_release_thread_info(tsk->stack);
        free_thread_info(tsk->stack);

in free_task(), but I could imagine doing it earlier, and
independently of the RCU-delayed free.

In fact, I think we just do that at exit() time synchronously.
The reference counting of the task_struct() is because a lot of other
threads can have references to the exiting thread (and we have the
tasklist and thread lists that are RCU-traversed), but none of those
other references should ever look at the stack. Or even the
thread-info.

Hmm. I bet it would show some problems, but not be technically
impossible. Especially if we make the thread-info rules be like the
SLAB_DESTROY_BY_RCU semantics - the allocation may be re-used during
the RCU grace period, but it is going to still exist and be of the
same type.

This sounds very much like something for Oleg Nesterov. Oleg, what do
you think? Would it be reasonable to free the stack and thread_info
synchronously at exit time, clear the pointer (to catch any odd use),
and only RCU-delay the task_struct itself?

That is, after all, what we already do with the VM, semaphores, files,
fs info etc. There's no real reason I see to keep the stack around.

(Obviously, we can't release it in do_exit() itself like we do some of
the other state - it would need to be released after we've scheduled
away to another process' stack, but we already have that TASK_DEAD
handling in finish_task_switch for this exact reason).

              Linus
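
The lifetime split proposed above can be illustrated with a small
userspace toy. This is only a sketch under loud assumptions: it is not
kernel code, every name in it (toy_task, toy_task_exit,
toy_grace_period_end, the one-entry cached_stack) is hypothetical, and
the "grace period" is just an explicit function call rather than real
RCU. It shows the stack being released and its pointer cleared as soon
as the task is dead, while the task struct itself stays readable until
the deferred free runs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define TOY_STACK_SIZE 16384    /* arbitrary size chosen for the sketch */

/* Hypothetical stand-in for task_struct; not the kernel's definition. */
struct toy_task {
    int pid;
    void *stack;                 /* stands in for the stack + thread_info */
    struct toy_task *next_dead;  /* link on the deferred-free list */
};

static void *cached_stack;          /* one-entry stack cache (assumption) */
static struct toy_task *dead_list;  /* structs waiting for the grace period */

struct toy_task *toy_task_create(int pid)
{
    struct toy_task *t = malloc(sizeof(*t));

    if (!t)
        return NULL;
    /* Prefer the cached stack; its previous owner's struct may still
     * be sitting on dead_list, which is the whole point. */
    if (cached_stack) {
        t->stack = cached_stack;
        cached_stack = NULL;
    } else {
        t->stack = malloc(TOY_STACK_SIZE);
    }
    if (!t->stack) {
        free(t);
        return NULL;
    }
    t->pid = pid;
    t->next_dead = NULL;
    return t;
}

/* Models the point after the dead task has been switched away from. */
void toy_task_exit(struct toy_task *t)
{
    assert(t->stack != NULL);
    if (!cached_stack)
        cached_stack = t->stack;   /* stack is reusable immediately */
    else
        free(t->stack);
    t->stack = NULL;               /* clear the pointer to catch odd use */

    t->next_dead = dead_list;      /* only the struct is delayed */
    dead_list = t;
}

/* Models the end of a grace period; returns how many structs it freed. */
int toy_grace_period_end(void)
{
    int freed = 0;

    while (dead_list) {
        struct toy_task *t = dead_list;

        dead_list = t->next_dead;
        free(t);
        freed++;
    }
    return freed;
}

/* Walks through the scenario; returns 0 if everything behaved as described. */
int toy_demo(void)
{
    struct toy_task *a, *b;
    void *a_stack;
    int ok;

    a = toy_task_create(1);
    if (!a)
        return -1;
    a_stack = a->stack;
    toy_task_exit(a);              /* stack released, struct deferred */

    b = toy_task_create(2);        /* created before any grace period */
    if (!b)
        return -1;

    /* a's struct is still valid, its stack pointer is cleared, and b
     * is already running on the recycled stack. */
    ok = (a->stack == NULL) && (a->pid == 1) && (b->stack == a_stack);

    toy_task_exit(b);
    ok = ok && (toy_grace_period_end() == 2);
    return ok ? 0 : 1;
}
```

Calling toy_demo() from a main() exercises the whole sequence. The toy
leaves out everything that makes the real problem hard (concurrent
readers, accounting, the per-arch thread_info hooks); it only shows the
ordering of the two frees.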