Message-ID: <20190329023456.GB194158@google.com>
Date: Thu, 28 Mar 2019 22:34:56 -0400
From: Joel Fernandes <joel@...lfernandes.org>
To: Oleg Nesterov <oleg@...hat.com>
Cc: Jann Horn <jannh@...gle.com>, Kees Cook <keescook@...omium.org>,
	"Eric W. Biederman" <ebiederm@...ssion.com>,
	LKML <linux-kernel@...r.kernel.org>,
	Android Kernel Team <kernel-team@...roid.com>,
	Kernel Hardening <kernel-hardening@...ts.openwall.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Matthew Wilcox <willy@...radead.org>,
	Michal Hocko <mhocko@...e.com>,
	"Reshetova, Elena" <elena.reshetova@...el.com>
Subject: Re: [PATCH] Convert struct pid count to refcount_t

On Thu, Mar 28, 2019 at 10:39:58AM -0400, Joel Fernandes wrote:
> On Thu, Mar 28, 2019 at 03:26:19PM +0100, Oleg Nesterov wrote:
> > On 03/27, Joel Fernandes wrote:
> > >
> > > > Also, based on Kees's comment, it appears to me that get_pid and
> > > > put_pid can race in this way in the original code, right?
> > >
> > > get_pid			put_pid
> > >
> > > 			atomic_dec_and_test returns 1
> > > atomic_inc
> > > 			kfree
> > >
> > > deref pid /* boom */
> > > -------------------------------------------------
> > >
> > > I think get_pid needs to call atomic_inc_not_zero()
> > 
> > No.
> > 
> > get_pid() should only be used if you already have a reference or you do
> > something like
> > 
> > 	rcu_read_lock();
> > 	pid = find_vpid();
> > 	get_pid();
> > 	rcu_read_unlock();
> > 
> > in this case we rely on call_rcu(delayed_put_pid) which drops the initial
> > reference.
> > 
> > If put_pid() sees pid->count == 1, then a) nobody else has a reference and
> > b) nobody else can find this pid on rcu-protected lists, so it is safe to
> > free it.
> 
> I agree. See my reply to Jann, where I already covered this. Thanks!
> 
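
For illustration, the inc-not-zero semantics suggested in the quoted text can
be sketched in user-space C11 atomics. This is a minimal sketch under stated
assumptions, not the kernel's atomic_inc_not_zero(): struct obj and
obj_get_not_zero() are hypothetical names, and only the counter is modelled.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for a refcounted object. */
struct obj {
	atomic_int count;
};

/*
 * Take a reference only if the count is not already zero.
 * Returns true on success, false if the last reference is already gone.
 */
bool obj_get_not_zero(struct obj *o)
{
	int old = atomic_load(&o->count);

	while (old != 0) {
		/* On failure, 'old' is refreshed with the current value. */
		if (atomic_compare_exchange_weak(&o->count, &old, old + 1))
			return true;
	}
	return false;
}
```

As the quoted reply explains, get_pid() does not need this check when the
caller already holds a reference or looks the pid up under rcu_read_lock().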

Also Oleg, why not just call refcount_dec_and_test like below? If count is 1,
then it will decrement to 0 and return true anyway. Is this because we want
to avoid writes at the cost of more reads? Did I miss something? Thank you.

I don't remember very clearly, but I think Kees also asked about the same thing.

diff --git a/kernel/pid.c b/kernel/pid.c
index 2095c7da644d..89c4849fab5d 100644
--- a/kernel/pid.c
+++ b/kernel/pid.c
@@ -106,8 +106,7 @@ void put_pid(struct pid *pid)
 		return;
 
 	ns = pid->numbers[pid->level].ns;
-	if ((refcount_read(&pid->count) == 1) ||
-	     refcount_dec_and_test(&pid->count)) {
+	if (refcount_dec_and_test(&pid->count)) {
 		kmem_cache_free(ns->pid_cachep, pid);
 		put_pid_ns(ns);
 	}
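
To make the question concrete, the two variants of the check in the diff above
can be modelled in user-space C11 atomics. This is a rough sketch, not kernel
code: struct fake_pid, put_fast_path() and put_always_dec() are hypothetical
names, and the saturation and warning behaviour of refcount_t is not modelled.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct pid; only the counter is modelled. */
struct fake_pid {
	atomic_int count;
};

/*
 * Variant on the '-' lines: if the count reads as 1, report "free it"
 * without issuing an atomic read-modify-write.
 */
bool put_fast_path(struct fake_pid *p)
{
	if (atomic_load(&p->count) == 1)
		return true;	/* count is left at 1, nothing is stored */

	/* atomic_fetch_sub() returns the value held before the decrement. */
	return atomic_fetch_sub(&p->count, 1) == 1;
}

/* Variant on the '+' line: always decrement, free when it reaches zero. */
bool put_always_dec(struct fake_pid *p)
{
	return atomic_fetch_sub(&p->count, 1) == 1;
}
```

In this model both variants return the same answer for any starting count. They
differ only when the count is 1: the first leaves the counter untouched, while
the second stores 0 before returning true.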
