kernel-hardening - Re: Re: [PATCH] x86/refcount: Implement fast refcount

Follow @Openwall on Twitter for new release announcements and other news

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20170424092051.ooww4tngcu5274qz@hirez.programming.kicks-ass.net>
Date: Mon, 24 Apr 2017 11:20:51 +0200
From: Peter Zijlstra <peterz@...radead.org>
To: Jann Horn <jannh@...gle.com>
Cc: Kees Cook <keescook@...omium.org>, linux-kernel@...r.kernel.org,
	Eric Biggers <ebiggers3@...il.com>,
	Christoph Hellwig <hch@...radead.org>,
	"axboe@...nel.dk" <axboe@...nel.dk>,
	James Bottomley <James.Bottomley@...senpartnership.com>,
	Elena Reshetova <elena.reshetova@...el.com>,
	Hans Liljestrand <ishkamiel@...il.com>,
	David Windsor <dwindsor@...il.com>, x86@...nel.org,
	Ingo Molnar <mingo@...nel.org>, Arnd Bergmann <arnd@...db.de>,
	Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
	Jann Horn <jann@...jh.net>, "David S. Miller" <davem@...emloft.net>,
	linux-arch@...r.kernel.org, kernel-hardening@...ts.openwall.com,
	PaX Team <pageexec@...email.hu>
Subject: Re: Re: [PATCH] x86/refcount: Implement fast
 refcount_t handling

On Mon, Apr 24, 2017 at 10:53:30AM +0200, Jann Horn wrote:
> On Mon, Apr 24, 2017 at 10:32 AM, Peter Zijlstra <peterz@...radead.org> wrote:
> > On Fri, Apr 21, 2017 at 03:09:39PM -0700, Kees Cook wrote:
> >> This patch ports the x86-specific atomic overflow handling from PaX's
> >> PAX_REFCOUNT to the upstream refcount_t API. This is an updated version
> >> from PaX that eliminates the saturation race condition by resetting the
> >> atomic counter back to the INT_MAX saturation value on both overflow and
> >> underflow. To win a race, a system would have to have INT_MAX threads
> >> simultaneously overflow before the saturation handler runs.
> >
> > And is this impossible? Highly unlikely I'll grant you, but absolutely
> > impossible?
> >
> > Also, you forgot nr_cpus in your bound. Afaict the worst case here is
> > O(nr_tasks + 3*nr_cpus).
> >
> > Because PaX does it, is not a correctness argument. And this really
> > wants one.
> 
> From include/linux/threads.h:
> 
> /*
>  * A maximum of 4 million PIDs should be enough for a while.
>  * [NOTE: PID/TIDs are limited to 2^29 ~= 500+ million, see futex.h.]
>  */
> #define PID_MAX_LIMIT (CONFIG_BASE_SMALL ? PAGE_SIZE * 8 : \
> (sizeof(long) > 4 ? 4 * 1024 * 1024 : PID_MAX_DEFAULT))
> 
> AFAICS that means you can only have up to 2^22 running tasks at
> a time, since every running task requires a PID in the init pid namespace.

That's but an artificial limit and bound to be increased sooner rather
than later, given the trend in building ever larger machines.

If we go with the futex limit on the full tid space, we end up with 2^30
(the comment here about 29 is wrong, see FUTEX_TID_MASK).

We then still have to add a multiple of nr_cpus. Granted, that will
still be slightly short of 3^31, but not to any amount I'm really
comfortable with, we're _really_ close.

Point remains though; Changelog (and arguably the code itself) should
very much include such bits.

Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.