Message-ID: <CACT4Y+YcBeshE811w5KSyYpBqaQ3S_-aKanOGZcHCQvHWHc4Tg@mail.gmail.com>
Date: Mon, 11 Sep 2023 11:50:19 +0200
From: Dmitry Vyukov <dvyukov@...gle.com>
To: Jann Horn <jannh@...gle.com>
Cc: Andrey Ryabinin <ryabinin.a.a@...il.com>, Christoph Lameter <cl@...ux.com>, 
	Pekka Enberg <penberg@...nel.org>, David Rientjes <rientjes@...gle.com>, 
	Joonsoo Kim <iamjoonsoo.kim@....com>, Vlastimil Babka <vbabka@...e.cz>, 
	Alexander Potapenko <glider@...gle.com>, Andrey Konovalov <andreyknvl@...il.com>, 
	Vincenzo Frascino <vincenzo.frascino@....com>, Andrew Morton <akpm@...ux-foundation.org>, 
	Roman Gushchin <roman.gushchin@...ux.dev>, Hyeonggon Yoo <42.hyeyoo@...il.com>, 
	kasan-dev@...glegroups.com, linux-kernel@...r.kernel.org, linux-mm@...ck.org, 
	linux-hardening@...r.kernel.org, kernel-hardening@...ts.openwall.com
Subject: Re: [PATCH] slub: Introduce CONFIG_SLUB_RCU_DEBUG

On Mon, 28 Aug 2023 at 16:40, Jann Horn <jannh@...gle.com> wrote:
>
> On Sat, Aug 26, 2023 at 5:32 AM Dmitry Vyukov <dvyukov@...gle.com> wrote:
> > On Fri, 25 Aug 2023 at 23:15, Jann Horn <jannh@...gle.com> wrote:
> > > Currently, KASAN is unable to catch use-after-free in SLAB_TYPESAFE_BY_RCU
> > > slabs because use-after-free is allowed within the RCU grace period by
> > > design.
> > >
> > > Add a SLUB debugging feature which RCU-delays every individual
> > > kmem_cache_free() before either actually freeing the object or handing it
> > > off to KASAN, and change KASAN to poison freed objects as normal when this
> > > option is enabled.
> > >
> > > Note that this creates a 16-byte unpoisoned area in the middle of the
> > > slab metadata area, which kinda sucks but seems to be necessary in order
> > > to be able to store an rcu_head in there without triggering an ASAN
> > > splat during RCU callback processing.
> >
> > Nice!
> >
> > Can't we unpoison this rcu_head right before call_rcu() and repoison
> > after receiving the callback?
>
> Yeah, I think that should work. It looks like currently
> kasan_unpoison() is exposed in include/linux/kasan.h but
> kasan_poison() is not, and its inline definition probably means I
> can't just move it out of mm/kasan/kasan.h into include/linux/kasan.h;
> do you have a preference for how I should handle this? Hmm, and it
> also looks like code outside of mm/kasan/ anyway wouldn't know what
> are valid values for the "value" argument to kasan_poison().
> I also have another feature idea that would also benefit from having
> something like kasan_poison() available in include/linux/kasan.h, so I
> would prefer that over adding another special-case function inside
> KASAN for poisoning this piece of slab metadata...
>
> I guess I could define a wrapper around kasan_poison() in
> mm/kasan/generic.c that uses a new poison value for "some other part
> of the kernel told us to poison this area", and then expose that
> wrapper with a declaration in include/linux/kasan.h? Something like:
>
> void kasan_poison_outline(const void *addr, size_t size, bool init)
> {
>   kasan_poison(addr, size, KASAN_CUSTOM, init);
> }

Looks reasonable.
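
To make the unpoison/repoison ordering concrete, here is a rough userspace
model of it. This is not kernel code and not the patch: every model_* and
MODEL_* name is made up for illustration, the shadow array only imitates
KASAN shadow memory, and MODEL_POISON_CUSTOM is an arbitrary stand-in for
the proposed KASAN_CUSTOM value. It assumes the rcu_head sits in slab
metadata that is otherwise poisoned.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Userspace model, not kernel code. Every model_* / MODEL_* name is hypothetical. */
#define MODEL_OBJ_SIZE      64    /* object payload bytes */
#define MODEL_HEAD_SIZE     16    /* stands in for the rcu_head in slab metadata */
#define MODEL_POISON_CUSTOM 0xF0  /* arbitrary stand-in for the proposed KASAN_CUSTOM */

struct model_slot {
	/* One shadow byte per data byte; 0 means "accessible". */
	uint8_t shadow[MODEL_OBJ_SIZE + MODEL_HEAD_SIZE];
	bool queued;  /* true while the modelled RCU callback is pending */
};

void model_poison(struct model_slot *s, size_t off, size_t len)
{
	memset(s->shadow + off, MODEL_POISON_CUSTOM, len);
}

void model_unpoison(struct model_slot *s, size_t off, size_t len)
{
	memset(s->shadow + off, 0, len);
}

/* True only if every byte in [off, off + len) is unpoisoned. */
bool model_accessible(const struct model_slot *s, size_t off, size_t len)
{
	for (size_t i = 0; i < len; i++)
		if (s->shadow[off + i] != 0)
			return false;
	return true;
}

/* Allocated state: payload usable, metadata (including the head) poisoned. */
void model_slot_init(struct model_slot *s)
{
	model_unpoison(s, 0, MODEL_OBJ_SIZE);
	model_poison(s, MODEL_OBJ_SIZE, MODEL_HEAD_SIZE);
	s->queued = false;
}

/* Free path: open up the head just before the (modelled) call_rcu(). */
void model_free_deferred(struct model_slot *s)
{
	model_unpoison(s, MODEL_OBJ_SIZE, MODEL_HEAD_SIZE);
	s->queued = true;
}

/* RCU callback: repoison the head, then poison the payload as a normal free would. */
void model_rcu_callback(struct model_slot *s)
{
	s->queued = false;
	model_poison(s, MODEL_OBJ_SIZE, MODEL_HEAD_SIZE);
	model_poison(s, 0, MODEL_OBJ_SIZE);
}
```

In this model the head is accessible only between model_free_deferred() and
model_rcu_callback(), so no permanently unpoisoned hole is left in the
metadata.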

> > What happens on cache destruction?
> > Currently we purge quarantine on cache destruction to be able to
> > safely destroy the cache. I suspect we may need to somehow purge rcu
> > callbacks as well, or do something else.
>
> Ooh, good point, I hadn't thought about that... currently
> shutdown_cache() assumes that all the objects have already been freed,
> then puts the kmem_cache on a list for
> slab_caches_to_rcu_destroy_workfn(), which then waits with an
> rcu_barrier() until the slab's pages are all gone.

I guess this is what the test robot found as well.

> Luckily kmem_cache_destroy() is already a sleepable operation, so
> maybe I should just slap another rcu_barrier() in there for builds
> with this config option enabled... I think that should be fine for an
> option mostly intended for debugging.

This is definitely the simplest option.
I am a bit concerned about performance if massive cache destruction
happens (e.g. maybe during destruction of a set of namespaces for a
container). IIRC, net namespace destruction is slow for this reason, and
there were some optimizations to batch RCU synchronization. And now we
are adding more.
But I don't see any reasonably faster option either.
So I guess let's do this now and optimize later (or not).
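
For illustration, the ordering problem on cache destruction can be modelled
in userspace roughly as below. Again, this is a sketch under assumptions,
not kernel code: all model_* names are hypothetical, and model_rcu_barrier()
only drains this one cache's counter, whereas the real rcu_barrier() waits
for all previously queued callbacks system-wide.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model, not kernel code. Every model_* name is hypothetical. */
struct model_cache {
	int live_objects;       /* objects not yet returned to the slab */
	int pending_callbacks;  /* frees still queued behind a grace period */
	bool destroyed;
};

void model_cache_init(struct model_cache *c)
{
	c->live_objects = 0;
	c->pending_callbacks = 0;
	c->destroyed = false;
}

void model_cache_alloc(struct model_cache *c)
{
	c->live_objects++;
}

/* With RCU-delayed frees, the object stays live until its callback runs. */
void model_cache_free_deferred(struct model_cache *c)
{
	c->pending_callbacks++;
}

/* Stand-in for rcu_barrier(): run every pending callback for this cache. */
void model_rcu_barrier(struct model_cache *c)
{
	while (c->pending_callbacks > 0) {
		c->pending_callbacks--;
		c->live_objects--;  /* the callback performs the real free */
	}
}

/* Returns 0 on success, -1 if objects are still live at destruction time. */
int model_cache_destroy(struct model_cache *c, bool barrier_first)
{
	if (barrier_first)
		model_rcu_barrier(c);
	if (c->live_objects != 0)
		return -1;
	c->destroyed = true;
	return 0;
}
```

Without the barrier, a cache whose objects were all "freed" by the caller
still looks populated at destroy time; with it, destruction succeeds at the
cost of one extra wait per destroyed cache, which is the overhead discussed
above.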
