Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <202103211128.B59FEB91F@keescook>
Date: Sun, 21 Mar 2021 11:45:59 -0700
From: Kees Cook <keescook@...omium.org>
To: John Wood <john.wood@....com>
Cc: Jann Horn <jannh@...gle.com>, Randy Dunlap <rdunlap@...radead.org>,
	Jonathan Corbet <corbet@....net>, James Morris <jmorris@...ei.org>,
	Shuah Khan <shuah@...nel.org>, "Serge E. Hallyn" <serge@...lyn.com>,
	Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
	Andi Kleen <ak@...ux.intel.com>,
	kernel test robot <oliver.sang@...el.com>,
	linux-doc@...r.kernel.org, linux-kernel@...r.kernel.org,
	linux-security-module@...r.kernel.org,
	linux-kselftest@...r.kernel.org,
	kernel-hardening@...ts.openwall.com
Subject: Re: [PATCH v6 3/8] securtiy/brute: Detect a brute force attack

On Sun, Mar 21, 2021 at 04:01:18PM +0100, John Wood wrote:
> On Wed, Mar 17, 2021 at 07:57:10PM -0700, Kees Cook wrote:
> > On Sun, Mar 07, 2021 at 12:30:26PM +0100, John Wood wrote:
> > > +static u64 brute_update_crash_period(struct brute_stats *stats, u64 now)
> > > +{
> > > +	u64 current_period;
> > > +	u64 last_crash_timestamp;
> > > +
> > > +	spin_lock(&stats->lock);
> > > +	current_period = now - stats->jiffies;
> > > +	last_crash_timestamp = stats->jiffies;
> > > +	stats->jiffies = now;
> > > +
> > > +	stats->period -= brute_mul_by_ema_weight(stats->period);
> > > +	stats->period += brute_mul_by_ema_weight(current_period);
> > > +
> > > +	if (stats->faults < BRUTE_MAX_FAULTS)
> > > +		stats->faults += 1;
> > > +
> > > +	spin_unlock(&stats->lock);
> > > +	return last_crash_timestamp;
> > > +}
> >
> > Now *here* locking makes sense, and it only needs to be per-stat, not
> > global, since multiple processes may be operating on the same stat
> > struct. To make this more no-reader-locking-friendly, I'd also update
> > everything at the end, and use WRITE_ONCE():
> >
> > 	u64 current_period, period;
> > 	u64 last_crash_timestamp;
> > 	u64 faults;
> >
> > 	spin_lock(&stats->lock);
> > 	current_period = now - stats->jiffies;
> > 	last_crash_timestamp = stats->jiffies;
> >
> > 	WRITE_ONCE(stats->period,
> > 		   stats->period - brute_mul_by_ema_weight(stats->period) +
> > 		   brute_mul_by_ema_weight(current_period));
> >
> > 	if (stats->faults < BRUTE_MAX_FAULTS)
> > 		WRITE_ONCE(stats->faults, stats->faults + 1);
> >
> > 	WRITE_ONCE(stats->jiffies, now);
> >
> > 	spin_unlock(&stats->lock);
> > 	return last_crash_timestamp;
> >
> > That way readers can (IIUC) safely use READ_ONCE() on jiffies and faults
> > without needing to hold the &stats->lock (unless they need perfectly matching
> > jiffies, period, and faults).
> 
> Sorry, but I try to understand how to use locking properly without luck.
> 
> I have read (and tried to understand):
>    tools/memory-model/Documentation/simple.txt
>    tools/memory-model/Documentation/ordering.txt
>    tools/memory-model/Documentation/recipes.txt
>    Documentation/memory-barriers.txt
> 
> And I don't find the responses that I need. I'm not saying they aren't
> there but I don't see them. So my questions:
> 
> If in the above function makes sense to use locking, and it is called from
> the brute_task_fatal_signal hook, then, all the functions that are called
> from this hook need locking (more than one process can access stats at the
> same time).
> 
> So, as you point, how it is possible and safe to read jiffies and faults
> (and I think period even though you not mention it) using READ_ONCE() but
> without holding brute_stats::lock? I'm very confused.

There are, I think, 3 considerations:

- is "stats", itself, a valid allocation in kernel memory? This is the
  "lifetime" management of the structure: it will only stay allocated as
  long as there is a task still alive that is attached to it. The use of
  refcount_t on task creation/death should entirely solve this issue, so
  that all the other places where you access "stats", the memory will be
  valid. AFAICT, this one is fine: you're doing all the correct lifetime
  management.

- changing a task's stats pointer: this is related to lifetime
  management, but it, I think, entirely solved by the existing
  refcounting. (And isn't helped by holding stats->lock since this is
  about stats itself being a valid pointer.) Again, I think this is all
  correct already in your existing code (due to the implicit locking of
  "current"). Perhaps I've missed something here, but I guess we'll see!

- are the values in stats getting written by multiple writers, or read
  during a write, etc?

This last one is the core of what I think could be improved here:

To keep the writes serialized, you (correctly) perform locking in the
writers. This is fine.

There is also locking in the readers, which I think is not needed.
AFAICT, READ_ONCE() (with WRITE_ONCE() in the writers) is sufficient for
the readers here.

> IIUC (during the reading of the documentation) READ_ONCE and WRITE_ONCE only
> guarantees that a variable loaded with WRITE_ONCE can be read safely with
> READ_ONCE avoiding tearing, etc. So, I see these functions like a form of
> guarantee atomicity in variables.

Right -- from what I can see about how you're reading the statistics, I
don't see a way to have the values get confused (assuming locked writes
and READ/WRITE_ONCE()).

> Another question. Is it also safe to use WRITE_ONCE without holding the lock?
> Or this is only appliable to read operations?

No -- you'll still want the writer locked since you update multiple fields
in stats during a write, so you could miss increments, or interleave
count vs jiffies writes, etc. But the WRITE_ONCE() makes sure that the
READ_ONCE() readers will see a stable value (as I understand it), and
in the order they were written.

> Any light on this will help me to do the best job in the next patches. If
> somebody can point me to the right direction it would be greatly appreciated.
> 
> Is there any documentation for newbies regarding this theme? I'm stuck.
> I have also read the documentation about spinlocks, semaphores, mutex, etc..
> but nothing clears me the concept expose.
> 
> Apologies if this question has been answered in the past. But the search in
> the mailing list has not been lucky.

It's a complex subject! Here are some other docs that might help:

tools/memory-model/Documentation/explanation.txt
Documentation/core-api/refcount-vs-atomic.rst

or they may melt your brain further! :) I know mine is always mushy
after reading them.

> Thanks for your time and patience.

You're welcome; and thank you for your work on this! I've wanted a robust
brute force mitigation in the kernel for a long time. :)

-Kees

-- 
Kees Cook

Powered by blists - more mailing lists

Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.