kernel-hardening - Re: [PATCH 10/17] prmem: documentation

Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CALCETrXNckT9W288VXx-6inO5qYn-6dUPocKGcBT0GCO3xi3ZQ@mail.gmail.com>
Date: Tue, 13 Nov 2018 09:16:09 -0800
From: Andy Lutomirski <luto@...capital.net>
To: Igor Stoppa <igor.stoppa@...il.com>
Cc: Kees Cook <keescook@...omium.org>, Peter Zijlstra <peterz@...radead.org>, 
	Nadav Amit <nadav.amit@...il.com>, Mimi Zohar <zohar@...ux.vnet.ibm.com>, 
	Matthew Wilcox <willy@...radead.org>, Dave Chinner <david@...morbit.com>, 
	James Morris <jmorris@...ei.org>, Michal Hocko <mhocko@...nel.org>, 
	Kernel Hardening <kernel-hardening@...ts.openwall.com>, 
	linux-integrity <linux-integrity@...r.kernel.org>, 
	LSM List <linux-security-module@...r.kernel.org>, 
	Igor Stoppa <igor.stoppa@...wei.com>, Dave Hansen <dave.hansen@...ux.intel.com>, 
	Jonathan Corbet <corbet@....net>, Laura Abbott <labbott@...hat.com>, 
	Randy Dunlap <rdunlap@...radead.org>, Mike Rapoport <rppt@...ux.vnet.ibm.com>, 
	"open list:DOCUMENTATION" <linux-doc@...r.kernel.org>, LKML <linux-kernel@...r.kernel.org>, 
	Thomas Gleixner <tglx@...utronix.de>
Subject: Re: [PATCH 10/17] prmem: documentation

On Tue, Nov 13, 2018 at 6:25 AM Igor Stoppa <igor.stoppa@...il.com> wrote:
>
> Hello,
> I've been studying v4 of the patch-set [1] that Nadav has been working on.
> Incidentally, I think it would be useful to cc also the
> security/hardening ml.
> The patch-set seems to be close to final, so I am resuming this discussion.
>
> On 30/10/2018 19:06, Andy Lutomirski wrote:
>
> > I support the addition of a rare-write mechanism to the upstream kernel.  And I think that there is only one sane way to implement it: using an mm_struct. That mm_struct, just like any sane mm_struct, should only differ from init_mm in that it has extra mappings in the *user* region.
>
> After reading the code, I see what you meant.
> I think I can work with it.
>
> But I have a couple of questions wrt the use of this mechanism, in the
> context of write rare.
>
>
> 1) mm_struct.
>
> Iiuc, the purpose of the patchset is mostly (only?) to patch kernel code
> (live patch?), which seems to happen sequentially and in a relatively
> standardized way, like replacing the NOPs specifically placed in the
> functions that need patching.
>
> This is a bit different from the more generic write-rare case, applied
> to data.
>
> As example, I have in mind a system where both IMA and SELinux are in use.
>
> In this system, a file is accessed for the first time.
>
> That would trigger 2 things:
> - evaluation of the SELinux rules and probably update of the AVC cache
> - IMA measurement and update of the measurements
>
> Both of them could be write protected, meaning that they would both have
> to be modified through the write rare mechanism.
>
> While the events, for 1 specific file, would be sequential, it's not
> difficult to imagine that multiple files could be accessed at the same time.
>
> If the update of the data structures in both IMA and SELinux must use
> the same mm_struct, that would have to be somehow regulated and it would
> introduce an unnecessary (imho) dependency.
>
> How about having one mm_struct for each writer (core or thread)?
>

I don't think that helps anything.  I think the mm_struct used for
prmem (or rare_write or whatever you want to call it) should be
entirely abstracted away by an appropriate API, so neither SELinux nor
IMA need to be aware that there's an mm_struct involved.  It's also
entirely possible that some architectures won't even use an mm_struct
behind the scenes -- x86, for example, could have avoided it if there
were a kernel equivalent of PKRU.  Sadly, there isn't.

>
>
> 2) Iiuc, the purpose of the 2 pages being remapped is that the target of
> the patch might spill across the page boundary, however if I deal with
> the modification of generic data, I shouldn't (shouldn't I?) assume that
> the data will not span across multiple pages.

The reason for the particular architecture of text_poke() is to avoid
memory allocation to get it working.  i think that prmem/rare_write
should have each rare-writable kernel address map to a unique user
address, possibly just by offsetting everything by a constant.  For
rare_write, you don't actually need it to work as such until fairly
late in boot, since the rare_writable data will just be writable early
on.

>
> If the data spans across multiple pages, in unknown amount, I suppose
> that I should not keep interrupts disabled for an unknown time, as it
> would hurt preemption.
>
> What I thought, in my initial patch-set, was to iterate over each page
> that must be written to, in a loop, re-enabling interrupts in-between
> iterations, to give pending interrupts a chance to be served.
>
> This would mean that the data being written to would not be consistent,
> but it's a problem that would have to be addressed anyways, since it can
> be still read by other cores, while the write is ongoing.

This probably makes sense, except that enabling and disabling
interrupts means you also need to restore the original mm_struct (most
likely), which is slow.  I don't think there's a generic way to check
whether in interrupt is pending without turning interrupts on.
Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.