kernel-hardening - Re: [RFC] x86/mm/KASLR: Remap GDTs at fixed location

Message-ID: <CALCETrXgabhubUXMiTP5AgSASMkkzG+bYFaSoPt52QBZLm-PVg@mail.gmail.com>
Date: Fri, 6 Jan 2017 15:39:13 -0800
From: Andy Lutomirski <luto@...nel.org>
To: Thomas Garnier <thgarnie@...gle.com>
Cc: Andy Lutomirski <luto@...nel.org>, Ingo Molnar <mingo@...nel.org>, 
	Arjan van de Ven <arjan@...ux.intel.com>, Thomas Gleixner <tglx@...utronix.de>, 
	Ingo Molnar <mingo@...hat.com>, "H . Peter Anvin" <hpa@...or.com>, Kees Cook <keescook@...omium.org>, 
	Borislav Petkov <bp@...en8.de>, Dave Hansen <dave@...1.net>, Chen Yucong <slaoub@...il.com>, 
	Paul Gortmaker <paul.gortmaker@...driver.com>, Andrew Morton <akpm@...ux-foundation.org>, 
	Masahiro Yamada <yamada.masahiro@...ionext.com>, 
	Sebastian Andrzej Siewior <bigeasy@...utronix.de>, Anna-Maria Gleixner <anna-maria@...utronix.de>, 
	Boris Ostrovsky <boris.ostrovsky@...cle.com>, Rasmus Villemoes <linux@...musvillemoes.dk>, 
	Michael Ellerman <mpe@...erman.id.au>, Juergen Gross <jgross@...e.com>, 
	Richard Weinberger <richard@....at>, X86 ML <x86@...nel.org>, 
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>, 
	"kernel-hardening@...ts.openwall.com" <kernel-hardening@...ts.openwall.com>
Subject: Re: [RFC] x86/mm/KASLR: Remap GDTs at fixed location

On Fri, Jan 6, 2017 at 2:54 PM, Thomas Garnier <thgarnie@...gle.com> wrote:
> On Fri, Jan 6, 2017 at 1:59 PM, Andy Lutomirski <luto@...nel.org> wrote:
>> On Fri, Jan 6, 2017 at 10:03 AM, Thomas Garnier <thgarnie@...gle.com> wrote:
>>> On Thu, Jan 5, 2017 at 10:49 PM, Ingo Molnar <mingo@...nel.org> wrote:
>>>>
>>>> * Thomas Garnier <thgarnie@...gle.com> wrote:
>>>>
>>>>> >> Not sure I fully understood and I don't want to miss an important point. Do
>>>>> >> you mean making GDT (remapping and per-cpu) read-only and switch the
>>>>> >> writeable flag only when we write to the per-cpu entry?
>>>>> >
>>>>> > What I mean is: write to the GDT through normal percpu access (or whatever the
>>>>> > normal mapping is) but load a read-only alias into the GDT register.  As long
>>>>> > as nothing ever tries to write through the GDTR alias, no page faults will be
>>>>> > generated.  So we just need to make sure that nothing ever writes to it
>>>>> > through GDTR.  AFAIK the only reason the CPU ever writes to the address in
>>>>> > GDTR is to set an accessed bit.
>>>>>
>>>>> A write is made when we use load_TR_desc (ltr). I didn't see any other yet.
>>>>
>>>> Is this write to the GDT, generated by the LTR instruction, done unconditionally
>>>> by the hardware?
>>>>
>>>
>>> That was my experience. I didn't look into details. Do you think we
>>> could change something so that ltr never writes to the GDT? (just mark
>>> the TSS entry busy).
>>
>> No, and I had the way this worked on 64-bit wrong.  LTR requires an
>> available TSS and changes it to busy.  So here are my thoughts on how
>> this should work:
>>
>> Let's get rid of any connection between this code and KASLR.  Every
>> time KASLR makes something work differently, a kitten turns all
>> Schrödinger on us.  This is moving the GDT to the fixmap, plain and
>> simple.  For now, make it one page per CPU and don't worry about the
>> GDT limit.
>
> I am all for this change but that's more significant.
>
> Ingo: What do you think about that?
>
>>
>> On 32-bit, we're going to have to make the fixmap GDT be read-write
>> because making it read-only will break double-fault handling.
>>
>> On 64-bit, we can use your trick of temporarily mapping the GDT
>> read-write every time we load TR, which should happen very rarely.
>> Alternatively, we can reload the *GDT* every time we reload TR, which
>> should be comparably slow.  This is going to regress performance in
>> the extremely rare case where KVM exits to a process that uses
>> ioperm() (I think), but I doubt anyone cares.  Or maybe we could
>> arrange to never reload TR when GDT points at the fixmap by having KVM
>> set the host GDT to the direct version and letting KVM's code to
>> reload the GDT switch to the fixmap copy.
>>
>> If we need a quirk to keep the fixmap copy read-write, so be it.
>>
>> None of this should depend on KASLR.  IMO it should happen unconditionally.
>>
>
> I looked back at the fixmap, and I can see a way it could be done
> (using NR_CPUS) like the other fixmap ranges. It would limit the
> number of cpus to 512 (there is 2M memory left on fixmap on the
> default configuration). That's if we never add any other fixmap on
> x64. I don't know if it is an acceptable number and if the fixmap
> region could be increased. (128 if we do your kvm trick, of course).
>

IIRC we need 4096 CPUs.  But that 2M limit seems eminently fixable.  I
just tried sticking 4096 pages of nothing right near the top of the
fixmap and the only problem I saw was that I had to move MODULES_END
down a little bit.
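
To spell out the arithmetic behind those numbers, here is a rough sketch
assuming 4 KiB pages and one GDT page per CPU.  The helper names are made
up for illustration and are not anything in the kernel:

```c
#include <stddef.h>

/* Assumption: 4 KiB pages, as on x86. */
#define PAGE_SIZE_BYTES 4096UL

/* Hypothetical helper: number of one-page-per-CPU GDT slots in a region. */
static unsigned long gdt_slots_in(unsigned long region_bytes)
{
	return region_bytes / PAGE_SIZE_BYTES;
}

/* Hypothetical helper: fixmap bytes needed for one GDT page per CPU. */
static unsigned long gdt_fixmap_bytes(unsigned long nr_cpus)
{
	return nr_cpus * PAGE_SIZE_BYTES;
}
```

With these, gdt_slots_in(2UL << 20) gives the 512-CPU figure quoted above,
and gdt_fixmap_bytes(4096) comes to 16 MiB, which is the amount of fixmap
space that 4096 CPUs would need under this layout.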

--Andy

P.S. Let's do the move to the fixmap, with the mapping still read/write,
as a separate patch.  That will make bisecting much easier.
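
For reference, the LTR behaviour discussed in the quoted text (it requires
an available TSS descriptor and marks it busy) can be modelled in plain C
roughly as below.  This is only a user-space sketch of the descriptor type
field, not kernel code; the function names are invented for illustration,
and only the low 8 bytes of the 16-byte 64-bit TSS descriptor are modelled:

```c
#include <stdint.h>

/* The 4-bit type field sits at bits 40..43 of the descriptor's low qword. */
#define DESC_TYPE_SHIFT 40
#define DESC_TYPE_MASK  (0xFULL << DESC_TYPE_SHIFT)
#define TSS_AVAILABLE   0x9ULL	/* available TSS */
#define TSS_BUSY        0xBULL	/* busy TSS: available plus bit 1 of type */

/* Extract the descriptor type from the low qword. */
static unsigned int desc_type(uint64_t desc_low)
{
	return (unsigned int)((desc_low >> DESC_TYPE_SHIFT) & 0xF);
}

/*
 * Hypothetical model of LTR: fails (the CPU would raise #GP) unless the
 * TSS is available, otherwise writes the busy type into the descriptor.
 * That write is why a read-only GDT alias faults on LTR.
 */
static int model_ltr(uint64_t *desc_low)
{
	if (desc_type(*desc_low) != TSS_AVAILABLE)
		return -1;
	*desc_low = (*desc_low & ~DESC_TYPE_MASK) | (TSS_BUSY << DESC_TYPE_SHIFT);
	return 0;
}

/*
 * Hypothetical helper: reset the descriptor to "available" through a
 * writable mapping, which has to happen before TR can be loaded again.
 */
static void mark_tss_available(uint64_t *desc_low)
{
	*desc_low = (*desc_low & ~DESC_TYPE_MASK) |
		    (TSS_AVAILABLE << DESC_TYPE_SHIFT);
}
```

In this model a second model_ltr() on the same descriptor fails until
mark_tss_available() has been called.  That matches the point made earlier
in the thread: simply pre-marking the entry busy would not let LTR skip
the write, because LTR rejects a busy TSS.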