kernel-hardening - Re: Understanding the JIT Hardening feature

Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <569CF813.5070306@iogearbox.net>
Date: Mon, 18 Jan 2016 15:34:59 +0100
From: Daniel Borkmann <daniel@...earbox.net>
To: elena.reshetova@...el.com
CC: kernel-hardening@...ts.openwall.com, 
 "Schaufler, Casey" <casey.schaufler@...el.com>,
 "Leibowitz, Michael" <michael.leibowitz@...el.com>
Subject: Re: Understanding the JIT Hardening feature

Hi Elena,

On 01/18/2016 10:22 AM, Reshetova, Elena wrote:
[...]
> I got to spend some time reading the 3.15 grsecurity patch with regards to
> JIT hardening feature and wanted to share my thoughts on what the patch was
> attempting to do.
>
> It seems that the bulk of changes was done in bpf_jit_compile() function
> (corresponds to the do_jit() 4.4 function).
>
> The way how it was done was to generate a random value (randkey =
> prandom_u32();) and then use that value to dilute (by xor with this value)
> the the four cases of operations:
>
> case BPF_S_ALU_MUL_K: /* A *= K */
>
> case BPF_S_ALU_MOD_K: /* A %= K; */
>
> case BPF_S_ALU_DIV_K: /* A /= K */
>
> case BPF_S_ALU_AND_X:

Hm, you mean BPF_S_ALU_AND_K I presume?

Yeah, that mitigation as implemented makes absolute sense to me. You blind at
least the 32bit constants away by xor with randkey.

[...]
> Another part of the patch was making changes to the bpf_alloc_binary()
> function. That part I don't really understand since it didn't seem to make
> any security improvements, but merely setting the header length to be 128 ad
> changing the bpf_binary_header pointer structure. If this change is to be
> moved to the latest kernel, modifications to the
> <http://lxr.free-electrons.com/ident?i=bpf_jit_binary_alloc>
> bpf_jit_binary_alloc() function (kernel/bpf/core.c) are needed as well as
> changes in  <http://lxr.free-electrons.com/ident?i=bpf_binary_header>
> bpf_binary_header pointer structure (include/linux/filter.h). But again, I
> do not understand what security improvements these changes make and why they
> were done at the first place.

I don't know the patch set deeply enough, but that part, I believe, is a PaX
specific simplification. It doesn't need the struct bpf_binary_header and can
thus remove this bit of code. The image is written with pax_open_kernel()/
pax_close_kernel() pair, and f.e. set_memory_ro() can be removed as PaX already
comes with native protections that handle such things.

> The last change from the patch was done in EMIT_COND_JMP() function (which
> in later kernels is included into emit_cond_jmp switch statement), which was
> adding a conditional jz into the flow based on the randkey value. It was
> affecting the following cases:
>
> case BPF_S_ALU_DIV_X: /* A /= X; */
> case BPF_S_ALU_MOD_X: /* A %= X; */
> case BPF_S_ANC_IFINDEX:

(... for the jumps in 'failure' cases.)

> as well as conditional branch.

Yeah, I think to catch bugs or issues with long jumps, so we effectively BUG
in case jump is off by few bytes. If you look at the kernel git history, there
was such a case/bug longer time ago that got fixed.

> The above doesn't seem to go very much in line with what Daniel suggested
> earlier:
>
> "We agreed that the way to go would be to try mitigating it on a BPF
> bytecode level iff feasible. For example, by expanding/rewriting things like
> loading constants into a i) load where the constant is xored with a (each
> time newly generated) prandom_u32()/.. value followed by ii) xor on the same
> reg with that prandom value itself."
>
> It is also very likely that I didn't understand what you mean Daniel. So
> some clarification questions:
>
> -          would you agree with the places where the original grsecurity
> patch attempts to add randomization or do you think places should be
> different?
>
> -          for the actual randomization, are you proposing to enhance it by
> not only xor to a prandom_u32() but also xor with the same reg? Could you
> explain what do you mean by this part?

To answer both, hopefully ...

I am simply saying that before we go and change every JIT there is in the
kernel, we should give it a try and rewrite BPF instructions first (so BPF
byte code, I mean). That way, JITs would /not/ need to be changed, and would
effectively emit something similar to the image /transparently/. So changes
need to be done in BPF core code instead of JITs. Underlying idea to blind
them out is the same, of course (/how/ you blind them is implementation detail).
If that path would not be feasible at all for some very good reasons, then we
can still look into changing the JITs themselves. If you look into the git
history of some of the JITs, there are some rather bad bugs that got fixed.
So, we should try hard to solve this on a generic level first, and if it
turns out to work well **, then we don't need to add further complexity to
each JIT eventually (x86 is just one, but there are arm, arm64, ppc, mips,
sparc, s390 as well).

  **: Also in terms of performance, what I mean by that is: iff due to all the
mitigation/complexity added, we turn out to become almost as slow as interpreter
case, then a disabling CONFIG_BPF_JIT in your config might simply be the better
mitigation option ;). I think we should still be faster, but it certainly
needs to be checked, too.

> I was also trying to find more info about how JIT code itself works, but
> wasn't able to find anything reasonable, so have to make all my statements
> from just reading the code, which turned out isn't the most easiest thing to
> understand for smbd not familiar with topic.  So, any pointers to the
> reading material, if exist, are very much appreciated.

Here's some rudimentary documentation on the topic in general (yeah, could
need some improvements):

   Documentation/networking/filter.txt

Cheers,
Daniel
Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.