Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <0fa7d888-c4fd-aeb3-db08-151ea648558d@linux.microsoft.com>
Date: Sun, 2 Aug 2020 17:58:59 -0500
From: "Madhavan T. Venkataraman" <madvenka@...ux.microsoft.com>
To: Andy Lutomirski <luto@...nel.org>
Cc: Kernel Hardening <kernel-hardening@...ts.openwall.com>,
 Linux API <linux-api@...r.kernel.org>,
 linux-arm-kernel <linux-arm-kernel@...ts.infradead.org>,
 Linux FS Devel <linux-fsdevel@...r.kernel.org>,
 linux-integrity <linux-integrity@...r.kernel.org>,
 LKML <linux-kernel@...r.kernel.org>,
 LSM List <linux-security-module@...r.kernel.org>,
 Oleg Nesterov <oleg@...hat.com>, X86 ML <x86@...nel.org>
Subject: Re: [PATCH v1 0/4] [RFC] Implement Trampoline File Descriptor



On 8/2/20 3:00 PM, Andy Lutomirski wrote:
> On Sun, Aug 2, 2020 at 11:54 AM Madhavan T. Venkataraman
> <madvenka@...ux.microsoft.com> wrote:
>> More responses inline..
>>
>> On 7/28/20 12:31 PM, Andy Lutomirski wrote:
>>>> On Jul 28, 2020, at 6:11 AM, madvenka@...ux.microsoft.com wrote:
>>>>
>>>> From: "Madhavan T. Venkataraman" <madvenka@...ux.microsoft.com>
>>>>
>>> 2. Use existing kernel functionality.  Raise a signal, modify the
>>> state, and return from the signal.  This is very flexible and may not
>>> be all that much slower than trampfd.
>> Let me understand this. You are saying that the trampoline code
>> would raise a signal and, in the signal handler, set up the context
>> so that when the signal handler returns, we end up in the target
>> function with the context correctly set up. And, this trampoline code
>> can be generated statically at build time so that there are no
>> security issues using it.
>>
>> Have I understood your suggestion correctly?
> yes.
>
>> So, my argument would be that this would always incur the overhead
>> of a trip to the kernel. I think twice the overhead if I am not mistaken.
>> With trampfd, we can have the kernel generate the code so that there
>> is no performance penalty at all.
> I feel like trampfd is too poorly defined at this point to evaluate.
> There are three general things it could do.  It could generate actual
> code that varies by instance.  It could have static code that does not
> vary.  And it could actually involve a kernel entry.
>
> If it involves a kernel entry, then it's slow.  Maybe this is okay for
> some use cases.

Yes. IMO, it is OK for most cases except where dynamic code
is used specifically for enhancing performance such as interpreters
using JIT code for frequently executed sequences and dynamic
binary translation.

> If it involves only static code, I see no good reason that it should
> be in the kernel.

It does not involve only static code. This is meant for dynamic code.
However, see below.

> If it involves dynamic code, then I think it needs a clearly defined
> use case that actually requires dynamic code.

Fair enough. I will work on this and get back to you. This might take
a little time. So, bear with me.

But I would like to make one point here. There are many applications
and libraries out there that use trampolines. They may all require the
same sort of things:

    - set register context
    - push stuff on stack
    - jump to a target PC

But in each case, the context would be different:

    - only register context
    - only stack context
    - both register and stack contexts
    - different registers
    - different values pushed on the stack
    - different target PCs

If we had to do this purely at user level, each application/library would
need to roll its own solution, the solution has to be implemented for
each supported architecture and maintained. While the code is static
in each separate case, it is dynamic across all of them.

That is, the kernel will generate the code on the fly for each trampoline
instance based on its current context. It will not maintain any static
trampoline code at all.

Basically, it will supply the context to an arch-specific function and say:

    - generate instructions for loading these regs with these values
    - generate instructions to push these values on the stack
    - generate an instruction to jump to this target PC

It will place all of those generated instructions on a page and return the address.

So, even with the static case, there is a lot of value in the kernel providing
this. Plus, it has the framework to handle dynamic code.

>> Also, signals are asynchronous. So, they are vulnerable to race conditions.
>> To prevent other signals from coming in while handling the raised signal,
>> we would need to block and unblock signals. This will cause more
>> overhead.
> If you're worried about raise() racing against signals from out of
> thread, you have bigger problems to deal with.

Agreed. The signal blocking is just one example of problems related
to signals. There are other bigger problems as well. So, let us remove
the signal-based approach from our discussions.

Thanks.

Madhavan

Powered by blists - more mailing lists

Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.