libc-coord - Re: getrandom via vDSO

Message-ID: <fdce824f-1c82-4d67-82a8-fae543cbc6ae@linaro.org>
Date: Thu, 26 Sep 2024 10:09:07 -0300
From: Adhemerval Zanella Netto <adhemerval.zanella@...aro.org>
To: Schrodinger ZHU Yifan <i@...yi.fan>, libc-coord@...ts.openwall.com
Subject: Re: getrandom via vDSO

For BSD I don't think so, since all implementations I am aware of are based on the
original OpenBSD one, which uses a global lock to protect the global state in
multithreaded programs.

The glibc one is based on getrandom(), but has some fallbacks that read /dev/random
or /dev/urandom if the syscall is not available (old kernels).  It is
async-signal-safe in the former case, but the fallback opens and closes a file
descriptor, and async-signal-safety is tricky in that case.
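
As a rough illustration of the shape described above (not the glibc code), here is
a minimal sketch of a getrandom()-first routine that falls back to /dev/urandom
when the syscall reports ENOSYS.  The name fill_random is hypothetical, and the
sketch makes no attempt to solve the async-signal-safety issues of the fallback
path (it clobbers errno and briefly consumes a file descriptor).

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/random.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: fill buf with len random bytes.
   Returns 0 on success and -1 on failure. */
static int fill_random(void *buf, size_t len)
{
    unsigned char *p = buf;
    size_t done = 0;

    /* Preferred path: the getrandom() syscall wrapper. */
    while (done < len) {
        ssize_t n = getrandom(p + done, len - done, 0);
        if (n > 0) {
            done += (size_t)n;
            continue;
        }
        if (n < 0 && errno == EINTR)
            continue;               /* interrupted by a signal, retry */
        if (n < 0 && errno == ENOSYS)
            break;                  /* old kernel: use the fallback below */
        return -1;
    }
    if (done == len)
        return 0;

    /* Fallback path: this is the part that opens and closes a descriptor. */
    int fd = open("/dev/urandom", O_RDONLY | O_CLOEXEC);
    if (fd < 0)
        return -1;
    while (done < len) {
        ssize_t n = read(fd, p + done, len - done);
        if (n > 0) {
            done += (size_t)n;
            continue;
        }
        if (n < 0 && errno == EINTR)
            continue;
        close(fd);
        return -1;
    }
    close(fd);
    return 0;
}
```

A caller would use it as `unsigned char key[32]; if (fill_random(key, sizeof key) != 0) abort();`.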

There is also another minor point, which I think is an implementation detail,
but it might be something to consider for llvm-libc: in older glibc versions
we used xor-encoding with some bytes for AT_SECURE on all internal function pointers
(such as the vDSO ones).  We changed this by moving them to a RELRO section, so
initialization is done at program startup.  This is another thing you cannot
easily implement with an overlay design.
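
To make the xor-encoding idea concrete, here is a deliberately simplified sketch.
All names (pointer_guard, mangle, demangle, stub_getrandom) are hypothetical, the
guard would normally be set once at startup from random bytes, and this is not the
actual glibc macro.

```c
#include <stdint.h>

/* Hypothetical signature for a cached vDSO-style entry point. */
typedef long (*getrandom_fn)(void *buf, unsigned long len, unsigned int flags);

/* Hypothetical per-process secret, assumed to be initialized at startup. */
static uintptr_t pointer_guard;

/* Store the pointer xor-ed with the guard so the raw address is not
   sitting in writable memory. */
static uintptr_t mangle(getrandom_fn fn)
{
    return (uintptr_t)fn ^ pointer_guard;
}

/* Recover the callable pointer just before use. */
static getrandom_fn demangle(uintptr_t stored)
{
    return (getrandom_fn)(stored ^ pointer_guard);
}

/* Stand-in for a real entry point, only here to exercise the helpers. */
static long stub_getrandom(void *buf, unsigned long len, unsigned int flags)
{
    (void)buf;
    (void)flags;
    return (long)len;
}
```

With pointers kept in RELRO instead, the table is written once during startup and
then becomes read-only, so no encoding step is needed on each call.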


On 25/09/24 16:17, Schrodinger ZHU Yifan wrote:
> Does BSD/glibc promise the async-signal-safety on their arc4 implementations? 
> 
> 
> On Wed, Sep 25, 2024 at 13:06, Adhemerval Zanella Netto <adhemerval.zanella@...aro.org> wrote:
>>
>>
>> On 23/09/24 18:18, Schrodinger ZHU Yifan wrote:
>> > Hi,
>> >
>> > It's Yifan, a regular committer to the LLVM-libc project.
>> >
>> > We have recently merged the vDSO functionality into our codebase, so I am checking existing syscall wrappers to apply vDSO where the related symbols are available.
>> >
>> > __vdso_getrandom/__kernel_getrandom has recently been introduced into the Linux kernel, and I wonder what a proper plan to use it would be. The additional userspace state structure looks a bit scary to me. It seems to me that the state requires proper maintenance across fork and clone, and has potential security implications. This reminds me of the history of pid caching, so I am hesitating on whether I should go ahead and implement the getrandom optimization when the vDSO symbols are available. (P.S. this time, the kernel provides flags that can help us keep the region safe to some extent: https://lore.kernel.org/linux-mm/20240703183115.1075219-2-Jason@zx2c4.com/T/)
>> >
>> > Given that glibc is already working on a patch, I wonder if we have consensus on the following questions:
>>
>> Keep in mind that most of the proposed glibc [1] implementation arises from the
>> implicit requirement to keep getrandom() async-signal-safe and to handle fork.
>> This is why we used an mmap allocator for the state tracker itself (since glibc
>> malloc is not async-signal-safe).
>>
>> Both the vdso selftest [2] and a proposed Go implementation [3] use a much simpler
>> allocation scheme.
>>
>> If async-signal-safety and fork handling are really required (as I saw in your
>> llvm-libc proposal [4]), you need to take care of some cases:
>>
>> 1. You need to handle function reentrancy (for the SA_NODEFER case). The glibc
>> proposal handles it by 'grabbing' the per-thread allocated buffer into a local
>> variable and setting a sentinel value. A reentrant call will then check
>> the per-thread buffer and fall back to the syscall if it is already 'taken'.
>>
>> 2. Even though the state allocation uses mmap and has the reentrancy handling
>> above, for glibc we still need to handle a concurrent _Fork() (where one
>> thread calls it while another thread is allocating a state). Since _Fork()
>> is supposed to be async-signal-safe, taking a lock is tricky, and getrandom
>> is also expected to work in the child process.
>> We handle it with the heavy hammer of blocking all signals (including internal
>> ones). We block even internal signals to avoid further issues, although I
>> think we could block only user-visible ones.
>>
>> 3. You can use either a maximum number of states or grow the state list as we
>> did. However, you need to keep in mind that reallocating the state list
>> in place (with mremap) is not fully fork-safe: if fork()/_Fork() is called
>> concurrently just before the buffer returned by mremap is written to the global
>> allocator, the program will see an inconsistent state.
>> We handle it by reallocating the buffer with a new mmap call, writing the
>> state atomically, and unmapping the old state tracker.
>>
>> 4. The thread-exit state release must be done with signals blocked, to avoid
>> creating a new free-state block during thread release. The glibc pthread
>> implementation does this to fix a potential race condition in pthread_kill [5].
>> You cannot simply implement it with __cxa_thread_atexit_impl, since afaik
>> POSIX states that signals should not be blocked when the pthread implementation
>> calls it.
>>
>> I think some of the requirements might be hard to implement with an overlay
>> libc design such as llvm-libc (especially item 4).
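
The 'grab and sentinel' scheme from item 1 can be sketched roughly as follows.
This is only an illustration of the idea, not the glibc patch: struct rng_state,
STATE_TAKEN, the two backend_* stand-ins, the counters, and the simulated_signal
hook are all hypothetical, and state allocation is left out (a thread without a
state simply uses the syscall path here).

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical placeholder for the opaque per-thread vDSO state. */
struct rng_state { unsigned char opaque[64]; };

/* Sentinel meaning "this thread's state is in use by an outer call". */
#define STATE_TAKEN ((struct rng_state *)(uintptr_t)1)

static _Thread_local struct rng_state *tls_state;

/* Counters and hook used only to observe which path was taken. */
static int vdso_calls, syscall_calls;
static void (*simulated_signal)(void);

/* Stand-in for the vDSO call; optionally runs a "signal handler" midway. */
static long backend_vdso(struct rng_state *st, void *buf, size_t len)
{
    (void)st;
    (void)buf;
    vdso_calls++;
    if (simulated_signal) {
        void (*handler)(void) = simulated_signal;
        simulated_signal = NULL;
        handler();
    }
    return (long)len;
}

/* Stand-in for the plain getrandom syscall. */
static long backend_syscall(void *buf, size_t len)
{
    (void)buf;
    syscall_calls++;
    return (long)len;
}

static long sketch_getrandom(void *buf, size_t len)
{
    /* Grab the per-thread state into a local variable. */
    struct rng_state *st = tls_state;
    if (st == NULL || st == STATE_TAKEN)
        return backend_syscall(buf, len);   /* no state, or reentrant call */

    /* Mark the state as taken before touching it; the fences keep the
       compiler from moving the state use across the marker stores. */
    tls_state = STATE_TAKEN;
    atomic_signal_fence(memory_order_seq_cst);
    long r = backend_vdso(st, buf, len);
    atomic_signal_fence(memory_order_seq_cst);
    tls_state = st;                         /* release the state */
    return r;
}

/* Plays the role of a signal handler that itself calls getrandom. */
static void reentrant_caller(void)
{
    unsigned char tmp[8];
    sketch_getrandom(tmp, sizeof tmp);
}
```

If reentrant_caller runs while the outer call holds the state, the inner call
sees STATE_TAKEN and takes the syscall path, so the opaque state is never used by
two nested calls on the same thread.
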
>>
>> [1] https://patchwork.sourceware.org/project/glibc/patch/20240918140500.26365-2-Jason@zx2c4.com/
>> [2] https://github.com/torvalds/linux/blob/master/tools/testing/selftests/vDSO/vdso_test_getrandom.c
>> [3] https://go-review.googlesource.com/c/go/+/614835
>> [4] https://github.com/llvm/llvm-project/pull/109870
>> [5] https://sourceware.org/bugzilla/show_bug.cgi?id=12889
>>
>> >
>> > 1. Is there a clear performance demand such that it becomes necessary to avoid the syscall inside getrandom?
>> > 2. Is it certain that such performance demand will pay off given all the additional complexity required to maintain per-thread/per-process state (with async-signal-safety)?
>> > 3. Is it clear that unwrapped clone/fork and other edge cases will not complicate the situation?
>> >
>>
>> Besides Florian's points, the performance difference is really large for small
>> buffers (like uint32_t or uint64_t), where syscall overhead dominates. Also,
>> vgetrandom provides the same security guarantees as the syscall,
>> so it can be used as a CSRNG (Jason can give you more details).
>>
>> So you need to balance the runtime requirements against where you deploy. Besides
>> the code complexity of state and opaque-state management, the vDSO requires more
>> memory than the syscall (although VM_DROPPABLE is optimized for memory-pressure
>> situations, it still requires page table entries).
>>
>> >
>> > Best,
>> > Yifan
>> >
>> > Schrodinger ZHU Yifan, Ph.D. Student
>> > Computer Science Department, University of Rochester
>> >
>> > *Personal Email:* i@...yi.fan
>> > *Work Email:* yifanzhu@...hester.edu
>> > *Website:* https://www.cs.rochester.edu/~yzhu104/Main.html
>> > *Github:* SchrodingerZhu
>> > *GPG Fingerprint:* BA02CBEB8CB5D8181E9368304D2CC545A78DBCC3
>>