libc-coord - Re: Allocating for execve and related functions

Follow @Openwall on Twitter for new release announcements and other news

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <8734nxvld2.fsf@oldenburg.str.redhat.com>
Date: Thu, 25 Jul 2024 12:04:09 +0200
From: Florian Weimer <fweimer@...hat.com>
To: Rich Felker <dalias@...c.org>
Cc: libc-coord@...ts.openwall.com
Subject: Re: Allocating for execve and related functions

* Rich Felker:

>> Thread-safe environment access may require a copy of the environment
>> vector.
>
> I don't think this is a reasonable motivation. The environment
> fundamentally cannot be made thread-safe to modify. The interfaces
> don't admit doing that. And I don't think there's any reasonable way
> you could make exec* obtain a lock to copy it while still being
> AS-safe. At the very least you'd have to make all accesses to the
> environment block and unblock signals to make the lock AS-safe, which
> would be prohibitively slow for many real-world uses.

I think I have found a solution with just a global counter.  We can
tolerate one pointer write (in the usual array-based
setenv/getenv/unsetenv implementation) due to the way the array updates
are carried out.

(The glibc implementation never frees strings allocated by setenv, which
certainly helps here.  It's not something we can change at this point.)

>> The allocation needs to be performed in an async-signal-safe fashion,
>> but that isn't the main problem.  In a vfork scenario, the allocation
>> happens in the original process, and if execve is successful, any
>> allocation leaks.
>> 
>> Has anyone found a way to work around this?  A single per-thread buffer
>> again runs into signal safety issues.  Maybe a stack of buffers, and
>> cleanup code in vfork for anything allocated in the new process?
>
> If this needs to be supported, I think what you can do is have the
> vfork asm tail-call in the parent to a cleanup function that inspects
> TLS for a pointer to an allocation made by mmap in the child and
> unmaps it if present.

Yes, I considered that as well.  And as a last resort, freeing at thread
exit should help, too.  It's not as prompt, but it avoids the need for
changing assembler vfork implementations immediately.

> I don't see any need for "stack of buffers".  There's at most one
> block of data that needs to be freed: one containing everything that
> had to be marshalled into the SYS_execve or SYS_execveat
> syscall. Anything else allocated admits an opportunity to free it
> before the chile ceases to exist.

I'm worried about what happens if a signal arrives while we are in
execve and already processing a buffer.  For the environment snapshot,
we can actually write to the same buffer because what we write will be
the same or at least compatible with interupted execve.  But that means
we can't deallocate previous buffers which are too small, so we still
need a chain of buffers to free.

Thanks,
Florian

Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.