Message-ID: <20240724183708.GF10433@brightrain.aerifal.cx>
Date: Wed, 24 Jul 2024 14:37:09 -0400
From: Rich Felker <dalias@...c.org>
To: libc-coord@...ts.openwall.com
Subject: Re: Allocating for execve and related functions

On Mon, Jul 22, 2024 at 08:37:04AM +0200, Florian Weimer wrote:
> In some cases, it is necessary to allocate before making an execve
> system call. In execvp and similar functions, space for constructing
> the pathname is needed.

Assuming existence of a PATH_MAX, the constructed path can be assumed
to fit in automatic storage and doesn't require allocation. If an
implementation chooses not to have a PATH_MAX, that's a fun way of
shooting oneself in the foot in many many places... but thankfully it
looks like there's a solution anyway (see below).

> For execl, the argument vector needs to be

The argument vector is just pointers, and these pointers were already
passed to execl on the stack, so the storage for them is at least *of
the same order* as the size the caller has already assumed the stack
to be (roughly 2x). I think this makes it fairly reasonable to
construct on-stack as a VLA, crashing with stack overflow if it
doesn't fit (since the execl call itself would already have crashed
similarly from passing too many args; you're just changing the
threshold within the same order of magnitude).

In the real world, you don't call execl with hundreds of arguments;
Translation Limits probably don't even consider that valid. You use
one of the execv forms if you need a large or variable-length argument
vector; execl is for small, fixed numbers of args.

> built. Some functions have fallback to the shell for missing script
> interpreters, which also requires copying the argument vector.

This is the one case where allocation really is needed, I think. The
existing argument vector is not on the incoming stack and can't be
assumed to be tiny.

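To make the copy for the shell fallback concrete, here is a minimal
sketch. It is not taken from any libc: build_shell_argv, its cap
parameter, and the E2BIG failure mode are all made up for
illustration. It builds { shell, file, argv[1], ..., NULL } in
caller-provided storage and refuses if the result does not fit, which
leaves open where that storage comes from.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical helper for illustration; not taken from any libc.
 * Builds { shell, file, argv[1], ..., NULL } in out[0..cap-1].
 * Returns the number of entries written, not counting the
 * terminating NULL, or -1 with errno set to E2BIG (an assumed
 * choice of error) if the result does not fit in cap entries. */
static int build_shell_argv(const char *shell, const char *file,
                            char *const argv[], char **out, size_t cap)
{
    size_t argc = 0;
    while (argv[argc]) argc++;

    /* argv[0] is replaced by the script path, so only argv[1..] carry over. */
    size_t extra = argc > 1 ? argc - 1 : 0;
    size_t need = 2 + extra + 1;   /* shell + file + extra + NULL */
    if (need > cap) {
        errno = E2BIG;
        return -1;
    }

    out[0] = (char *)shell;
    out[1] = (char *)file;
    for (size_t i = 0; i < extra; i++)
        out[2 + i] = argv[1 + i];
    out[2 + extra] = NULL;
    return (int)(2 + extra);
}
```

A caller would then pass out[0] and out to execve along with the
environment. The size of out scales with the caller's argv rather
than with anything on the incoming stack, which is why this case
differs from execl.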
If you have a low ARG_MAX (no contract to accept or refuse larger
ones), you could potentially assume ARG_MAX/2*sizeof(void*) fits on
stack and fail if argc exceeds ARG_MAX/2, but even that is quite large
for stack.

> Thread-safe environment access may require a copy of the environment
> vector.

I don't think this is a reasonable motivation. The environment
fundamentally cannot be made thread-safe to modify. The interfaces
don't admit doing that. And I don't think there's any reasonable way
you could make exec* obtain a lock to copy it while still being
AS-safe. At the very least you'd have to make all accesses to the
environment block and unblock signals to make the lock AS-safe, which
would be prohibitively slow for many real-world uses.

> The allocation needs to be performed in an async-signal-safe fashion,
> but that isn't the main problem. In a vfork scenario, the allocation
> happens in the original process, and if execve is successful, any
> allocation leaks.
>
> Has anyone found a way to work around this? A single per-thread buffer
> again runs into signal safety issues. Maybe a stack of buffers, and
> cleanup code in vfork for anything allocated in the new process?

If this needs to be supported, I think what you can do is have the
vfork asm tail-call in the parent to a cleanup function that inspects
TLS for a pointer to an allocation made by mmap in the child and
unmaps it if present.

I don't see any need for "stack of buffers". There's at most one block
of data that needs to be freed: one containing everything that had to
be marshalled into the SYS_execve or SYS_execveat syscall. Anything
else allocated admits an opportunity to free it before the child
ceases to exist.

Rich
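A rough sketch of the bookkeeping behind the vfork cleanup idea, under
stated assumptions: the names exec_scratch, exec_scratch_alloc and
exec_scratch_cleanup are hypothetical, and the sketch relies on the
vfork child sharing the parent's address space and thread pointer, so
that a TLS store made in the child is visible to the parent once it
resumes. The vfork asm itself and the tail-call into the cleanup
function are not shown, and signal handlers interrupting between the
mmap and the TLS store are not addressed.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical per-thread record of the single scratch mapping that
 * holds everything marshalled for the exec syscall. */
static _Thread_local void *exec_scratch;
static _Thread_local size_t exec_scratch_len;

/* Would run in the vfork child before the exec syscall.
 * Returns the mapping, or NULL if mmap fails. */
static void *exec_scratch_alloc(size_t len)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;
    exec_scratch = p;
    exec_scratch_len = len;
    return p;
}

/* Would be reached in the parent when vfork returns.
 * Returns 1 if a leftover mapping was unmapped, 0 if there was none. */
static int exec_scratch_cleanup(void)
{
    if (!exec_scratch)
        return 0;
    munmap(exec_scratch, exec_scratch_len);
    exec_scratch = NULL;
    exec_scratch_len = 0;
    return 1;
}
```

If the exec fails, the child is still running and can call the same
cleanup function itself before exiting, so only the success path
depends on the parent-side hook.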