Message-ID: <20220502211856.GR7074@brightrain.aerifal.cx>
Date: Mon, 2 May 2022 17:18:56 -0400
From: Rich Felker <dalias@...c.org>
To: Alexey Izbyshev <izbyshev@...ras.ru>
Cc: musl@...ts.openwall.com
Subject: Re: vfork()-based posix_spawn() has more failure modes than fork()-based one

On Mon, May 02, 2022 at 10:26:36PM +0300, Alexey Izbyshev wrote:
> Hi,
>
> I was recently made aware via [1] that vfork() can have more failure
> modes than fork() on Linux. The only case I know about is due to
> Linux not allowing processes in different time namespaces to share
> address space, but probably there are or will be more. An example is
> below (requires Linux >= 5.6).
>
> $ cat test.c
> #include <stdio.h>
> #include <stdlib.h>
> #include <string.h>
> #include <spawn.h>
> #include <sys/wait.h>
> #include <unistd.h>
>
> int main(int argc, char *argv[], char *envp[]) {
>     if (getenv("TEST_FORK")) {
>         pid_t pid = fork();
>         if (pid < 0) {
>             perror("fork");
>             return 127;
>         }
>         if (pid == 0) {
>             execve(argv[1], argv + 1, envp);
>             _exit(127);
>         }
>     } else {
>         int err = posix_spawn(0, argv[1], 0, 0, argv + 1, envp);
>         if (err) {
>             printf("posix_spawn: %s\n", strerror(err));
>             return 127;
>         }
>     }
>     wait(NULL);
>     return 0;
> }
>
> $ musl-gcc test.c
> $ unshare -UrT ./a.out /bin/echo OK
> posix_spawn: Invalid argument
> $ TEST_FORK=1 unshare -UrT ./a.out /bin/echo OK
> OK
>
> A common expectation from applications is that they can use
> posix_spawn() as a drop-in replacement for fork()/exec() (when its
> child-tweaking features are sufficient), but this case breaks the
> expectation. Do you think it would make sense for musl to fallback
> to fork() in case vfork() fails in posix_spawn()?
>
> I've also opened a bug about this in glibc[2]. Maybe libcs could
> coordinate in how they handle this case.
>
> Alexey
>
> [1] https://github.com/python/cpython/issues/91307
> [2] https://sourceware.org/bugzilla/show_bug.cgi?id=29115

I'm trying to understand how this comes to be.
The child should inherit the namespaces of the parent and thus should
not be in a different namespace that precludes spawn. I'm guessing
this is some oddity where unshare doesn't affect the process itself,
only its children? If so, it seems like a bug that it doesn't affect
the process itself after execve (after unshare(1) runs your test
program), but that probably can't be fixed now on the Linux side for
stability reasons. :(

For what it's worth, I feel like the answer here is really that you
can't expect everything (or anything) to work after you've created a
bad or inconsistent process state, which can be done in various ways
like using unshare(2) in certain ways in a multithreaded process,
certain manual uses of clone(2), etc. Apparently unsharing time ns is
one of those things too, and if it behaves the way it seems to, I
don't think you can use it at all without an extra fork (adding -f to
the unshare(1) command line). Otherwise the top-level process in your
"container" and its children will be in different time namespaces,
which is not at all what you would want anyway.

We probably could make posix_spawn retry __clone without CLONE_VM if
it fails with certain errors, as long as those errors are
non-ambiguous about indicating a need for retry. I don't see EINVAL
documented as being possible for any cases that would need to be
treated as errors, but then again it doesn't seem to be documented for
this corner case you found either.

Rich
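
The retry idea in the last paragraph can be illustrated with a rough
application-level sketch. This is not musl's posix_spawn: musl calls
__clone with CLONE_VM and CLONE_VFORK internally, whereas the sketch
below uses plain vfork() as a stand-in and falls back to fork(). The
names spawn_with_fallback and should_retry_with_fork are hypothetical,
and treating EINVAL as the only retry trigger is an assumption taken
from the case reported in this thread, not documented kernel behavior.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical policy: which vfork() errors justify a fork() retry.
 * EINVAL is assumed here only because it is the error seen in the
 * time-namespace case above; a real libc would need to be sure the
 * error cannot also signal a genuine failure. */
int should_retry_with_fork(int err)
{
    return err == EINVAL;
}

/* Hypothetical helper: start path with argv/envp, preferring vfork()
 * and retrying with fork() when the shared-address-space variant is
 * refused. Returns 0 and stores the child pid in *out on success, or
 * an errno value if no child could be created. An exec failure in the
 * child is only visible as exit status 127; it is not reported back
 * to the caller the way posix_spawn reports it. */
int spawn_with_fallback(pid_t *out, const char *path,
                        char *const argv[], char *const envp[])
{
    assert(out && path && argv);

    pid_t pid = vfork();
    if (pid < 0 && should_retry_with_fork(errno))
        pid = fork();           /* no shared address space needed */
    if (pid < 0)
        return errno;

    if (pid == 0) {
        /* Child: only exec or _exit, as vfork() requires. */
        execve(path, argv, envp);
        _exit(127);
    }

    *out = pid;
    return 0;
}
```

A caller would use it like posix_spawn in the test program above:
check the return value for an error number, then wait for the pid
stored through the first argument.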