Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20171230222204.GF1627@brightrain.aerifal.cx>
Date: Sat, 30 Dec 2017 17:22:04 -0500
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: Re: Conformance problem in system()

On Fri, Dec 29, 2017 at 10:48:48AM +0100, Markus Wichmann wrote:
> Hi all,
> 
> I was testing system() while SIGCHLD is ignored just now, and noticed a
> difference between glibc and musl. That in itself is not alarming, but
> it prompted me to look this function up in the standard, to see which
> behavior was conforming better (or maybe both are).
> 
> The program was this:
> 
> #include <signal.h>
> #include <stdio.h>
> #include <stdlib.h>
> 
> int main(void)
> {
>     if (signal(SIGCHLD, SIG_IGN) == SIG_ERR)
>         perror("signal");
>     int i = system("sleep 1");
>     if (i != 0)
>         fprintf(stderr, "system() == %d, %m\n", i);
>     return i;
> }
> 
> Glibc returns -1 from system() while musl returns 32512. errno is the
> same in both cases, ECHILD. Which is also what waitpid() is returning.
> 
> SUSv4 specifies for system()'s return value:
> 
> | [...][I]f the termination status for the command language interpreter
> | cannot be obtained, system() shall return -1 and set errno to indicate
> | the error.
> 
> So, it turns out, glibc is conforming to the standard, and musl is not.
> If waitpid() fails, then the status cannot be retrieved.
> 
> I don't know about other system calls, but I tend to view pointer
> arguments to system calls as undefined after a failed call. The kernel
> might deposit something in those variables, or it might not. In this
> case, it does not, so the status returned is the value status was
> initialized to at the start of system(), which is 0x7f00, the value I
> observed. I think you were hoping to land under the umbrella of
> 
> | If command is not a null pointer, system() shall return the termination
> | status of the command language interpreter in the format specified by
> | waitpid().
> 
> But a failing waitpid() does not fetch the status at all.
> 
> I have attached a patch to correct this behavior, for your perusal.

I think you're right that there's a problem here, but I don't think
the patch correctly or fully fixes it. A simpler version of what
you're doing would be to just initialize status to -1 instead of
0x7f00, since your patch is returning -1 in all cases where waitpid
did not complete successfully. But that ignores the POSIX requirement
to behave as if the interpreter exited with status 127 when it was
possibel to make the child process but the command interpreter could
not be executed.

musl's posix_spawn does not succeed when exec fails in the child;
instead the exec error is returned. This behavior is permitted but not
required by POSIX. I think it would actually be preferable to system
to return -1 and set errno in this case too, but POSIX doesn't seem to
allow that. So I think we actually need to break down error cases for
posix_spawn and simulate the exit(127) result if the error returned is
anything other than an error that could be returned by fork (EAGAIN or
ENOMEM).

Does that sound right?

Rich

Powered by blists - more mailing lists

Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.