Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAHy7VY4AkPFCgqjCmMrt7p5vpoC9V-Sw10ZnbZrbQG2umz1qjw@mail.gmail.com>
Date: Fri, 20 Oct 2017 11:02:02 +0200
From: Petr Skocik <pskocik@...il.com>
To: musl@...ts.openwall.com
Subject: Re: posix_spawn topics [was: Re: musl 1.1.17 released]

Thanks for the feedback. I was using it modified like this because  I need
the enhanced adddup2 semantics since I want to do all my opens in the
parent for more fine-grained error reporting.  So I only really care about
clearing the FD_CLOEXEC flag in the FDOP_DUP2 case, and so I'm glad you're
willing to accept the semantics change. (Attached is perhaps a mergeable
patch.)

On Fri, Oct 20, 2017 at 3:17 AM, Rich Felker <dalias@...c.org> wrote:

> On Fri, Oct 20, 2017 at 02:11:35AM +0200, Petr Skocik wrote:
> > Hi.
> > Since posix_spawn is being discussed, I'd like to report a what I think
> is
> > a bug in the current implementation:
> > Since the child is unmasking its signals only after it's performed its
> file
> > actions, the process may get blocked indefinitely (e.g., on a FIFO
> > open) while being unresponsive to signals.
> >
> > Example program (with close to no error checking):
> >
> >
> > #define _GNU_SOURCE
> > #include <unistd.h>
> > #include <spawn.h>
> > #include <stdlib.h>
> > #include <fcntl.h>
> > #include <sys/types.h>
> > #include <sys/stat.h>
> > #include <sys/wait.h>
> > int main()
> > {
> > pid_t pid;
> > mkfifo("fifo", 0640);
> > posix_spawn_file_actions_t fa;
> > posix_spawn_file_actions_init(&fa);
> > posix_spawn_file_actions_addopen(&fa, 0, "fifo", O_RDONLY, 0);
> > posix_spawnp(&pid, "tr", &fa, 0, (char *const[]){ "tr", "a-z", "A-Z", 0},
> > environ);
> >
> >         //will get stuck here
> >
> > }
> >
> >
> > It  think the pthread_mask call should be moved above the file actions,
> > which fixes this problem.
>
> It's actually rather ugly that this happens at all. Your "fix" is
> presumably that ^C now works, which of course helps. But short of
> terminals signals there's no legitimate way to kill the hung process
> because its pid has not been returned yet.
>
> I'm not really sure what should happen here; it may call for an
> interpretation request.
>
> > Additionally, as far as extensions to the current POSIX standard are
> > concerned, adding the herein (http://austingroupbugs.net/view.php?id=411
> )
> > proposed change to *adddup2 semantics (clearing the FD_CLOEXEC flag on
> the
> >  (given target in dup2 file actions where the source and target
> > filedescriptor are identical) would be super nice (the reasoning for it
> > should be in the link).
>
> I wasn't aware of that part of #411; I agree it should be adopted now
> without waiting for a new standard.
>
> > Finally, I noticed the error passing code can be reduced to a simple
> write
> > to a volatile int made from the child (purely cosmetic).
>
> The pipe is there for a very important reason, and the reason is not
> passing the error. It just happens that, since we already have to read
> from it in the parent anyway, it makes a convenient channel to pass
> the error on. Your modified version has serious memory corruption
> because the parent continues running (and may return from posix_spawn)
> after args->ret is set but before the child exits, thereby ending the
> lifetime of the child's stack. Atomicity of closing of the
> close-on-exec pipe with respect to exec is the property we're using
> here to detect when exec has taken effect and memory is no longer
> shared.
>
> An argument can be made that CLONE_VFORK mandates the schedudler
> prevent this, but historically it did not always do this correctly,
> especially when traced (and I believe it's still possible when tracing
> to manually resume the parent while the child is still sharing
> memory), and the intent is to be compatible with kernels where
> CLONE_VFORK is a nop. The main reason CLONE_VFORK is used is actually
> as a hint to qemu-userlevel to get it to emulate things right.
>
> > Attached are patches for these 3 things in case you wanted them. (I hope
> > I'm not doing something stupid.)
>
> Otherwise no, not "stupid", but the patches are a lot larger/more
> invasive than they need to be and don't follow coding style.
>
> > @@ -97,14 +108,14 @@ static int child(void *args_vp)
> >                               __syscall(SYS_close, op->fd);
> >                               break;
> >                       case FDOP_DUP2:
> > -                             if ((ret=__sys_dup2(op->srcfd, op->fd))<0)
> > +                             if ((ret=dup3_or_set_cloexec(op->srcfd,
> op->fd, 0))<0)
> >                                       goto fail;
> >                               break;
> >                       case FDOP_OPEN:
> >                               fd = __sys_open(op->path, op->oflag,
> op->mode);
> >                               if ((ret=fd) < 0) goto fail;
> >                               if (fd != op->fd) {
> > -                                     if ((ret=__sys_dup2(fd, op->fd))<0)
> > +                                     if ((ret=dup3_or_set_cloexec(fd,
> op->fd, 0))<0)
>
> This (FDOP_OPEN) is not a case covered by the POSIX interpretation,
> and using it here would actually be wrong if it were reachable, but
> you're already inside "if (fd != op->fd)" and thus there's no way
> fd==op->fd can be true. So the change in this line is just spurious.
> There's then just one place where dup3_or_set_cloexec is called (the
> previous case), and it doesn't make sense to have a function; rather
> the conditional should just be inline in that case.
>
> Rich
>

Content of type "text/html" skipped

View attachment "0001-New-posix_spawn_file_actions_adddup2-semantics.patch" of type "text/x-patch" (1168 bytes)

Powered by blists - more mailing lists

Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.