Message-ID: <20250117064634.GK10433@brightrain.aerifal.cx>
Date: Fri, 17 Jan 2025 01:46:35 -0500
From: Rich Felker <dalias@...c.org>
To: Askar Safin <safinaskar@...omail.com>
Cc: musl <musl@...ts.openwall.com>
Subject: Re: [bug] Ctrl-Z when process is doing posix_spawn makes the
 process hard to kill

On Fri, Jan 17, 2025 at 01:37:09AM -0500, Rich Felker wrote:
> On Fri, Jan 17, 2025 at 03:14:03AM +0400, Askar Safin wrote:
> > I found a bug both in glibc and musl.
> > 
> > If a process does posix_spawn+waitpid, then attempting to pause it using Ctrl-Z
> > sometimes doesn't work and, worse, makes the process unkillable by usual Ctrl-Z or Ctrl-C.
> > 
> > The bug is described in full in this glibc issue: https://sourceware.org/bugzilla/show_bug.cgi?id=32565 .
> > 
> > It is reproducible with musl on the same system I used to reproduce it with glibc (see the link).
> > 
> > I compiled the code using "x86_64-linux-musl-gcc" wrapper provided by Debian.
> > 
> > Please, CC me when replying.
> 
> OK, I think this should be fixable by, if SIGTSTP is to be SIG_DFL in
> the spawned child, setting it to a no-op handler instead of SIG_DFL.
> It might actually make sense to just do this for all signals.
> 
> Note that SIGSTOP, which is not blockable, interceptible, or ignorable,
> can't be handled this way, but the pid has not yet leaked to anything
> at this point, so the only way SIGSTOP can be generated is by a badly
> behaved program signaling random pids, which is not a case that needs
> to be handled gracefully.
> 
> In theory SIGTTIN and SIGTTOU might be hazards too, but I don't think
> it's possible for a process to generate them without attempting to
> perform I/O, which the pre-exec child doesn't do. Still, handling them
> might be a good safety measure in case I'm wrong.
> 
> I'll prepare one or more versions of a proposed patch.

One complication I'll need to address: the pre-exec child does not
have enough stack to run even a no-op signal handler. So the parent
is going to need to check the runtime-variable minimum signal stack
size and ensure the stack it gives the child is large enough. And the
no-op signal handler will need to be installed to run with all
signals blocked, so that nested signals can't overflow a stack that
only suffices for one signal.
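
To make those two requirements concrete, here is a minimal sketch. It
is not the proposed patch. The names noop_handler, install_noop and
min_sig_stack are hypothetical, and it assumes that
sysconf(_SC_MINSIGSTKSZ) is available on newer libcs, falling back to
the compile-time MINSIGSTKSZ otherwise.

```c
#define _XOPEN_SOURCE 700
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical no-op handler: it exists only so that the signal is
 * "caught" instead of taking its default (stop) action. */
static void noop_handler(int sig)
{
    (void)sig;
}

/* Install noop_handler for sig with every signal blocked while the
 * handler runs, so at most one signal frame is ever on the stack.
 * Returns 0 on success, -1 on failure (as sigaction does). */
static int install_noop(int sig)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = noop_handler;
    sigfillset(&sa.sa_mask);
    return sigaction(sig, &sa, NULL);
}

/* Minimum signal stack size, queried at runtime where the libc
 * supports it (assumption: _SC_MINSIGSTKSZ is defined as a macro),
 * otherwise the compile-time constant. */
static size_t min_sig_stack(void)
{
#ifdef _SC_MINSIGSTKSZ
    long n = sysconf(_SC_MINSIGSTKSZ);
    if (n > 0)
        return (size_t)n;
#endif
    return MINSIGSTKSZ;
}
```

In this sketch the parent would add min_sig_stack() to whatever stack
size it already reserves for the child, and the child would call
install_noop(SIGTSTP) before doing anything else.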

With those changes I think this approach works.

I think applying it to all signals is probably a bad idea in that it
would introduce a lot more syscall cost at spawn time. Just doing the
signals we need (and probably omitting SIGTTIN/SIGTTOU unless there's
good reason to believe they can happen) seems like the smart approach,
so as not to make the fix annoyingly costly to users.
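
A rough sketch of that narrow selection, to show how few sigaction
calls it implies. Everything here is an assumption for illustration:
the names stop_sigs, would_be_default and count_needed are
hypothetical, and the rule "default in the child unless the parent
ignores the signal and it is not in the spawn attributes' sigdefault
set" is my reading of the condition, not code from musl.

```c
#define _XOPEN_SOURCE 700
#include <signal.h>
#include <stddef.h>

/* Signals covered by the narrow fix (assumed list). SIGTTIN and
 * SIGTTOU could be appended if they turn out to be reachable. */
static const int stop_sigs[] = { SIGTSTP };

/* Returns 1 if sig would end up SIG_DFL in the spawned child,
 * 0 if it would stay ignored, -1 on error. sigdefault stands in for
 * the set given via posix_spawnattr_setsigdefault. */
static int would_be_default(int sig, const sigset_t *sigdefault)
{
    struct sigaction cur;
    if (sigismember(sigdefault, sig) == 1)
        return 1;
    if (sigaction(sig, NULL, &cur) != 0)
        return -1;
    /* Caught signals are reset to default by exec; only SIG_IGN
     * survives into the new program. */
    return cur.sa_handler != SIG_IGN;
}

/* Number of extra sigaction calls the narrow fix would need. */
static int count_needed(const sigset_t *sigdefault)
{
    int n = 0;
    for (size_t i = 0; i < sizeof stop_sigs / sizeof *stop_sigs; i++)
        if (would_be_default(stop_sigs[i], sigdefault) == 1)
            n++;
    return n;
}
```

With this list the extra cost is at most one sigaction call per spawn
(three if SIGTTIN and SIGTTOU are added), versus one per signal number
if every signal were covered.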

Rich
