Message-ID: <ZlhoTMKsjk79zT3w@voyager>
Date: Thu, 30 May 2024 13:51:40 +0200
From: Markus Wichmann <nullplan@....net>
To: musl@...ts.openwall.com
Cc: Pablo Correa Gomez <pabloyoyoista@...tmarketos.org>
Subject: Re: Crash in kill(..., SIGHUP) when using SA_ONSTACK

On Thu, May 30, 2024 at 12:17:59PM +0200, Pablo Correa Gomez wrote:
> On Wed, 29-05-2024 at 09:15 -0400, Rich Felker wrote:
> > On Wed, May 29, 2024 at 02:04:25PM +0200, Pablo Correa Gomez wrote:
> > > Thread 1 "unix" received signal SIGSEGV, Segmentation fault.
> > > 0x00007ffff7fa96e8 in __syscall2 (a2=1, a1=17483, n=62) at
> > > ../arch/x86_64/syscall_arch.h:21
> (gdb) layout asm
>
>   0x7ffff7fa96f9 <kill+7>    movslq %esi,%rsi
>   0x7ffff7fa96fc <kill+10>   mov    $0x3e,%eax
>   0x7ffff7fa9701 <kill+15>   syscall
>  >0x7ffff7fa9703 <kill+17>   mov    %rax,%rdi
> [...]
> Does this tell you anything?
>

It tells me that Rich's reasoning was correct. I'll explain further
down.

> > I'm not sure if the crashing code is running on the signal stack or
> > main stack, but here's a thought: is it possible the CI machines are
> > running on a cpu/kernel with some monster AVX512 or whatever
> > extension
> > enabled with register file that doesn't fit in MINSIGSTKSZ?
>
> That might be the case. Would explain why I could not reproduce in my
> 9-year old laptop I was running last month, but I can reproduce it now
> in a new machine with a 13th Gen Intel(R) Core(TM) i7-1360P
>

That is exactly what the program is doing, according to the link you
provided in the OP.

> > It's also possible that the kernel may have some weird behavior
> > deciding if the task is already "running on the alt stack" when the
> > alt stack is embedded in the normal stack like this. Just getting rid
> > of that might be worth trying. If so, whether the problem manifests
> > could be subject to timing of signal delivery (although I would not
> > expect that for synchronously generated signals like here).
> >

Thankfully, we needn't speculate, as Linux is open source. The function
get_sigframe() will determine if the thread is currently executing on
the signal stack. It does that by determining that the sp is between
stack base and stack top. If that isn't the case, it will allocate a red
zone, else it will start at the top of the altstack. It will then try to
allocate a full frame. If that doesn't work (because it already was on
an altstack that got overflowed, or it tried to enter too small of an
altstack), then it will generate a message "overflowed sigaltstack",
that you might find in dmesg, before returning a bogus address.

Due to the bogus address, all calls to unsafe_put_user() in
x64_setup_rt_frame() will fail, and it will return EFAULT. This error
will be reported to signal_setup_done() and it will call
force_sigsegv(), which then reports a SIGSEGV at the "current" IP. Since
this all happens during a syscall, the current IP is the one directly
following the syscall instruction.

> > Rich
>

Ciao,
Markus
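
To make the overflow path concrete, here is a rough userspace model of
the bounds check described above. It is only a sketch, not the kernel's
code: it assumes the handler was installed with SA_ONSTACK, and the
struct altstack type, model_get_sigframe(), BOGUS_FRAME and the 16-byte
alignment step are names and simplifications made up for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model types and names; none of this is kernel source. */
struct altstack {
    uintptr_t base; /* lowest address of the alternate stack (ss_sp) */
    size_t size;    /* its size in bytes (ss_size) */
};

#define BOGUS_FRAME ((uintptr_t)-1) /* stands in for the bogus address */
#define RED_ZONE 128u               /* x86-64 red zone below the sp */

/* True if sp lies within (base, base + size]. */
static bool on_altstack(const struct altstack *ss, uintptr_t sp)
{
    return sp > ss->base && sp - ss->base <= ss->size;
}

/*
 * Model of where the signal frame would be placed, assuming an
 * SA_ONSTACK handler. Returns BOGUS_FRAME if the frame does not fit.
 */
static uintptr_t model_get_sigframe(const struct altstack *ss,
                                    uintptr_t sp, size_t frame_size)
{
    if (on_altstack(ss, sp))
        sp -= RED_ZONE;           /* already there: continue below sp */
    else
        sp = ss->base + ss->size; /* switch to the top of the altstack */

    sp -= frame_size;             /* room for the full signal frame */
    sp &= ~(uintptr_t)15;         /* simplified alignment: round down */

    /* The frame must still start inside the alternate stack. */
    if (!on_altstack(ss, sp))
        return BOGUS_FRAME;
    return sp;
}
```

With a 2048-byte alternate stack at 0x10000, a 1024-byte frame requested
from the main stack lands at 0x10400, while a 4096-byte frame (think of
a large extended register state) yields BOGUS_FRAME. In the real kernel
that is the case that ends in the forced SIGSEGV.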