Message-ID: <20180315145247.GE1436@brightrain.aerifal.cx>
Date: Thu, 15 Mar 2018 10:52:47 -0400
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: Re: Musl incompatibility with Docker and AWS's C5 class

On Thu, Mar 15, 2018 at 09:37:28AM -0400, Ryan Wilson-Perkin wrote:
> Hey musl-devs,
>
> Yesterday we tested out the new C5 instance class that AWS offers using our
> Alpine-based images and discovered that we would get a segfault whenever we
> ran `npm install`. Tracing the code, it appeared to be happening due to the
> use of node's "process.setuid" and "process.setgid" commands, either of
> which would cause a segfault.
>
> We're running Alpine containers inside Docker on EC2, and the smallest
> thing I can provide to reproduce this issue would be to run the following
> on a C5 EC2 instance:
>
> docker run -it node:9-alpine sh -c "node -e 'process.setgid(0)'"
>
> A core dump provided the following limited information:
>
>
> Program terminated with signal SIGSEGV, Segmentation fault.
> warning: Unexpected size of section `.reg-xstate/26' in core file.
> #0 __cp_end () at src/thread/x86_64/syscall_cp.s:29
> 29 src/thread/x86_64/syscall_cp.s: No such file or directory.
> [Current thread is 1 (LWP 26)]
> (gdb) bt
> #0 __cp_end () at src/thread/x86_64/syscall_cp.s:29
> #1 0x00007fd6161eecd8 in __syscall_cp_c (nr=202, u=<optimized out>,
> v=<optimized out>, w=<optimized out>, x=<optimized out>, y=<optimized out>,
> z=0) at src/thread/pthread_cancel.c:35
> #2 0x00007fd6161ee2f5 in __timedwait_cp (addr=addr@...ry=0x5612e9ebf820,
> val=val@...ry=-1, clk=clk@...ry=0, at=at@...ry=0x0,
> priv=<optimized out>) at src/thread/__timedwait.c:31
> #3 0x00007fd6161f0e2c in sem_timedwait (sem=0x5612e9ebf820, at=0x0) at
> src/thread/sem_timedwait.c:23
> #4 0x00007fd615d7a5a4 in uv_sem_wait () from /usr/lib/libuv.so.1
> #5 0x00005612e94dc00c in node::DebugSignalThreadMain(void*) ()
> #6 0x00007fd6161ef665 in start (p=0x7fd616424ab0) at
> src/thread/pthread_create.c:145
> #7 0x00007fd6161f13e4 in __clone () at src/thread/x86_64/clone.s:21
> Backtrace stopped: frame did not save the PC

Changing uids/gids in a multithreaded process involves synchronizing
all the threads with a signal. Based on the information, my guess is
that the stack for at least one thread is barely large enough, and
when the signal arrives, creation of the signal frame (in the kernel)
overflows the stack and the kernel generates SIGSEGV for the process.

One approach to test if this is the case and mitigate it: LD_PRELOAD a
library that calls pthread_setattr_default_np from a constructor to
set a larger default thread stack size.

If that turns out to be the problem, the Alpine node package should
probably be patched to increase the stack size. We may also be
increasing the default in musl somewhat (from 80k to 128k or so) in
the near future; if so it would likely be enough to solve your problem
here.

Rich
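
A minimal sketch of the LD_PRELOAD test suggested above, under some
assumptions: the 256k target, the file name stackfix.c, and the helper
names raise_default_stack_size and stackfix_init are all hypothetical,
chosen only for illustration. It reads the current default thread
attributes and raises the stack size only if it is below the target,
so it should never shrink a default that is already larger.

```c
#define _GNU_SOURCE
#include <pthread.h>
#include <stddef.h>

/* Hypothetical target; the right value depends on the workload. */
#define WANTED_STACK_SIZE ((size_t)256 * 1024)

/* Raise the process-wide default thread stack size to at least `wanted`.
 * Returns 0 on success or an errno-style error code. */
static int raise_default_stack_size(size_t wanted)
{
    pthread_attr_t attr;
    size_t current = 0;
    int rc;

    /* Fetch the current default attributes (stack and guard size). */
    rc = pthread_getattr_default_np(&attr);
    if (rc != 0)
        return rc;

    rc = pthread_attr_getstacksize(&attr, &current);
    if (rc == 0 && current < wanted) {
        rc = pthread_attr_setstacksize(&attr, wanted);
        if (rc == 0)
            rc = pthread_setattr_default_np(&attr);
    }

    pthread_attr_destroy(&attr);
    return rc;
}

/* Runs when the preloaded library is loaded, before the program's main(). */
__attribute__((constructor))
static void stackfix_init(void)
{
    (void)raise_default_stack_size(WANTED_STACK_SIZE);
}
```

Assuming the file is saved as stackfix.c, it could be built inside the
container with something like `cc -shared -fPIC -o libstackfix.so
stackfix.c` and tried with `LD_PRELOAD=/path/to/libstackfix.so node -e
'process.setgid(0)'`. If the segfault goes away, that supports the
stack-overflow guess; if the crashing thread is created with an
explicit stack size, changing the default may have no effect on it.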