Message-ID: <20200309185536.GI14278@port70.net>
Date: Mon, 9 Mar 2020 19:55:37 +0100
From: Szabolcs Nagy <nsz@...t70.net>
To: Pirmin Walthert <pirmin.walthert@...om.ch>
Cc: musl@...ts.openwall.com
Subject: Re: Re: FYI: some observations when testing next-gen malloc

* Pirmin Walthert <pirmin.walthert@...om.ch> [2020-03-09 19:14:59 +0100]:
> On 09.03.20 at 18:12, Rich Felker wrote:
> > On Mon, Mar 09, 2020 at 05:49:02PM +0100, Pirmin Walthert wrote:
> > > I'd like to mention that I am not yet entirely sure whether the
> > > following is a problem with the new malloc code or with asterisk
> > > itself but maybe you can already keep the following in the back of
> > > your head if someone else is reporting similar behavior with a
> > > different application:
> > >
> > > We use asterisk (16.7) in a musl libc based distribution and for
> > > some operations asterisk forks (in a thread) the main process to
> > > execute a system command. When using libmallocng.so (newest version
> > > with "fix race condition in lock-free path of free" applied, but
> > > already without that change) some of these forked child processes
> > > will hang during a call to pthread_mutex_unlock.
> > >
> > > Unfortunately the backtrace is not of much help I guess, but the
> > > child process always seems to hang on pthread_mutex_unlock. So
> > > something seems to happen with the mutex on fork:
> > >
> > > #0 0x00007f2152a20092 in pthread_mutex_unlock () from
> > > /lib/ld-musl-x86_64.so.1
> > > No symbol table info available.
> > > #1 0x0000000000000008 in ?? ()
> > > No symbol table info available.
> > > #2 0x0000000000000000 in ?? ()
> > > No symbol table info available.
> > >
> > > I will for sure try to dig into this further. For the moment the
> > > only thing I know is that I did not yet observe this on any of the
> > > several hundred systems with musl 1.1.23 (same asterisk version),
> > > not on any of the around 5 with 1.2.0 (same asterisk version, old
> > > malloc) but quite frequently on the two systems with 1.1.24 and
> > > libmallocng.so.
> > This is completely expected and should happen with old or new malloc.
> > I'm surprised you haven't hit it before. After a multithreaded process
> > calls fork, the child inherits a state where locks may be permanently
> > held. See https://pubs.opengroup.org/onlinepubs/9699919799/functions/fork.html
> >
> > - A process shall be created with a single thread. If a
> >   multi-threaded process calls fork(), the new process shall
> >   contain a replica of the calling thread and its entire address
> >   space, possibly including the states of mutexes and other
> >   resources. Consequently, to avoid errors, the child process may
> >   only execute async-signal-safe operations until such time as one
> >   of the exec functions is called.
> >
> > It's not described very rigorously, but effectively it's in an async
> > signal context and can only call functions which are AS-safe.
> >
> > A future version of the standard is expected to drop the requirement
> > that fork itself be async-signal-safe, and may thereby add
> > requirements to synchronize against some or all internal locks so that
> > the child can inherit a working context. But the right solution here is
> > always to stop using fork without exec.
> >
> > Rich
>
> Well, I have now changed the code a bit to make sure that no
> async-signal-unsafe command is being executed before execl. Things I've
> removed:
>
> a call to cap_from_text, cap_set_proc and cap_free has been removed as well
> as sched_setscheduler. Now the only thing being executed before execl in the
> child process is closefrom()

closefrom is not as-safe. i think it reads /proc/self/fd directory
to close fds. (haven't checked the specific asterisk version)
opendir calls malloc so it can deadlock.

>
> However I got a hanging process again:
>
> (gdb) bt full
> #0 0x00007f42f649c6da in __syscall_cp_c () from /lib/ld-musl-x86_64.so.1
> No symbol table info available.
> #1 0x0000000000000000 in ?? ()
> No symbol table info available.
>
> Best regards,
>
> Pirmin
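
To illustrate the point about closefrom, here is a minimal sketch of a fork-then-exec path in which the child calls only functions from the POSIX async-signal-safe list (close, execv, _exit). The helper name run_command, the starting descriptor 3 and the 65536 cap are assumptions made for this example; none of this is taken from asterisk or musl. The descriptor limit is queried in the parent before fork, so the child never needs opendir or malloc.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper for illustration only: runs path with argv in a
   child process and returns its exit status, or -1 on failure. */
int run_command(const char *path, char *const argv[])
{
    /* Ask for the descriptor limit in the parent, before fork. */
    long max_fd = sysconf(_SC_OPEN_MAX);
    if (max_fd < 0 || max_fd > 65536)
        max_fd = 65536; /* arbitrary cap chosen for this sketch */

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child of a possibly multithreaded parent: only
           async-signal-safe calls from here until exec. */
        for (int fd = 3; fd < max_fd; fd++)
            close(fd); /* errors on unused descriptors are ignored */
        execv(path, argv);
        _exit(127); /* exec failed; _exit skips atexit handlers */
    }

    /* Parent: wait for the child, retrying if interrupted by a signal. */
    int status;
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR)
            return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling run_command("/bin/sh", argv) with argv set to {"sh", "-c", "exit 0", NULL} should return 0 on a typical system. The plain close loop trades speed for safety: it issues many redundant close calls and misses descriptors above the cap, but it takes no libc-internal locks in the child.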