Message-ID: <544A0152.4040201@i-soft.com.cn>
Date: Fri, 24 Oct 2014 15:35:46 +0800
From: 黄建忠 <jianzhong.huang@...oft.com.cn>
To: musl@...ts.openwall.com
Subject: Re: musl pthread/tls issue.

Great clue, thanks.

It's a stack overflow issue. The default pthread stack size is 81920
bytes, that is, 80k. I increased the stack size to 8M and the bug
disappeared.

I had tried adding locks and making local copies, and had even found
that it was an overflow issue, but I stupidly forgot about the thread
stack size (since the default is sufficient under glibc).

As for webkit, the different codebases of webkitgtk behave differently:

2.4.x runs but reports a RangeError exception.
2.6.x (they call it webkitgtk4) uses the same codebase as ewebkit and
segfaults directly.

I guess it's related to the "fastmalloc" of JavaScriptCore.

On 10/22/14 15:45, Szabolcs Nagy wrote:
> * 黄建忠 <jianzhong.huang@...oft.com.cn> [2014-10-22 14:33:01 +0800]:
>> These days, I finished build a bootable x86_64 system(rpm based) include
>> musl/systemd/dracut/gcc-4.9.1/gcc-5/clang-3.5 and wayland/Xorg and the
>> whole GNOME-3.14 desktop(except webkit js segfault issue I mentioned
>> before) with a lot of patches(I will release all of them someday until
>> it reach a stable state.)
>>
>> After a simple try, I found gnome-shell will segfault If I triggered the
>> app list(not always but often).
>>
>> The dmesg report "pool [<some pid>] segfault xxxxxxxxxxx
>> libpixman-xxxxx", That's to say, it segfault in pixman library(A common
>> library used by Xorg and cairo),
>> gdb report it's a thread issue(a thread of gnome-shell) and segfault at
>> the beginning of general_composite_rect function in pixman-general.c,
>> the pointer of argument can not be accessed.
>>
> that's not enough info..
>
> both the webkit js and this crash sounds like thread stack overflow
>
>> That's to say, there must be a problem exist in musl pthread/tls
>> implementation and can be triggered under certain circumstances. Please
>> help to solve it.
>>
> i don't believe that without evidence: general_composite_rect itself
> allocates >24k on the stack, that is about a third of the musl default
> stack size
>
> you can verify it by checking the diff of the top and bottom of the stack
> (gdb backtrace prints the stack pointer, if the diff is >56k when that
> func was entered then this was the problem) or looking at /proc/pid/maps
> and if the crash happened in a guard page after a thread stack
>
> to fix: make the application create a larger thread stack eg 1M
> (pthread_attr_setstacksize, but gnome* will use gthread most likely
> which has different api)
>

-- 
Huang JianZhong
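
For reference, the fix suggested in the quoted reply (requesting a larger
thread stack with pthread_attr_setstacksize) could look roughly like the
sketch below. It is only an illustration, not code from gnome-shell or
pixman: the names BIG_LOCAL, worker, run_with_stack and default_stack_size
are made up for this example, and the 64 KiB buffer merely stands in for a
function with large locals such as general_composite_rect. Applications that
create threads through gthread would need a different API, as noted in the
reply.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical size of a large local buffer (illustration only). */
#define BIG_LOCAL (64 * 1024)

/* Hypothetical worker: touches a big stack buffer, the kind of usage that
   can overflow a small default thread stack. Stores the last byte it
   wrote into the unsigned long that arg points to. */
static void *worker(void *arg)
{
    volatile unsigned char buf[BIG_LOCAL];
    size_t i;

    for (i = 0; i < sizeof buf; i++)
        buf[i] = (unsigned char)i;
    *(unsigned long *)arg = buf[sizeof buf - 1];
    return NULL;
}

/* Report the stack size a default-initialized attribute object carries,
   or 0 if it cannot be determined. */
static size_t default_stack_size(void)
{
    pthread_attr_t attr;
    size_t size = 0;

    if (pthread_attr_init(&attr) != 0)
        return 0;
    if (pthread_attr_getstacksize(&attr, &size) != 0)
        size = 0;
    pthread_attr_destroy(&attr);
    return size;
}

/* Create a thread with an explicit stack size, run fn(arg) on it and wait
   for it to finish. Returns 0 on success or an error number. */
static int run_with_stack(void *(*fn)(void *), void *arg, size_t stack_size)
{
    pthread_attr_t attr;
    pthread_t tid;
    int rc;

    rc = pthread_attr_init(&attr);
    if (rc != 0)
        return rc;
    rc = pthread_attr_setstacksize(&attr, stack_size);
    if (rc == 0)
        rc = pthread_create(&tid, &attr, fn, arg);
    pthread_attr_destroy(&attr);
    if (rc != 0)
        return rc;
    return pthread_join(tid, NULL);
}
```

Calling run_with_stack(worker, &out, 1024 * 1024) from main gives the worker
the 1M stack mentioned in the reply; 8M, as used above, works the same way.
The value returned by default_stack_size() depends on the libc and its
configuration.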