Message-ID: <20121118210353.GA4844@brightrain.aerifal.cx>
Date: Sun, 18 Nov 2012 16:03:53 -0500
From: Rich Felker <dalias@...ifal.cx>
To: musl@...ts.openwall.com
Subject: debloating data, bss

Hi all,

I've been doing a bit of checking for unneeded bloat, and here's what
I've found in data & bss:

   1045       0     288    1333     535 mntent.o (ex lib/libc.a)

This is simply a line buffer. It seems to me we could instead use
getline or fgetln, but for the latter some consideration is needed to
determine whether its semantics suffice.

    665       0      32     697     2b9 mq_notify.o (ex lib/libc.a)

This is purely gcc being stupid and putting static const char[32] into
bss rather than text (wasting 32 bytes of writable memory to save 32
bytes on disk). Since the buffer is junk, we could just use "char[32]"
(uninitialized); that would shrink the code but would result in
warnings. Or, we could use a pointer to any static object or code of
at least 32 bytes in size.

     86       0     544     630     276 gethostbyaddr.o (ex lib/libc.a)
     78       0     544     622     26e gethostbyname2.o (ex lib/libc.a)

Some ugly gigantic buffers for results. Since getaddrinfo requires
dynamic allocation anyway, it would be reasonable to dynamically
allocate these too; it would not be introducing a failure case that
did not already exist.

      6       0     512     518     206 res_state.o (ex lib/libc.a)

This is pure junk; it's just there to satisfy broken programs that try
to peek/poke at the resolver state. I wonder if we could make it
smaller without breaking anything.

    908     160      12    1080     438 random.o (ex lib/libc.a)

Unfortunately I think random really does have that much state...

     43       0     128     171      ab sigisemptyset.o (ex lib/libc.a)

This is another case of gcc stupidly putting uninitialized static
const in bss instead of text. I have a better workaround though; in
any case this code needs to be fixed because it's comparing the whole
1024-bit bit-array even though we treat all but the first 64/128 bits
as padding now.

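As a rough illustration of the sigisemptyset point (not the actual
musl code or the workaround mentioned above), the check can be written
as a loop over only the significant words, which needs no static
all-zero comparison buffer at all. The names demo_sigset,
demo_sigisemptyset, SET_BITS and SIG_BITS are hypothetical, and the
choice of 128 significant bits is an assumption for the sketch:

```c
#include <stddef.h>

/* Hypothetical stand-in for sigset_t: 1024 bits of storage, of which
 * only the first SIG_BITS are meaningful; the rest is padding. */
#define SET_BITS 1024
#define SIG_BITS 128 /* assumption: 64 or 128 depending on the arch */

typedef struct {
    unsigned long bits[SET_BITS / (8 * sizeof(unsigned long))];
} demo_sigset;

/* Returns 1 if no bit in the significant prefix is set, else 0.
 * Padding words are never read, and no zero-filled static object
 * is needed for comparison. */
static int demo_sigisemptyset(const demo_sigset *set)
{
    size_t n = SIG_BITS / (8 * sizeof(unsigned long));
    for (size_t i = 0; i < n; i++)
        if (set->bits[i])
            return 0;
    return 1;
}
```

With this shape, a set whose only nonzero bits lie in the padding
still counts as empty, and the object file carries no bss for it.
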
    209       0    8192    8401    20d1 pthread_key_create.o (ex lib/libc.a)

I'm considering replacing pthread_key_create with a new implementation
that makes a fake DSO with TLS instead of having the pthread
thread-specific data be part of the main thread block.

Aside from this, the main issue that's making libc.so's dirty-page
cost so high is that the bss isn't sorted; most of bss is unused in
most programs, but because the commonly-used stuff isn't grouped
together, several pages end up dirty. With this in mind, I'm
considering one of the following 3 approaches to get the commonly-used
data all together in one page:

1. Explicitly initialize everything that's always-used, so it ends up
   in .data rather than .bss, and thus on the first page.

2. Reorder object files in the linking so that the bloated junk is all
   at the end.

3. Find a way to get the linker to sort it for us, possibly with
   alignment and alignment-based sorting.

With the above changes, I think we should be able to cut 2-3 pages of
commit charge off of libc.so and drop the minimum dirty pages for
dynamic linking from 20k (5 pages) to 12k (3 pages, only one of which
is in libc.so; the others are the main app's data and stack).

Major work on debloating will probably not begin until after the next
release.

Rich