Message-ID: <20131221234041.GA13204@brightrain.aerifal.cx>
Date: Sat, 21 Dec 2013 18:40:41 -0500
From: Rich Felker <dalias@...ifal.cx>
To: musl@...ts.openwall.com
Subject: Removing sbrk and brk

Based on a discussion on IRC, we're thinking about removing support
for the legacy sbrk and brk functions. These functions are
fundamentally broken and unfixable:

For an application to use them correctly, it must depend on the malloc
subsystem never being used, but this is impossible to guarantee since
malloc may be used internally by libc functions without documenting
this to the application.

The interference is two-way: unexpected malloc would interfere with an
application's management of the heap via sbrk, and unexpected sbrk
interferes with malloc.

Other implementations of malloc (e.g. in glibc) handle this
"gracefully", just splitting the heap and leaving an unfreeable block
where the application made a mess by calling sbrk. musl's malloc does
not even check for this (it would be a mess working it in with musl's
page-at-a-time brk adjustment logic since the application's sbrk
adjustments might not be page-aligned) and therefore horribly crashes
if the application has used sbrk/brk itself.

As far as I can tell, the only remotely legitimate use for sbrk/brk is
for applications to provide their own malloc implementation using it.
This (redefining malloc) is not supported by musl (per ISO C and
POSIX, it results in undefined behavior) so there's really no
legitimate way musl-linked programs can be using sbrk/brk.

What we have encountered is certain programs and libraries (most
notably Boehm GC, but also programs that try to redefine malloc by
default, such as some versions of bash) causing horrible memory
corruption and runtime crashes that are hard to track down, due to
their use of sbrk.

Based on today's discussion, I think the cleanest solution is just to
eliminate sbrk/brk.
This could be done in one of several ways:

- making them always-fail
- making the headers break use of them
- completely removing the symbols

The latter options are in some ways preferable, since the failure
would be caught at build-time and the program could be patched (or
different configure options passed) to fix the incorrect sbrk usage.
Unfortunately, this might break otherwise-correct programs that just
use sbrk(0) as a stupid way to "measure heap usage" or similar. I'm
not sure if that's an acceptable cost.

Another option would be providing dummy sbrk of some sort:

- sbrk(positive) == malloc(positive), sbrk(negative) == fail. This of
  course leads to memory leaks, but any usage of sbrk is
  potentially-leaky anyway since you can't always undo it safely.

- First call to sbrk or brk creates a large PROT_NONE mapping to
  provide a fake heap, and subsequent calls adjust an emulated brk
  pointer in this region and use mprotect to 'allocate'/'free'.

- Something else?

Finally, another alternative might be leaving sbrk/brk alone and
modifying malloc not to use the brk at all. This has been proposed
several times (well, supporting non-brk allocation has been proposed
anyway) to avoid spurious malloc failures when the brk cannot be
extended, and if we support that we might as well just drop brk
support in malloc (otherwise there's code with duplicate functionality
and thus more bloat). So this might actually be the best long-term
option. Switching malloc from using brk to PROT_NONE/mprotect (see the
above idea for brk emulation) would also make the malloc
implementation more portable to systems with no concept of brk.

However this option would definitely be a post-1.0 development
direction, and not something we could do right away (of course I'd
probably hold off until after 1.0 for any of these changes since
they're fairly invasive, except possibly the idea of making sbrk
always-fail).

Rich
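
For illustration only, the PROT_NONE/mprotect emulation idea mentioned
above might look roughly like the sketch below. This is not code from
musl; the names fake_sbrk, fake_base, fake_brk and the 64 MiB
FAKE_HEAP_SIZE are all hypothetical, and the sketch ignores thread
safety and brk() itself.

```c
#define _DEFAULT_SOURCE
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical size of the reserved fake heap (an assumption). */
#define FAKE_HEAP_SIZE ((size_t)64 << 20)

static char *fake_base;  /* start of the PROT_NONE reservation */
static size_t fake_brk;  /* emulated break, as an offset from fake_base */

/* Round n up to a multiple of pg, where pg is a power of two. */
static size_t page_up(size_t n, size_t pg)
{
    return (n + pg - 1) & ~(pg - 1);
}

/* Hypothetical emulated sbrk: returns the previous break on success,
 * or (void *)-1 with errno set on failure. Not thread-safe. */
void *fake_sbrk(intptr_t inc)
{
    size_t pg = (size_t)sysconf(_SC_PAGESIZE);

    /* First call: reserve address space without making it accessible. */
    if (!fake_base) {
        void *p = mmap(NULL, FAKE_HEAP_SIZE, PROT_NONE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (p == MAP_FAILED) {
            errno = ENOMEM;
            return (void *)-1;
        }
        fake_base = p;
    }

    size_t old = fake_brk, new_brk;
    if (inc < 0) {
        size_t dec = 0 - (size_t)inc;  /* magnitude of the decrement */
        if (dec > old) {
            errno = ENOMEM;
            return (void *)-1;
        }
        new_brk = old - dec;
    } else {
        if ((size_t)inc > FAKE_HEAP_SIZE - old) {
            errno = ENOMEM;
            return (void *)-1;
        }
        new_brk = old + (size_t)inc;
    }

    /* Only whole pages can change protection, so compare page tops. */
    size_t old_top = page_up(old, pg), new_top = page_up(new_brk, pg);
    if (new_top > old_top) {
        /* 'Allocate': make the newly covered pages usable. */
        if (mprotect(fake_base + old_top, new_top - old_top,
                     PROT_READ | PROT_WRITE))
            return (void *)-1;
    } else if (new_top < old_top) {
        /* 'Free': revoke access. This alone does not discard the
         * page contents, unlike shrinking the real brk. */
        if (mprotect(fake_base + new_top, old_top - new_top, PROT_NONE))
            return (void *)-1;
    }

    fake_brk = new_brk;
    return fake_base + old;
}
```

With this sketch, fake_sbrk(0) reports the emulated break without
changing anything, and the region never overlaps whatever malloc is
doing, which is the point of the emulation. A real implementation
would also have to decide how large a reservation is acceptable.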