Message-ID: <CAGQ9bdzihNcZoWzpZghnhwwkt3YdwLtBZr3WA_UKn1xL_gzJdQ@mail.gmail.com>
Date: Sun, 22 Mar 2015 21:55:26 -0700
From: Konstantin Serebryany <konstantin.s.serebryany@...il.com>
To: Konstantin Serebryany <konstantin.s.serebryany@...il.com>, Rich Felker <dalias@...c.org>, musl@...ts.openwall.com
Subject: Re: buffer overflow in regcomp and a way to find more of those

On Sat, Mar 21, 2015 at 6:28 AM, Szabolcs Nagy <nsz@...t70.net> wrote:
> * Konstantin Serebryany <konstantin.s.serebryany@...il.com> [2015-03-20 23:05:13 -0700]:
>> On Fri, Mar 20, 2015 at 7:20 PM, Rich Felker <dalias@...c.org> wrote:
>> > On Fri, Mar 20, 2015 at 07:14:33PM -0700, Konstantin Serebryany wrote:
>> >> If you build the source with "-fsanitize=leak -fsanitize-coverage=4
>> >> -O1" the compiler will not insert any of the asan instrumentation
>> >> and only insert calls to a couple of functions needed for coverage.
>> >> Then, instead of linking with the full asan+coverage run-time, you
>> >> will need a very simple re-implementation of coverage-only runtime.
>> >
>> > Could the existing runtime be used, just stripped down?
>>
>> Yes, but for the basic functionality needed by the fuzzer it's simpler
>> to write it from scratch, see below:
>>
>> ========================================================
>> svn co http://llvm.org/svn/llvm-project/llvm/trunk/lib/Fuzzer
>> cat <<EOF >cov-minimal-rt.c
>> static long counter;
>> void __sanitizer_cov_with_check(int *guard) {
>>   if (*guard == 0) {
>>     counter++;
>>     *guard=1;
>>   }
>> }
>> long __sanitizer_get_total_unique_coverage() { return counter; }
>> void __sanitizer_cov_module_init() {}
>> void __sanitizer_reset_coverage(){}
>> void __sanitizer_get_coverage_guards(){}
>> void __sanitizer_get_number_of_counters(){}
>> void __sanitizer_update_counter_bitset_and_clear_counters(){}
>> void __sanitizer_set_death_callback(){}
>> EOF
>>
>> clang -std=c++11 -c Fuzzer/Fuzzer*.cpp -I Fuzzer
>> clang -std=c++11 -fsanitize=leak -fsanitize-coverage=3 -mllvm
>> -sanitizer-coverage-block-threshold=0 Fuzzer/test/SimpleTest.cpp -c
>> clang -c cov-minimal-rt.c
>> clang++ *.o
>> ./a.out
>> ========================================================
>
> with this i could run the fuzzer against libc.a
>
> it's a bit more work to link to libc.a than adding
> a -L so i attached the scripts i used (and an example)
> so others can reproduce it
>
> c++ headers cannot be used in the test (that would
> require cleaning up the libstdc++ header mess)
>
> but i think there is no reason to use c++ for these
> libc api tests anyway

Sure.
>
> you may need to adjust the directories the scripts use
>
> (the linking may need to change when compiler-rt is
> used instead of libgcc)
>
> usage:
>
> cd workdir
> ./buildfuzz.sh
> ./buildmusl.sh
> ./fuzzcompile.sh reg.c
> ./fuzzlink.sh reg.o
> ./a.out
>
> of course to make it useful the malloc magic is needed for
> more likely crashes
>
>> The recently added afl-style counters
>> (https://code.google.com/p/address-sanitizer/wiki/AsanCoverage#Coverage_counters)
>> are a bit more involved, but the basic bool-per-edge is quite enough
>> in most cases.
>>
>
> ok
>
>> The fuzzer itself is written in C++ and uses STL (probably, not the
>> best idea, but it makes the experiments simpler).
>> Can't tell if it will be a problem with musl, but after all the fuzzer
>> itself is also trivial (as well as the entire concept)
>>
>
> c++ happens to work because musl is (almost) abi compatible with
> glibc on x86 so we can just link to the glibc linked libstdc++
>
> (this can eg fail when the c++ thread local storage destructor
> abi is used, that is not implemented in musl yet)
>
> so yes c++ makes things more painful: you need to recompile the
> entire toolchain to make it work reliably (and then both gcc
> and clang have broken assumptions about the libc so you have to
> patch them) which is too much work for running tests
>
>> > Well static linking with musl does not impose any constraint on
>> > redefining functions, so you could easily use a debugging malloc that
>> > lines up each allocation to end on a page boundary with a guard page
>> > after it.
>>
>> Yea... This will slow down fuzzing and guard pages only protect you
>> from overflow in one direction (either left or right, but not both).
>> But this is better than nothing.
>>
>
> you can run the tests twice (for left and right) :)
>
>> > This would of course be slow and use lots of memory but
>> > would catch all heap overflows. And -fstack-protector-all would catch
>> > most stack-based overflows.
>>
>> Only stack-overflow-write by a small amount, but yes, better than nothing.
>>
>> BTW, writing a minimalistic asan run-time as part of musl should be a
>> matter of a couple of hours.
>> Probably much faster than making the current monster work with static linking.
>> I'd be happy to help with such.
>>
>
> how would this look?
>
> compile the tests and libc with asan, but instead of linking the
> asan runtime from clang use a musl specific one?

Yes

>
> i assume for that we still need to change the libc startup code, malloc
> functions and maybe some things around thread stacks

Try to compile a simple file with asan:

int main(int argc, char **argv) {
  int a[10];
  a[argc * 10] = 0;
  return 0;
}

% clang -fsanitize=address a.c -c
% nm a.o | grep U
                 U __asan_init_v5
                 U __asan_option_detect_stack_use_after_return
                 U __asan_report_store4
                 U __asan_stack_malloc_1

__asan_report_store4 should print an error message saying that "bad
write of 4 bytes" happened in <current stack trace> on address <param>.
Also provide the other __asan_report_{store,load}{1,2,4,8,16} functions.

__asan_init_v5 will be called by the module initializer.
When called for the first time, it should mmap the shadow memory.
https://code.google.com/p/address-sanitizer/wiki/AddressSanitizerAlgorithm

__asan_option_detect_stack_use_after_return is a global, define it to 0.
__asan_stack_malloc_1 -- just make it an empty function.

Now, you can build code with asan and detect stack buffer overflows.
(The reports won't be very detailed, but they will be correct).

If you add poisoned redzones to malloc -- you get heap buffer overflows.
If you delay the reuse of free-d memory -- you get use-after-free.
If you then implement __asan_register_globals (it is called on module
initialization and poisons redzones for globals) you get global buffer
overflows.

The current asan run-time is large and hairy because it attempts to be
thread-friendly, intercepts lots of libc, and provides very detailed
error messages.
W/o all that, the run-time will easily fit in < 100 LOC, which can be
a part of a libc implementation.

hth,

--kcc
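
The shadow-memory bookkeeping behind the __asan_report_* hooks and the
"poisoned redzones" mentioned above can be sketched roughly as follows.
This is only an illustration under stated assumptions: it uses the
8-bytes-per-shadow-byte encoding described on the AddressSanitizerAlgorithm
page, but shadows a small static arena addressed by offset instead of
mmap-ing the real shadow region, and every name in it (arena, shadow_for,
poison_region, unpoison_region, access_is_bad, checked_store4,
bad_accesses) is hypothetical and not part of the real asan interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ARENA_SIZE   256  /* toy "application memory", a multiple of 8 */
#define SHADOW_SCALE 3    /* one shadow byte describes 8 application bytes */

static unsigned char arena[ARENA_SIZE];
static signed char shadow[ARENA_SIZE >> SHADOW_SCALE];
static int bad_accesses;  /* stand-in for printing an error report */

/* Shadow byte for an arena offset (real asan uses (addr >> 3) + offset). */
static signed char *shadow_for(size_t off) {
    return &shadow[off >> SHADOW_SCALE];
}

/* Mark [off, off+len) unaddressable, e.g. a redzone. */
static void poison_region(size_t off, size_t len) {
    assert((off & 7) == 0 && (len & 7) == 0);
    memset(shadow_for(off), -1, len >> SHADOW_SCALE);
}

/* Mark [off, off+len) addressable. A trailing partial 8-byte granule
 * stores the count of its valid leading bytes (1..7). */
static void unpoison_region(size_t off, size_t len) {
    assert((off & 7) == 0);
    memset(shadow_for(off), 0, len >> SHADOW_SCALE);
    if (len & 7)
        *shadow_for(off + len) = (signed char)(len & 7);
}

/* Returns 1 if an aligned access of `size` bytes (1, 2, 4 or 8) at `off`
 * touches poisoned memory, 0 otherwise. */
static int access_is_bad(size_t off, size_t size) {
    signed char s = *shadow_for(off);
    if (s == 0) return 0;   /* whole granule addressable */
    if (s < 0) return 1;    /* whole granule poisoned */
    return (int)((off & 7) + size - 1) >= s;  /* partially addressable */
}

/* Roughly what instrumented code does before a 4-byte store; a real
 * runtime would call __asan_report_store4 instead of counting. */
static void checked_store4(size_t off, int32_t value) {
    if (access_is_bad(off, 4)) {
        bad_accesses++;
        return;
    }
    memcpy(&arena[off], &value, sizeof value);
}
```

With the whole arena poisoned and only a 40-byte "int a[10]" unpoisoned at
offset 16, a store to a[9] passes the check while a store to a[10] is
counted as a bad access, which mirrors the a[argc * 10] example above.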