|
Message-ID: <20150321013016.GS23507@brightrain.aerifal.cx> Date: Fri, 20 Mar 2015 21:30:16 -0400 From: Rich Felker <dalias@...c.org> To: Konstantin Serebryany <konstantin.s.serebryany@...il.com>, musl@...ts.openwall.com Subject: Re: buffer overflow in regcomp and a way to find more of those On Sat, Mar 21, 2015 at 02:23:41AM +0100, Szabolcs Nagy wrote: > * Konstantin Serebryany <konstantin.s.serebryany@...il.com> [2015-03-20 18:10:18 -0700]: > > After your fix the fuzzer did not find anything else so far, but it > > suffers from slow performance on some cases. > > Not sure if this qualifies for a bug, but the following example takes > > ~2 seconds to run (runs instantly with glibc): > > i think the problem is stacked repetitions > tre doesnt handle them in a sane way > and uses huge amount of ram Sadly there doesn't seem to be any sane way to handle them... > for * it would be easy to solve, but > the general case is theoretically impossible to > solve: x{255}{255} will be a 255*255 state machine > > this is the only part in the musl regex > engine that's allowed to have super linear > space/time complexity > > (you might want to add some logic to avoid > such stacked repetitions to speed up the search) > > (btw the standard does not allow these, but if > the pattern is parenthesized around every repetition > then that's ok: (x*)* is a valid pattern, x** is not, > so there is not much point rejecting these patterns > the problem does not go away since grouping is allowed) > > > int main() { > > regex_t preg; > > const char *s = ".****\\Z$<\\0)_"; Isn't the \0 an invalid backreference? Could it be getting processed in a way that's causing the slowdown, but simply rejected by glibc? Rich
Powered by blists - more mailing lists
Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.