Message-ID: <20130115134244.GW20323@brightrain.aerifal.cx>
Date: Tue, 15 Jan 2013 08:42:44 -0500
From: Rich Felker <dalias@...ifal.cx>
To: musl@...ts.openwall.com
Subject: Re: REG_STARTEND (regex)

On Tue, Jan 15, 2013 at 11:34:59AM +0100, Daniel Cegiełka wrote:
> Hi,
> Is there a chance that musl will support REG_STARTEND? It is used
> quite often in *BSD.
>
> http://www.sourceware.org/ml/libc-alpha/2004-03/msg00038.html

Probably not, at least not in the immediate future. The original TRE
code actually worked with strings as a base+length rather than
null-terminated internally, which meant a lot of things were a lot
more expensive than they should be; if I remember correctly, even
searches for text guaranteed to be found near the beginning of the
string required strlen for the whole string, i.e. the whole operation
was needlessly O(n). In one of the cleanup rounds, I changed it to use
null termination, which simplified a lot of the tests; many checks
collapsed away since \0 was automatically not in the set being checked
against and thus no second check was required.

If/when we overhaul regex again, I'll certainly consider this request
and see if the design can be made such that it's not expensive. But I
don't see any easy way to do it right now short of making a temp copy
of the string. That _would_ be possible; \0 could be replaced with
\xff, and \xff replaced with \xfe, and special logic added to allow
\xff (which is otherwise an invalid byte and never matchable) while
still rejecting \xfe and other invalid bytes. This would require no
changes to the internals, but it would have the property of requiring
an O(n) malloc/memcpy, which is certainly not very appealing.

Rich
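
For illustration, the temp-copy approach mentioned above could be sketched
in user code roughly as follows. This is only a sketch under stated
assumptions, not musl code: regexec_range is a hypothetical helper name, it
takes the range as explicit start/end arguments instead of reading it from
pmatch[0] as REG_STARTEND does, and it does not perform the \0 -> \xff
remapping, so an embedded \0 in the range simply ends the searchable text.
It also assumes the regex was not compiled with REG_NOSUB (or that nmatch
is 0), since pmatch is adjusted after a successful match.

```c
#include <regex.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of musl or POSIX): match re against the
 * bytes string[start, end) by running regexec on a NUL-terminated copy.
 * Costs an O(n) malloc/memcpy per call. */
int regexec_range(const regex_t *re, const char *string,
                  size_t start, size_t end,
                  size_t nmatch, regmatch_t pmatch[], int eflags)
{
    if (end < start)
        return REG_NOMATCH;          /* invalid range: nothing to match */

    size_t len = end - start;
    char *tmp = malloc(len + 1);
    if (!tmp)
        return REG_ESPACE;

    memcpy(tmp, string + start, len);
    tmp[len] = '\0';                 /* terminate the copied range */

    int ret = regexec(re, tmp, nmatch, pmatch, eflags);
    free(tmp);

    if (ret == 0) {
        /* Translate offsets in the copy back to offsets in string. */
        for (size_t i = 0; i < nmatch; i++) {
            if (pmatch[i].rm_so != -1) {
                pmatch[i].rm_so += (regoff_t)start;
                pmatch[i].rm_eo += (regoff_t)start;
            }
        }
    }
    return ret;
}
```

As with REG_STARTEND on the BSDs, a caller that does not want ^ to match at
the start of the range would presumably pass REG_NOTBOL in eflags itself;
the sketch passes eflags through unchanged.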