|
Message-Id: <20130220221827.6ee8a6ff.idunham@lavabit.com> Date: Wed, 20 Feb 2013 22:18:27 -0800 From: Isaac Dunham <idunham@...abit.com> To: musl@...ts.openwall.com Subject: Re: strcasestr.c On Wed, 20 Feb 2013 20:03:28 -0500 Rich Felker <dalias@...ifal.cx> wrote: > > Yes, it seems to have been an early mistake I made getting wget to > work. Unfortunately, I think it would be bad policy to remove it now > since that would break existing dynamic binaries using it, but one > could make an argument that breaking them is "right" since they were > already broken (not behaving as intended)... > > > >Since strcasestr is nonstandard and not clearly specified, > > > > it's so non-standard that even nobody uses it. > > i looked up the usage of the function in codesearch.debian.net, and > > the only *user* (from all ~20K debian packages) of the function is > > gnu wget. > > Are you sure this search was correct? IIRC there were more... A quick check here indicates that busybox, mutt, git, midnight commander, sylpheed, foomatic-rip, elinks, and a couple libraries use it. Busycox uses it in grep and for checking passwords (see libbb/obscure.c). > I think leaving it as-is is the worst case. It's impossible to detect > without runtime checks that it's incorrect, so it might encourage > configure scripts to add runtime checks for broken strcasestr that > break cross compiling. Or it might lead to programs just assuming it's > correct, then breaking. > > > in any case it doesnt make sense to put much work and especially > > much code into it. > > if it's gonna be implemented "correctly" at all, it should be as > > slim as possible, in the order of 3-5 LOC. > > As much as I appreciate Todd's interest in contributing a 2way-based > version, I tend to agree. Not only would adopting the 2way code be a > fairly large code addition for a never-used feature, but it would also > bind us to the choice to do ASCII-only case mapping or drop > performance drastically in the future if we want to change that > decision. > > My leaning right now would be to write the naive strstr loop using > strcasecmp instead of strcmp (or an inline loop) for the inner loop. > This will cause strcasestr to have the exact same case-folding > semantics as strcasecmp, whatever those are in the future. This is > best for consistency. Unfortunately, it's very bad from a performance > standpoint, but I don't know of any code using this function for > high-performance use. Busybox implements their own version using this approach. > The other somewhat reasonable option would be removing the function, > which would expose breakage in programs that were already using the > broken version in musl. I'm mildly against this, but I'd be interested > in hearing arguments either way. Were the claimed frequency correct, I would want it gone. As it stands, I think that a small but slow version is justifiable. A large one isn't. -- Isaac Dunham <idunham@...abit.com>
Powered by blists - more mailing lists
Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.