Message-ID: <20181108193451.GD5150@brightrain.aerifal.cx>
Date: Thu, 8 Nov 2018 14:34:51 -0500
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: strstr & memcmp stuff

It's come up on libc-alpha that the performance bottleneck for strstr
(in glibc, but the same applies to musl) is likely the forward/reverse
memcmp-like comparison loops:

https://sourceware.org/ml/libc-alpha/2018-11/msg00193.html

The reason the proposed "quicksearch" (a really naive quadratic strstr
with no optimizations except a bad-character shift table) does well on
glibc is that glibc has an extremely optimized strcmp. musl's strcmp is
a trivial C loop, and as a consequence quicksearch performs abysmally
here. (Note that quicksearch itself is not interesting, but getting the
same benefits for a good strstr is.)

Optimizing memcmp has been something we've kinda needed to do for a
long time. What the above suggests to me is that, rather than writing
an optimized standalone memcmp (which has an awful interface, because
it throws away valuable information: the location of the mismatch), we
should write an optimized memcmp core that returns the location of the
first mismatch, and that can be shared between memcmp, strstr, and
memmem.

I'm not sure how much portable optimization can be done (since there's
no reason for the operands to have matching alignment, word-sized
accesses to both might not be possible), but arch-specific optimized
versions are definitely possible. It might be possible/reasonable to do
something like the C memcpy for the case where the operands are
misaligned with respect to each other.

Rich