Message-ID: <233508429.1118514.1495915215350@mail.yahoo.com>
Date: Sat, 27 May 2017 20:00:15 +0000 (UTC)
From: Brad Conroy <technosaurus@...oo.com>
To: <musl@...ts.openwall.com>
Subject: SSE2 strcasecmp

The recent discussion of tolower performance prompted me to dig out my
SSE2 version of strcasecmp. It's about the same number of instructions
as musl's generic strcasecmp (although slightly larger when compiled,
due to the SIMD instructions).

#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stddef.h>     /* size_t */

/* Relies on GCC/Clang vector extensions (|, &, ^, - on __m128i).
   The unaligned 16-byte loads can read past the terminating NUL. */
int strcasecmp_sse2(const char *s0, const char *s1){
  __m128i *l = (__m128i*)s0, *r = (__m128i*)s1,
    all0 = (__m128i){0},
    all1 = (__m128i){-1,-1},
    allA = _mm_set1_epi8('A'-1),
    allZ = _mm_set1_epi8('Z'+1),
    all32 = _mm_set1_epi8(1<<5),
    lcl, lcr, tmp;
  unsigned m;
  size_t i = 0;
  do{
    lcl = _mm_loadu_si128(l+i);
    lcr = _mm_loadu_si128(r+i);
    /* flag NUL bytes in the left string */
    tmp = _mm_cmpeq_epi8(lcl,all0);
    /* set bit 5 on bytes in 'A'..'Z', i.e. fold them to lower case */
    lcl |= (_mm_cmpgt_epi8(lcl,allA) & _mm_cmplt_epi8(lcl,allZ) & all32);
    lcr |= (_mm_cmpgt_epi8(lcr,allA) & _mm_cmplt_epi8(lcr,allZ) & all32);
    /* flag bytes that still differ after folding */
    tmp |= (_mm_cmpeq_epi8(lcl,lcr) ^ all1);
    ++i;
  }while(!(m=_mm_movemask_epi8(tmp)));
  /* difference at the first flagged byte; all lower bytes are equal,
     so the 64-bit lane subtraction carries no borrow into it */
  return ((union{__m128i v;char c[16];})(lcl-lcr)).c[__builtin_ctz(m)];
}
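For readers who want to check the logic without SSE2 hardware, here is a scalar sketch of what each of the 16 byte lanes in the loop above computes. It is not part of the original post: the names fold_lane and strcasecmp_lane_model are made up for illustration, and the model assumes the usual GCC/Clang behaviour when an out-of-range int is converted to signed char (wrap modulo 256).

```c
#include <stddef.h>

/* Hypothetical helper: fold one byte the way a single SIMD lane does.
   The comparisons are signed, as with _mm_cmpgt_epi8 and _mm_cmplt_epi8,
   so bytes >= 0x80 are never treated as letters. */
static signed char fold_lane(signed char c)
{
    /* 0x00 or 0xFF mask, like the result of the vector comparisons */
    signed char is_upper = (signed char)-(c > 'A' - 1 && c < 'Z' + 1);
    /* OR in bit 5 (value 32) only where the mask is set */
    return (signed char)(c | (is_upper & (1 << 5)));
}

/* Hypothetical scalar model of the vector loop: stop at the first NUL in
   s0 or at the first byte that differs after folding, then return the
   folded difference truncated to 8 bits, as the vector version does. */
static int strcasecmp_lane_model(const char *s0, const char *s1)
{
    size_t i = 0;
    for (;;) {
        signed char l = fold_lane((signed char)s0[i]);
        signed char r = fold_lane((signed char)s1[i]);
        if (s0[i] == '\0' || l != r)
            return (signed char)(l - r);
        ++i;
    }
}
```

Unlike the vector version, this model never reads beyond the first NUL or mismatch, so it can serve as a reference when testing the SSE2 routine on ASCII input.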