Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Tue, 27 Feb 2024 09:49:27 -0500
From: Rich Felker <dalias@...c.org>
To: James Tirta Halim <tirtajames45@...il.com>
Cc: musl@...ts.openwall.com
Subject: Re: [PATCH] add memcmpeq: memcmp that returns length of first
 mismatch

On Tue, Feb 27, 2024 at 09:07:56PM +0700, James Tirta Halim wrote:
> (unaligned access must be supported for mempcmpeq if using word access)
> 
> mempcmpeq returns the length of the first mismatched byte. If S1 and S2
> are equal, n is returned.
> 
> The function is meant to be used internally in strstr and memmem:
> https://inbox.vuxu.org/musl/20181108193451.GD5150@brightrain.aerifal.cx/
> 
> glibc bench-memcmpeq timings (Core i3-1115G4):
> mempcmpeq memcmp_musl memcmp __memcmpeq_evex __memcmpeq_avx2 __memcmpeq_sse2
> Average:
> 62.3978 269.813 16.0426 14.9072 14.1125 20.6527
> Total:
> 30637.3 132478 7876.92 7319.43 6929.23 10140.5
> 
> Passes glibc test-memcmpeq.
> 
> Return value in testing and benchmarking is changed to suit the
> behavior of memcmpeq.

I don't understand this proposal. The only "memcmpeq" I can find
anywhere else is __memcmpeq in glibc, whose contract is very
different: it's a relaxed version of memcmp that's only usable for
testing equality and that does not give an order relationship.

As noted in the above-linked message, a portable C version of this
function is not going to help strstr or memmem because there's no
reason to expect the needle to be aligned modulo any wordsize in the
haystack. For it to help, you would need a version that's capable of
doing misaligned accesses, contingent on arch support for that.
Without that, you're just adding the cost of function call overhead
and making it slower.

> ---
>  src/string/mempcmpeq.c | 19 +++++++++++++++++++
>  1 file changed, 19 insertions(+)
>  create mode 100644 src/string/mempcmpeq.c
> 
> diff --git a/src/string/mempcmpeq.c b/src/string/mempcmpeq.c
> new file mode 100644
> index 00000000..bc17bc58
> --- /dev/null
> +++ b/src/string/mempcmpeq.c
> @@ -0,0 +1,19 @@
> +#include <stddef.h>
> +
> +size_t
> +mempcmpeq(const void *s1,
> +          const void *s2,
> +          size_t n)
> +{
> +	const size_t length = n;
> +#ifdef __GNUC__
> +	typedef size_t __attribute__((__may_alias__)) word;
> +	const unsigned char *p1 = (const unsigned char *)s1;
> +	const unsigned char *p2 = (const unsigned char *)s2;
> +	for (; n >= sizeof(word) && *(word *)p1 == *(word *)p2; p1+=sizeof(word), p2+=sizeof(word), n-=sizeof(word));
> +#endif
> +	for (; n; --n)
> +		if (*p1++ != *p2++)
> +			return (size_t)((p1 - 1 - (const unsigned char *)s1));
> +	return length;
> +}
> -- 
> 2.44.0

This code does not do what you claim it does and does not seem to
match your test results. As written, it should return immediately,
because the entire body except

> +	const size_t length = n;
> +	return length;

is dead code.

Rich

Powered by blists - more mailing lists

Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.