Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <Zd65MSY_v8bibtp7@voyager>
Date: Wed, 28 Feb 2024 05:40:17 +0100
From: Markus Wichmann <nullplan@....net>
To: musl@...ts.openwall.com
Cc: James Tirta Halim <tirtajames45@...il.com>
Subject: Re: [PATCH] add memcmpeq: memcmp that returns length of first
 mismatch

Am Tue, Feb 27, 2024 at 09:51:24AM -0500 schrieb Rich Felker:
> On Tue, Feb 27, 2024 at 09:49:27AM -0500, Rich Felker wrote:
> > On Tue, Feb 27, 2024 at 09:07:56PM +0700, James Tirta Halim wrote:
> > > ---
> > >  src/string/mempcmpeq.c | 19 +++++++++++++++++++
> > >  1 file changed, 19 insertions(+)
> > >  create mode 100644 src/string/mempcmpeq.c
> > >
> > > diff --git a/src/string/mempcmpeq.c b/src/string/mempcmpeq.c
> > > new file mode 100644
> > > index 00000000..bc17bc58
> > > --- /dev/null
> > > +++ b/src/string/mempcmpeq.c
> > > @@ -0,0 +1,19 @@
> > > +#include <stddef.h>
> > > +
> > > +size_t
> > > +mempcmpeq(const void *s1,
> > > +          const void *s2,
> > > +          size_t n)
> > > +{
> > > +	const size_t length = n;
> > > +#ifdef __GNUC__
> > > +	typedef size_t __attribute__((__may_alias__)) word;
> > > +	const unsigned char *p1 = (const unsigned char *)s1;
> > > +	const unsigned char *p2 = (const unsigned char *)s2;
> > > +	for (; n >= sizeof(word) && *(word *)p1 == *(word *)p2; p1+=sizeof(word), p2+=sizeof(word), n-=sizeof(word));
> > > +#endif
> > > +	for (; n; --n)
> > > +		if (*p1++ != *p2++)
> > > +			return (size_t)((p1 - 1 - (const unsigned char *)s1));
> > > +	return length;
> > > +}
> > > --
> > > 2.44.0
> >
> > This code does not do what you claim it does and does not seem to
> > match your test results. As written, it should return immediately,
> > because the entire body except
> >
> > > +	const size_t length = n;
> > > +	return length;
> >
> > is dead code.
>
> Sorry, this part is wrong -- I missed the return above, so it appears
> it actually does behave as described. Dunno how I overlooked that.
> EMORECOFFEE.
>
> Rich

And I puzzled over it a solid five minutes wondering how you got to your
conclusion. But there are a few more issues here:
1. The declarations of p1 and p2 need to be moved outside the #ifdef,
since they are used outside the #ifdef.
2. The explicit conversions in the initializations are unnecessary and
should be elided.
3. If this function is to be used by strstr, it needs to get a double
underscore name, since strstr is an ISO-C function and must only call
ISO-C functions or double underscore functions (wonky counterexamples
like system() notwithstanding. And even that one doesn't call extension
functions). If the function should also be an externally accessible
extension, it can get a weak alias.
4. All the other musl C code avoids misaligned word access. I don't know
which architecture/ABI doesn't allow it, but it is nevertheless the
case. And I don't know if this function is still worth it if it has to
take all possible misalignments into account.

Ciao,
Markus

Powered by blists - more mailing lists

Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.