Message-ID: <ZLRdasdXiHLTxKx7@fuz.su>
Date: Sun, 16 Jul 2023 23:13:14 +0200
From: Robert Clausecker <fuz@....su>
To: musl@...ts.openwall.com
Subject: Re: strcmp() guarantees and assumptions

Hi Markus,

On Sun, Jul 16, 2023 at 09:33:16PM +0200, Markus Wichmann wrote:
> On Sun, Jul 16, 2023 at 07:59:57PM +0200, Robert Clausecker wrote:
> > That's good to hear.  Any idea on the “what do existing libc
> > implementations permit” bit?
> >
> 
> So I quickly checked musl, dietlibc, bionic, and glibc, and
> unsurprisingly, all of the implementations I looked at allow the strings
> to be unterminated if they mismatch before access becomes restricted.
> This is, of course, an implementation detail that applications must not
> rely on, but it nevertheless is the case.
> 
> The problem in your implementation is that the calls to strlen() will
> iterate over both input strings to the end, causing basically a cache
> flush for large inputs, only to then iterate over both inputs a second
> time. Iterating only once is a major benefit, since it avoids half of
> the cache misses.

Of course.  This was merely a simple example to demonstrate the general
point; I do not plan to implement it that way.
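
For readers without the earlier message at hand, here is a rough C
sketch of the contrast being discussed.  It is not the code from the
earlier mail; both function names are made up for illustration.  The
first variant calls strlen() on both inputs and then compares, so it
walks the data twice and requires both strings to be terminated.  The
second stops at the first mismatch or terminator.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical two-pass variant: strlen() walks both strings to their
 * terminators, then memcmp() walks the common prefix a second time.
 * Both inputs must be NUL-terminated, even if they differ early. */
static int strcmp_two_pass(const char *a, const char *b)
{
    size_t la = strlen(a);
    size_t lb = strlen(b);
    /* Include the terminator of the shorter string in the comparison. */
    size_t n = (la < lb ? la : lb) + 1;

    return memcmp(a, b, n);
}

/* Hypothetical single-pass variant: each byte is read at most once and
 * nothing beyond the first mismatch or terminator is accessed. */
static int strcmp_one_pass(const char *a, const char *b)
{
    const unsigned char *p = (const unsigned char *)a;
    const unsigned char *q = (const unsigned char *)b;

    while (*p != 0 && *p == *q) {
        p++;
        q++;
    }
    return (*p > *q) - (*p < *q);
}
```

Only the sign of the result is meaningful in either variant.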

> Also, glibc already has an SSE strcmp implementation you may want to
> look at.

I'm not going to look at glibc as it's LGPL licensed.  I am aware of the
Intel implementation, but I don't like that it has to duplicate the code
16 times, once for each possible misalignment pattern.  If we did not
have to ensure that a cache line of data is touched only after we have
confirmed that there is no earlier mismatch, it might be possible to
write simpler code, but I'm currently not entirely sure how.
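
To make the constraint concrete, here is a naive sketch of one way to
stay within it.  This is neither the Intel code nor what I intend to
write: it uses unaligned 16-byte loads and drops to byte-wise steps
whenever a load would cross a page boundary, so a new page is only
touched once all earlier bytes are known to match.  The function
names and the 4096-byte page size are assumptions, __builtin_ctz is
a GCC/Clang builtin, and the code needs x86 with SSE2.  Within a page
it may read up to 15 bytes past the mismatch or terminator, which is
fine in assembly but formally undefined in ISO C, so treat it as a
model of the approach only.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

#define PAGE_SIZE_ASSUMED 4096u  /* assumption: 4 KiB pages */

/* True if a 16-byte load starting at p would cross into the next page. */
static int near_page_end(const void *p)
{
    return ((uintptr_t)p & (PAGE_SIZE_ASSUMED - 1)) > PAGE_SIZE_ASSUMED - 16;
}

static int strcmp_sse2_sketch(const char *a, const char *b)
{
    const unsigned char *p = (const unsigned char *)a;
    const unsigned char *q = (const unsigned char *)b;
    const __m128i zero = _mm_setzero_si128();

    for (;;) {
        if (near_page_end(p) || near_page_end(q)) {
            /* Byte-wise step: never touch the next page before all
             * earlier bytes are known to be equal and non-NUL. */
            if (*p != *q || *p == 0)
                return (*p > *q) - (*p < *q);
            p++;
            q++;
            continue;
        }

        __m128i x = _mm_loadu_si128((const __m128i *)p);
        __m128i y = _mm_loadu_si128((const __m128i *)q);
        unsigned eq  = (unsigned)_mm_movemask_epi8(_mm_cmpeq_epi8(x, y));
        unsigned nul = (unsigned)_mm_movemask_epi8(_mm_cmpeq_epi8(x, zero));
        /* One bit per byte position that is a mismatch or a NUL in a. */
        unsigned stop = (~eq | nul) & 0xffffu;

        if (stop != 0) {
            unsigned i = (unsigned)__builtin_ctz(stop);
            return (p[i] > q[i]) - (p[i] < q[i]);
        }
        p += 16;
        q += 16;
    }
}
```

The price of avoiding per-misalignment code paths here is up to 15
byte-wise iterations near each page end of either input.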

> Ciao,
> Markus

Yours,
Robert Clausecker

-- 
()  ascii ribbon campaign - for an 8-bit clean world 
/\  - against html email  - against proprietary attachments
