Message-ID: <5694F495.3010203@openwall.com>
Date: Tue, 12 Jan 2016 15:41:57 +0300
From: Alexander Cherepanov <ch3root@...nwall.com>
To: musl@...ts.openwall.com
Subject: Re: string word-at-a-time and atomic.h FAQ on twitter

On 2016-01-09 01:59, Rich Felker wrote:
> On Sat, Jan 09, 2016 at 01:39:10AM +0300, Alexander Cherepanov wrote:
>>>>>> this takes care of oob access, but the bytes outside the passed
>>>>>> object might change concurrently i.e. strlen might introduce a
>>>>>> data race: again this is a problem on the abstract c language
>>>>>> level that may be solved e.g. by making all accesses to those
>>>>>> bytes relaxed atomic, but user code is not under libc control.
>>>>>> in practice the code works if HASZERO reads the word once so it
>>>>>> does arithmetics with a consistent value (because the memory
>>>>>> model of the underlying machine does not treat such race
>>>>>> undefined and it does not propagate unspecified value bits nor
>>>>>> has trap representations).
>>>>>
>>>>> Indeed, this seems like less of a practical concern.
>>>>
>>>> HASZERO reads the word twice so this should be a problem for
>>>> unoptimized code on big-endian platforms.
>>>
>>> The number of abstract-machine reads is irrelevant unless we use
>>> volatile here. A good compiler will always reduce it to one read, and
>>> a bad compiler is always free to turn it into multiple reads.
>>
>> Ok, I'll reformulate: is compiling musl on a big-endian platform
>> with optimizations turned off officially supported?
>
> Yes, and I don't see why you expect this case to break due to data
> race issues.

Right, I was too hasty. I checked that outside bytes affect the
computations on inside bytes (on BE platforms), but the effect is not
strong enough to break the code. Sorry for the noise.
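
One way to check this is sketched below. Only the three macros are taken from musl's string sources; be_word, haszero2 and terminator_always_seen are hypothetical helpers written for this illustration. The sketch builds the value a big-endian load would yield for a word holding one inside byte, the terminator, and then outside bytes, and lets the two reads of the word inside HASZERO observe different values of one outside byte. It assumes sizeof(size_t) >= 4.

```c
#include <limits.h>
#include <stddef.h>

/* Definitions as found in musl's src/string/strlen.c. */
#define ONES ((size_t)-1/UCHAR_MAX)
#define HIGHS (ONES * (UCHAR_MAX/2+1))
#define HASZERO(x) ((x)-ONES & ~(x) & HIGHS)

_Static_assert(sizeof(size_t) >= 4, "sketch assumes at least 4-byte words");

/* Hypothetical helper: the value that a big-endian load of
 * sizeof(size_t) bytes starting at b would produce. */
static size_t be_word(const unsigned char *b)
{
	size_t w = 0;
	for (size_t i = 0; i < sizeof w; i++)
		w = w << CHAR_BIT | b[i];
	return w;
}

/* HASZERO with its two reads of the word made explicit,
 * so that a racing writer can make them differ. */
static size_t haszero2(size_t first, size_t second)
{
	return (first - ONES) & ~second & HIGHS;
}

/* Returns 1 if the terminator is detected for every pair of values
 * that the two reads might see in the first outside byte. */
static int terminator_always_seen(void)
{
	unsigned char a[sizeof(size_t)], b[sizeof(size_t)];
	for (size_t i = 0; i < sizeof a; i++)
		a[i] = b[i] = 'x';
	a[1] = b[1] = 0; /* terminator; bytes at index 2 and up are outside */
	for (int u = 0; u <= UCHAR_MAX; u++)
		for (int v = 0; v <= UCHAR_MAX; v++) {
			a[2] = (unsigned char)u; /* value seen by the first read */
			b[2] = (unsigned char)v; /* value seen by the second read */
			if (!haszero2(be_word(a), be_word(b)))
				return 0;
		}
	return 1;
}
```

terminator_always_seen() should return 1: a borrow from the outside byte changes the terminator's byte of the difference from 0xFF to 0xFE, but the high bit stays set either way, so the zero byte is still reported.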

Perhaps a comment is warranted in this and other hairy cases (like qsort)?

As for a list of affected functions, here is a first approximation:

$ grep -rF HASZERO | grep -v '#define'
src/string/stpcpy.c:		for (; !HASZERO(*ws); *wd++ = *ws++);
src/string/memchr.c:		for (w = (const void *)s; n>=SS && !HASZERO(*w^k); w++, n-=SS);
src/string/strlen.c:	for (w = (const void *)s; !HASZERO(*w); w++);
src/string/strchrnul.c:	for (w = (void *)s; !HASZERO(*w) && !HASZERO(*w^k); w++);
src/string/memccpy.c:		for (; n>=sizeof(size_t) && !HASZERO(*ws^k);
src/string/strlcpy.c:			for (; n>=sizeof(size_t) && !HASZERO(*ws);
src/string/stpncpy.c:		for (; n>=sizeof(size_t) && !HASZERO(*ws);

HASZERO is not guarded by a size check in stpcpy, strlen and strchrnul.
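
To illustrate the difference, here is a rough sketch of the two loop shapes. It is not musl's code: sketch_strlen, sketch_memchr and load_word are hypothetical names, and the word is loaded with memcpy here (an assumption made to keep the sketch within the aliasing rules), whereas the lines above dereference a word pointer directly. The unguarded loop reads a whole aligned word even when the terminator sits in its first byte; the guarded one never reads past the n bytes it was given.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SS (sizeof(size_t))
#define ONES ((size_t)-1/UCHAR_MAX)
#define HIGHS (ONES * (UCHAR_MAX/2+1))
#define HASZERO(x) ((x)-ONES & ~(x) & HIGHS)

/* Hypothetical helper: read one word exactly once, via memcpy. */
static size_t load_word(const unsigned char *p)
{
	size_t w;
	memcpy(&w, p, sizeof w);
	return w;
}

/* strlen-style scan: no size check on the word loop, so the last
 * word read may cover bytes after the terminator. */
static size_t sketch_strlen(const char *str)
{
	const unsigned char *s = (const void *)str;
	const unsigned char *a = s;
	/* Byte-wise until s is word-aligned. */
	for (; (uintptr_t)s % SS; s++)
		if (!*s) return s - a;
	/* Word-wise until some byte in the word is zero. */
	for (;;) {
		size_t w = load_word(s);
		if (HASZERO(w)) break;
		s += SS;
	}
	/* Byte-wise to locate the terminator inside that word. */
	for (; *s; s++);
	return s - a;
}

/* memchr-style scan: the word loop is guarded by n >= SS, so no
 * word extends past the n bytes supplied by the caller. */
static const void *sketch_memchr(const void *src, int c, size_t n)
{
	const unsigned char *s = src;
	c = (unsigned char)c;
	for (; (uintptr_t)s % SS && n && *s != c; s++, n--);
	if (n && *s != c) {
		size_t k = ONES * c; /* c replicated into every byte */
		for (; n >= SS; s += SS, n -= SS) {
			size_t w = load_word(s);
			if (HASZERO(w ^ k)) break;
		}
	}
	for (; n && *s != c; s++, n--);
	return n ? s : 0;
}
```

Note that the guard only bounds the read to the caller's n; it does not by itself say anything about other threads writing inside that range.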

-- 
Alexander Cherepanov
