musl - Re: Left-shift of negative number

Message-ID: <55B3D4DA.109@openwall.com>
Date: Sat, 25 Jul 2015 21:26:34 +0300
From: Alexander Cherepanov <ch3root@...nwall.com>
To: musl@...ts.openwall.com
Subject: Re: Left-shift of negative number

On 2015-07-25 06:22, Rich Felker wrote:
> On Fri, Jul 17, 2015 at 05:28:58PM -0400, Rich Felker wrote:
>> On Fri, Jul 17, 2015 at 06:28:00PM +0000, Loïc Runarvot wrote:
>>>
>>> According to the C11 standard, left-shifting a negative integer is
>>> undefined behavior (6.5.7:4).
>>>
>>> This undefined behavior occurs in files src/multibyte/internal.c and
>>> src/multibyte/internal.h. At line 21 in the header
>>> (http://git.musl-libc.org/cgit/musl/tree/src/multibyte/internal.h?id=0f9c2666aca95eb98eb0ef4f4d8d1473c8ce3fa0#n21),
>>> the definition of the macro R allows the left operand of the shift
>>> in ((a == 0x80) ? 0x40-b : -a) << 23 to be negative.
>>>
>>> This actually happens in the source file at line 11
>>> (http://git.musl-libc.org/cgit/musl/tree/src/multibyte/internal.c?id=0f9c2666aca95eb98eb0ef4f4d8d1473c8ce3fa0#n11).
>>> When the macro is expanded as R(0x90, 0xc0), we have a != 0x80,
>>> so it tries to compute (-0x90) << 23, which is undefined
>>> behavior.
>>
>> Thank you. Reporting of such issues is very welcome, as it is the
>> intent in musl to avoid undefined behavior regardless of whether it's
>> believed to cause problems with current compilers. The cleanest
>> solution is probably to use unsigned arithmetic here (e.g. replace -a
>> with 0u-a or -(unsigned)a) but I'd like to look at the code in more
>> detail again and check all of the consequences before committing to a
>> particular approach to fixing it.
>
> This looks like the best approach, and the macro is only used in
> initializers so it was easy to confirm that the object file is not
> changed. I also considered replacing <<23 with *(1<<23), which is a
> standard idiom I'd like to promote for working around the standard's
> failure to define left-shift of negative numbers properly, but
> ensuring that the multiplication doesn't overflow is non-trivial
> without re-examining the logic, so I'd rather just work with unsigned
> arithmetic.
>
> I've gone ahead and made the change as commit
> fe7582f4f92152ab60e9523bf146fe28ceae51f6. If anything looks wrong,
> please let me know. Thanks again for the bug report.
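
For reference, the two approaches weighed in the quoted text can be
sketched as below. The helper names shl23_unsigned and shl23_multiply are
hypothetical and do not appear in musl; the sketch assumes a 32-bit int.
It only illustrates the trade-off: the unsigned form is defined for every
input, while the multiplication form is defined only when the product
fits in the signed type.

```c
#include <assert.h>
#include <stdint.h>

/* Unsigned approach: negate in uint32_t, where wraparound is well
 * defined, then shift. No input can make this undefined. */
static uint32_t shl23_unsigned(int a)
{
    return (0u - (uint32_t)a) << 23;
}

/* Multiplication idiom: defined for negative operands, but only as
 * long as v * 2^23 fits in int32_t. The caller must guarantee that,
 * which is the non-trivial part mentioned in the quoted message. */
static int32_t shl23_multiply(int32_t v)
{
    return v * (1 << 23);
}
```

For the value from the original report, both give the same bit pattern:
shl23_unsigned(0x90) and (uint32_t)shl23_multiply(-0x90) are both
0xB8000000.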

The new definition of R:

#define R(a,b) ((uint32_t)((a==0x80 ? 0x40u-b : 0u-a) << 23))

It implicitly converts a and b to unsigned (and triggers
-Wsign-conversion). Wouldn't it be better to make the conversion
explicit, e.g. by moving the cast to uint32_t inside the conditional
operator? Or, perhaps more intuitively, by moving the negation outside
the conditional operator:

#define R(a,b) (-(uint32_t)(a==0x80 ? b-0x40 : a) << 23)
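
A quick way to check that the two definitions agree is to give them
distinct names and compare them at compile time. The names R_COMMITTED
and R_PROPOSED are made up for this sketch, and only two argument pairs
are checked (the R(0x90, 0xc0) case from the report plus one pair that
takes the a==0x80 branch), so this is a spot check rather than a proof
for every use in internal.c:

```c
#include <assert.h>
#include <stdint.h>

/* Definition from the commit, renamed for comparison. */
#define R_COMMITTED(a,b) ((uint32_t)((a==0x80 ? 0x40u-b : 0u-a) << 23))

/* Suggested alternative: one explicit cast, negation outside the ?: */
#define R_PROPOSED(a,b) (-(uint32_t)(a==0x80 ? b-0x40 : a) << 23)

/* Both are integer constant expressions, so they can be compared
 * at compile time (C11 _Static_assert). */
_Static_assert(R_COMMITTED(0x90, 0xc0) == R_PROPOSED(0x90, 0xc0),
               "a != 0x80 branch differs");
_Static_assert(R_COMMITTED(0x80, 0xc0) == R_PROPOSED(0x80, 0xc0),
               "a == 0x80 branch differs");
```

The agreement follows from unsigned wraparound: -(uint32_t)(b-0x40) and
0x40u-b are the same value modulo 2^32, as are -(uint32_t)a and 0u-a.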

While at it, maybe change -1 to -1u in the definition of C in internal.c
(the signed -1 there triggers -Wsign-compare)?
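
To illustrate the -1u point without copying the real macro, here is a
simplified stand-in. C_SKETCH and the constant 0xC0000000 are
assumptions made for this example, not the actual definition of C. When
one arm of ?: is unsigned, a signed -1 in the other arm is converted to
unsigned anyway, so writing -1u keeps the value and only makes the
conversion visible:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a macro whose ?: mixes -1 with a
 * uint32_t expression. With a plain -1 in the first arm the result
 * would be identical, but the implicit signed-to-unsigned conversion
 * is the kind of thing -Wsign-compare reports. */
#define C_SKETCH(x) ((x) < 2 ? -1u : (UINT32_C(0xC0000000) | (uint32_t)(x)))

/* -1u is the same value that -1 converts to in uint32_t
 * (assuming a 32-bit unsigned int). */
_Static_assert(-1u == (uint32_t)-1, "same value after conversion");
```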

-- 
Alexander Cherepanov
