Message-ID: <20121115132603.GG20323@brightrain.aerifal.cx>
Date: Thu, 15 Nov 2012 08:26:03 -0500
From: Rich Felker <dalias@...ifal.cx>
To: musl@...ts.openwall.com
Subject: Re: type of wchar_t

On Thu, Nov 15, 2012 at 04:36:31PM +0400, Yuri Kozlov wrote:
> > so we either use the __WCHAR_TYPE__ defined by the
> > compiler (when it's defined), or use the abi specs
> > (which gives the align+size+sign information and
> > hopefully compilers agree on a single int type when
> > there are multiple choices)
> 
> Thanks for the clarification.
> Hah, gcc defines __WCHAR_TYPE__ for arm as unsigned. Wow.
> $ arm-linux-gnueabi-gcc -dM -E - < /dev/null |grep __WCHAR_T
> #define __WCHAR_TYPE__ unsigned int

Yes. Whoever designed this aspect of the ARM EABI did not know what
they were doing. They probably came from a Windows background where
wchar_t is unsigned short (to be able to represent all of the Unicode
BMP) and did not realize that making it unsigned is unnecessary and
even harmful when it's 32-bit and thus able to store all of Unicode
(and much more) in a signed type.
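
To make the range argument concrete, here is a minimal C sketch. It checks at compile time that the largest Unicode code point fits in a signed 32-bit integer, and it shows one commonly cited hazard of unsigned character types, which is not necessarily the one meant above: the difference of two code points can never be negative. The function names and the UNICODE_MAX macro are hypothetical, and the sketch assumes int is 32 bits wide, so that uint32_t operands are not promoted to a signed type.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>   /* int32_t, uint32_t, INT32_MAX */

/* Largest Unicode code point, U+10FFFF (hypothetical macro name). */
#define UNICODE_MAX 0x10FFFF

/* All of Unicode fits in a signed 32-bit type with plenty of room left. */
static_assert(UNICODE_MAX <= INT32_MAX,
              "every code point is representable as int32_t");

/* Signed code points: a - b is negative when a sorts before b. */
int signed_diff_is_negative(int32_t a, int32_t b)
{
    return a - b < 0;
}

/* Unsigned code points: a - b wraps around modulo 2^32, so this test
 * is never true (assuming 32-bit int, so no promotion to signed). */
int unsigned_diff_is_negative(uint32_t a, uint32_t b)
{
    return a - b < 0;
}
```

Calling signed_diff_is_negative('a', 'b') yields 1, while unsigned_diff_is_negative('a', 'b') yields 0 because the subtraction wraps to a large positive value. A compiler may warn that the unsigned comparison is always false, which is the point of the example.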

As already explained, I wanted to just always use a signed type on
musl, but the element type of a wide string literal such as L"" is
chosen by the compiler, and it must match the wchar_t that the headers
define (otherwise, passing L"" to a function that expects wchar_t* is
a constraint violation and the compiler should issue a diagnostic).
So we need the definition to agree with whatever the compiler thinks
it is, and real-world compilers follow the EABI document that defines
it as unsigned.
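
The constraint can be sketched in C as follows. This is an illustration under stated assumptions, not musl's actual header: my_wchar_t, empty_wide and my_wchar_is_signed are hypothetical names, the fallback to int stands in for whatever an ABI document would specify, and the check relies on C11 _Generic and static_assert. __WCHAR_TYPE__ is the predefined macro shown in the quoted gcc output.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* the toolchain's own wchar_t, for comparison */

/* Hypothetical stand-in for the typedef a libc header would provide. */
#ifdef __WCHAR_TYPE__
typedef __WCHAR_TYPE__ my_wchar_t;   /* follow the compiler's choice */
#else
typedef int my_wchar_t;              /* assumed fallback; a real header
                                        would take this from the ABI spec */
#endif

/* The element type of L"" is fixed by the compiler. If the header's
 * typedef disagreed with it, this assertion would fail to compile. */
static_assert(_Generic(L""[0], my_wchar_t: 1, default: 0),
              "header typedef must match the type of wide literals");

/* Compiles without a diagnostic only because the types agree. */
const my_wchar_t *empty_wide(void)
{
    return L"";
}

/* Reports whether the chosen type is signed (1) or unsigned (0). */
int my_wchar_is_signed(void)
{
    return (my_wchar_t)-1 < 0;
}
```

On a toolchain that predefines __WCHAR_TYPE__, the typedef tracks the compiler automatically, so the same source gives a signed type on targets where the compiler uses int and an unsigned type on ARM EABI targets.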

Rich
