Message-ID: <20200513214931.GB21576@brightrain.aerifal.cx>
Date: Wed, 13 May 2020 17:49:32 -0400
From: Rich Felker <dalias@...ifal.cx>
To: Anders Magnusson <ragge@...d.ltu.se>
Cc: John Arnold <iohannes.eduardus.arnold@...il.com>,
	musl@...ts.openwall.com, pcc@...ts.ludd.ltu.se
Subject: Re: Re: [Pcc] PCC unable to build musl 1.2.0 (and
 likely earlier)

On Wed, May 13, 2020 at 10:31:31PM +0200, Anders Magnusson wrote:
> 
> 
> Den 2020-05-13 kl. 21:33, skrev Rich Felker:
> >On Wed, May 13, 2020 at 09:09:13PM +0200, Anders Magnusson wrote:
> >>Den 2020-05-13 kl. 16:30, skrev Rich Felker:
> >>>On Wed, May 13, 2020 at 09:10:40AM +0200, Anders Magnusson wrote:
> >>>>Den 2020-05-12 kl. 23:21, skrev Rich Felker:
> >>>>>Thanks. Adding pcc list to cc.
> >>>>>
> >>>>>On Tue, May 12, 2020 at 03:59:36PM -0500, John Arnold wrote:
> >>>>>>With an i386 PCC 1.2.0.DEVEL built from source from
> >>>>>>http://pcc.ludd.ltu.se/ftp/pub/pcc/pcc-20200510.tgz, I was unable to
> >>>>>>build an i386 musl 1.2.0. The compiler first hits this error:
> >>>>>>
> >>>>>>../include/limits.h:10: error: bad charcon
> >>>>>>
> >>>>>>This line was the only change made in commit cdbbcfb8f5d, but it has a
> >>>>>>lengthy commit message about the proper way of determining CHAR_MIN
> >>>>>>and CHAR_MAX.
> >>>>>I think this is clearly a PCC bug, one they can hopefully fix. The
> >>>>>commit message cites the example from 6.4.4.4:
> >>>>Can you please send me the offending line?
> >>>#if '\xff' > 0
> >>>
> >>Thanks, fixed now, it was a missing pushback of ' that was the problem.
> >>
> >>Note that this check cannot be used to see whether a target uses
> >>signed or unsigned char.
> >>In pcc the above is always true, no matter what char is.  See C11
> >>6.10.1 paragraph 4.
> >See the commit message for:
> >
> >https://git.musl-libc.org/cgit/musl/commit/include/limits.h?id=cdbbcfb8f5d748f17694a5cc404af4b9381ff95f
> >
> >There is good reason we changed this.
> >
> >I believe you're referring to the text:
> >
> >     "This includes interpreting character constants, which may involve
> >     converting escape sequences into execution character set members.
> >     Whether the numeric value for these character constants matches
> >     the value obtained when an identical character constant occurs in
> >     an expression (other than within a #if or #elif directive) is
> >     implementation-defined.168) Also, whether a single-character
> >     character constant may have a negative value is
> >     implementation-defined."
> >
> Actually, the ambiguous handling of negative values in #if is
> historical behaviour, and has nothing to do with EBCDIC.

I mean the 'z'-'a' differing between #if and if() is an EBCDIC
artifact. Indeed the sign thing is more likely motivated by differing
historical behaviors in a subtle corner case than by mixed charset
environments.

> It does not sound very good to rely on explicitly documented undefined
> behaviour IMHO,

It's not undefined. It's implementation-defined, and generally
implementation-defined means roughly psABI-defined, or in other words
"should match for all interoperable implementations". One way of
thinking about this as an "ABI" issue is that two object files compiled
by different compilers, each including a foo.h containing:

#if 'z'-'a'==25
#define func func1
#else
#define func func2
#endif

and one defining func and the other calling func, should successfully
link if the compilers are interoperable.
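
As a rough illustration of that consistency requirement, here is a
minimal sketch that records what the preprocessor decides for two
character-constant tests and compares it with what the compiler proper
computes for the same expressions. The names PP_ASCII_LIKE,
PP_CHAR_UNSIGNED and charcon_views_agree are made up for this example;
they are not part of musl or pcc.

```c
#include <assert.h>
#include <limits.h>

/* Preprocessor's view of the two character-constant tests. */
#if 'z' - 'a' == 25
#define PP_ASCII_LIKE 1
#else
#define PP_ASCII_LIKE 0
#endif

#if '\xff' > 0
#define PP_CHAR_UNSIGNED 1
#else
#define PP_CHAR_UNSIGNED 0
#endif

/* Hypothetical helper: returns 1 when #if and ordinary expressions
 * agree on both tests, 0 otherwise. */
static int charcon_views_agree(void)
{
    int cc_ascii_like = ('z' - 'a' == 25);
    int cc_char_unsigned = ('\xff' > 0);

    /* In an ordinary expression, '\xff' is a char value converted to
     * int, so with 8-bit char it is positive exactly when plain char
     * is unsigned. */
    assert(cc_char_unsigned == (CHAR_MIN == 0));

    return PP_ASCII_LIKE == cc_ascii_like
        && PP_CHAR_UNSIGNED == cc_char_unsigned;
}
```

A result of 0 would not mean the implementation is nonconforming,
since the standard leaves the match implementation-defined. It would
only mean that headers relying on such tests can disagree with the
code the compiler actually generates.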

> and this is actually the first time in the last 20
> years that someone has complained about it :-)

:-)

> It might be possible to change it (due to the "law of least
> surprise") but since cpp does not have any relation to the target
> architecture it needs some thinking. (cpp is the same even if
> multiple target backends are generated).

I'm pretty sure this is subtly wrong then because the signedness of
wchar_t varies by target, and while the *values* may be allowed to
vary, whether L'\0' has preprocessor type uintmax_t or intmax_t has to
match whether wchar_t is unsigned or signed.
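
A minimal sketch of how that could be probed, assuming the rule in C11
6.10.1 paragraph 4 that signed and unsigned integer types act as
intmax_t and uintmax_t inside #if: if the preprocessor treats L'\0' as
unsigned, then L'\0' - 1 wraps around to a huge positive value instead
of -1. The names PP_WCHAR_SIGNED, cc_wchar_signed and
wchar_sign_consistent are hypothetical.

```c
#include <wchar.h>

/* Preprocessor's view: with a signed wide character constant this is
 * (intmax_t)0 - 1 == -1; with an unsigned one it wraps to UINTMAX_MAX. */
#if L'\0' - 1 < 0
#define PP_WCHAR_SIGNED 1
#else
#define PP_WCHAR_SIGNED 0
#endif

/* Compiler's view: convert -1 to wchar_t and check whether it stays
 * negative. The cast avoids being fooled by integer promotion of a
 * narrow unsigned wchar_t in the subtraction. */
static int cc_wchar_signed(void)
{
    return (wchar_t)-1 < 0;
}

/* Returns 1 when cpp and the compiler agree on wchar_t signedness. */
static int wchar_sign_consistent(void)
{
    return PP_WCHAR_SIGNED == cc_wchar_signed();
}
```

A target-independent cpp that always picks one answer here would
return 0 on at least some targets, which is the mismatch described
above.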

Rich
