Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20210127154506.GH23432@brightrain.aerifal.cx>
Date: Wed, 27 Jan 2021 10:45:06 -0500
From: Rich Felker <dalias@...ifal.cx>
To: Andrey Melnikov <temnota.am@...il.com>
Cc: musl@...ts.openwall.com
Subject: Re: Re: move __BYTE_ORDER definition to alltypes.h

On Wed, Jan 27, 2021 at 11:12:41AM +0300, Andrey Melnikov wrote:
> ср, 27 янв. 2021 г. в 00:40, Rich Felker <dalias@...ifal.cx>:
> >
> > On Wed, Jan 27, 2021 at 12:29:05AM +0300, Andrey Melnikov wrote:
> > > вт, 26 янв. 2021 г. в 20:23, Rich Felker <dalias@...ifal.cx>:
> > > >
> > > > On Tue, Jan 26, 2021 at 02:55:09PM +0300, Andrey Melnikov wrote:
> > > > > Hi.
> > > > >
> > > > > Your commit 97d35a552ec5b6ddf7923dd2f9a8eb973526acea leads to
> > > > > miscompile programs which rely on one of defines __LITTLE_ENDIAN or
> > > > > __BIG_ENDIAN.
> > > > > Now, both unconditionally  defined when included stdarg.h and programs
> > > > > which define __(BIG|LITTE)_ENDIAN itself - miscompiled. linux kernel
> > > > > for example - it internally uses #if defined __BIG_ENDIAN and defines
> > > > > it only  for BIGENDAIN arches.
> > > > >
> > > > > Any ideas?
> > > >
> > > > The conditionally-defined macros that on some archs tell you the
> > > > endianness are __BIG_ENDIAN__ and __LITTLE_ENDIAN__ (note the final
> > >
> > > Yes, defined by compiller.
> > >
> > > > __) or other arch-specific macros. __BIG_ENDIAN and __LITTLE_ENDIAN
> > > > (without the final __) have always been the possible values for
> > > > __BYTE_ORDER from endian.h.
> > >
> > > >From <endian.h> - its correct headers for this, not from <stdarg.h>.
> > > <stdarg.h> is for va_* macros.
> > >
> > > > In any case, all of these are in the
> > > > reserved namespace and should not be defined by applications or
> > > > inspected in any way other than a manner documented by the
> > > > implementation.
> > >
> > > Where is it reserved?
> >
> > 7.1.3 Reserved identifiers
> >
> > "All identifiers that begin with an underscore and either an uppercase
> > letter or another underscore are always reserved for any use."
> 
> > > > How did this come up with the Linux kernel? AFAIK it uses -nostdinc
> > > > and should not see the libc headers at all. But if that #ifdef is
> > > > present in Linux it's probably a bug since it's contrary to all
> > > > historical use of __BIG_ENDIAN...
> > >
> > > Easy. Build an out-of-tree module that includes <linux/kernel.h> which
> > > is include <stdarg.h> which is define all variant _ENDIAN macros,
> > > later it include <asm/byteorder.h> which is include
> > > <linux/byteorder/little_endian.h> in my case,
> > > in <linux/byteorder/little_endian.h> (read
> > > uapi/linux/byteorder/little_endian.h) we have defines:
> >
> > This also requires -nostdinc. The libc headers are not suitable for
> > compiling kernel code. But in this case they should not be breaking
> > anything.
> 
> Kernel need va_* macros and definitions for printk(). And <stdarg.h>
> under glibc does not exist, it comes from gcc/llvm distribution
> includes.

Again, you cannot use libc headers for building the kernel or kernel
modules. musl is not glibc. It uses a BSD-like include order and
defines all of the headers itself rather than using a mix of gcc
headers and libc-provided ones. Add -nostdinc and -I the gcc header
path (like the kernel's own build system does) or you will hit obscure
problems like this again even if changes are made to the endian macros
that make your immediate problem go away.

> > > #ifndef __LITTLE_ENDIAN
> > > #define __LITTLE_ENDIAN 1234
> > > #endif
> > > #ifndef __LITTLE_ENDIAN_BITFIELD
> > > #define __LITTLE_ENDIAN_BITFIELD
> > > #endif
> > >
> > > and later, include <linux/etherdevice.h> in module for
> > > is_multicast_ether_addr() function:
> > >
> > > static inline bool is_multicast_ether_addr(const u8 *addr)
> > > {
> > > #if defined(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS)
> > >         u32 a = *(const u32 *)addr;
> > > #else
> > >         u16 a = *(const u16 *)addr;
> > > #endif
> > > #ifdef __BIG_ENDIAN
> > >         return 0x01 & (a >> ((sizeof(a) * 8) - 8));
> > > #else
> > >         return 0x01 & a;
> > > #endif1234
> > > }
> >
> > This code is just wrong. It should be #if __BYTE_ORDER == __BIG_ENDIAN
> This code works under glibc/musl-1.1.x and now - broken.

This would be a regression if it were doing something violating the
reserved namespace, but it's not. musl does not aim for
bug-compatibility, and if any application entitled to use libc headers
(as opposed to the kernel) were doing sketchy things with the
__BIG_ENDIAN macro, it would be an application bug.

But since it has potential for silent wrong-code generation it may be
appropriate to make changes here anyway.

> > or #ifdef __BIG_ENDIAN__. Not #ifdef __BIG_ENDIAN. Where did the code
> > come from?
> https://github.com/torvalds/linux/blob/2ab38c17aac10bf55ab3efde4c4db3893d8691d2/include/linux/etherdevice.h#L116

Thanks. A quick grep -r __BIG_ENDIAN include/linux shows a good deal
of inconsistent usage of this macro (some places with #if __BYTE_ORDER
== ..., some with #ifdef). It's not unconditionally wrong for them to
use the macro in a different way than libc does, since from the
standpoint of building the kernel, Linux is "the implementation" and
is the one that owns the entire reserved namespace (apart from what it
assumes the compiler might be defining). But it is wrong to mix libc
headers in such a context, for this and other reasons.

It's not urgent but I think the inconsistency should probably be
raised with the kernel folks at some point. It's error-prone and
confusing to readers who are used to both endian macros being defined
as possible values for __BYTE_ORDER rather than one.

> > Whoever wrote it misunderstood the meanings of these macros.
> > Even within the kernel they're not intended to be used this way.
> Kernel code does not expect random defines coming from random headers.
> Remove #include <bits/alltypes.h> from stdarg.h - it's simply not
> needed here. Nothing in this header need to know types/endianness. Or

You cannot remove bits/alltypes.h or va_list will not be defined.

> remove this header completely - this compiler built-in include.

is not compatible with musl.

What possibly can be done is removing the __BIG_ENDIAN and
__LITTLE_ENDIAN macros and just using the values 1234 and 4321
explicitly. I don't see any strong reason not to do that, and it's
probably what I had in an earlier draft of the relevant changes.

The reason __BYTE_ORDER is defined here (alltypes.h) is that it's
necessary to correctly define some of the types that alltypes.h is
capable of exposing. The alltypes.h mechanism itself is used to avoid
redundant definitions and guard logic against multiple definition for
types which are exposed by more than one header. va_list is such a
type because stdio.h, wchar.h, and syslog.h expose it too.

Rich

Powered by blists - more mailing lists

Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.