Message-ID: <20260513110656.GA3520958@port70.net>
Date: Wed, 13 May 2026 13:06:56 +0200
From: Szabolcs Nagy <nsz@...t70.net>
To: Luca Kellermann <mailto.luca.kellermann@...il.com>
Cc: Rich Felker <dalias@...c.org>, musl@...ts.openwall.com
Subject: Re: musl multi-level table format for binary locale images

* Luca Kellermann <mailto.luca.kellermann@...il.com> [2026-05-13 05:07:41 +0200]:
> On Tue, May 12, 2026 at 07:09:32PM -0400, Rich Felker wrote:
> > [...]
> > 
> > The code to perform lookups is not yet merged much less hooked up to
> > any test framework, but I'm attaching a draft to this email. It needs
> > to be pointed at the start of the actual table (after the 16-byte file
> > header).
> > 
> > [...]
> > 
> > static unsigned get32(const char *b0)
> > {
> > 	const unsigned char *b = (const void *)b0;
> > 	return (b[0]<<24) | (b[1]<<16) | (b[2]<<8) | b[3];
> > }
> 
> b[0] is promoted to int before shifting so a bit is shifted into the
> sign position (UB) if b[0] > 0x7f.

yeah, this is annoying to do in c. i thought it was fixed in c23.
my preferred way is to convert the bytes to unsigned explicitly
before shifting, so no promotion to int happens:
static unsigned get32(const void *p)
{
	const unsigned char *u = p;
	unsigned a=u[0], b=u[1], c=u[2], d=u[3];
	return a<<24 | b<<16 | c<<8 | d;
}

btw if this is a mappable format, wouldn't a little-endian
representation be better for most cpus? then get32 can be optimized
to a single load (nowadays even unaligned loads are efficient, so
compilers emit them).
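
a minimal sketch of what i mean, assuming the table stored its 32-bit
fields little-endian. the names get32le and get32_native are made up
for illustration and are not part of the draft. get32le is the
portable form; get32_native shows the plain load it would be expected
to reduce to on a little-endian target, and it gives the same value
only on such hosts:

```c
#include <stdint.h>
#include <string.h>

/* portable little-endian read (hypothetical name): each byte is
 * widened to uint32_t before shifting, so nothing is ever shifted
 * into a sign bit */
static uint32_t get32le(const void *p)
{
	const unsigned char *u = p;
	uint32_t a=u[0], b=u[1], c=u[2], d=u[3];
	return a | b<<8 | c<<16 | d<<24;
}

/* native-order read (hypothetical name): memcpy has no alignment
 * requirement, so p may point anywhere in the mapped image. the
 * result matches get32le only on little-endian hosts */
static uint32_t get32_native(const void *p)
{
	uint32_t x;
	memcpy(&x, p, sizeof x);
	return x;
}
```

whether a given compiler really folds get32le into one load depends on
the target and optimization level, so that is worth checking in the
generated code rather than assuming.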
