Message-ID: <20260430182159.GT1827@brightrain.aerifal.cx>
Date: Thu, 30 Apr 2026 14:22:00 -0400
From: Rich Felker <dalias@...c.org>
To: Pablo Correa Gomez <pabloyoyoista@...tmarketos.org>
Cc: musl@...ts.openwall.com
Subject: Re: Updated dumplocale/source format [Re: Selecting locale
 source format]

On Thu, Apr 30, 2026 at 07:54:25PM +0200, Pablo Correa Gomez wrote:
> El Mon, 20-04-2026 a las 13:44 +0200, Pablo Correa Gomez escribió:
> > > 
> > I have used the provided dumplocale.c file to transform the current locales
> > into the new source format. It can all be found in
> > https://gitlab.postmarketos.org/postmarketOS/musl-locales
> > Generally the whole thing was pretty straightforward, and it has already
> > made it possible to fix the infamous "May" bug:
> > https://gitlab.postmarketos.org/postmarketOS/musl-locales/-/commit/374ea7d0164efcf1bc1f14701b1333a943837bd7
> > 
> > Of course, the "May" bug is still present in all translations (except
> > Spanish, my native language, which I have fixed manually), and things like
> > the differentiation between H_ and H0 are not there either, since they were
> > not there in the previous translations.
> > 
> > I will start poking translators about this, to see if we find any issues
> > that we didn't find earlier.
> > 
> > Best,
> > Pablo
> > > 
> We've gotten quite a bit of good feedback from the translators already. So
> far, two questions have come up as the most salient ones:
> 
> First, documentation on the keys to translate. I found good documentation for
> the standard POSIX keys in [1]; it would be good to know whether that's a good
> authoritative source. However, I could not find such documentation for the
> error codes in the LC_MESSAGES section.

For the errno codes, the POSIX text is in XSH 2.3:

https://pubs.opengroup.org/onlinepubs/9799919799/functions/V2_chap02.html#tag_16_03

For gai_strerror:

https://pubs.opengroup.org/onlinepubs/9799919799/basedefs/netdb.h.html

The h_errno codes are not documented in POSIX anymore because the
interfaces were deprecated and removed something like 2 versions ago.
However, their contexts are basically the same as the corresponding
EAI_* codes.

For regerror codes:

https://pubs.opengroup.org/onlinepubs/9799919799/basedefs/regex.h.html

> Most specifically, a translator had a question
> about EAI_OVERFLOW, which in English is just "Overflow". Regardless of whether
> the English string needs to be improved, it would be nice to have a good
> authoritative source for those. Everything I could find was manual pages for
> getnameinfo.

Yes, EAI_OVERFLOW only occurs for getnameinfo where the caller has
passed a buffer too small to fit the reverse-resolved hostname (or the
numeric IP literal string). Our current English text "Overflow" is
rather unhelpful. We should probably choose something more descriptive
to help both users and translators (although it's really not intended
to be an error code exposed to users; the intent is that applications
retry with a larger buffer).
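
To make the intended usage concrete, here is a rough sketch of that
retry pattern. The helper names (name_for_addr, loopback_numeric), the
deliberately tiny starting size, and the 4096-byte cap are all
assumptions chosen for illustration, not anything from musl.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <netdb.h>
#include <netinet/in.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: returns a malloc'd host string, or NULL with
 * *err set to the getnameinfo result. Grows the buffer only when the
 * failure is EAI_OVERFLOW. */
char *name_for_addr(const struct sockaddr *sa, socklen_t salen,
                    int flags, int *err)
{
    size_t cap = 4;      /* deliberately too small, to force a retry */
    char *buf = NULL;

    assert(sa != NULL && err != NULL);
    for (;;) {
        char *tmp = realloc(buf, cap);
        if (!tmp) {
            free(buf);
            *err = EAI_MEMORY;
            return NULL;
        }
        buf = tmp;
        *err = getnameinfo(sa, salen, buf, (socklen_t)cap, NULL, 0, flags);
        if (*err == 0)
            return buf;
        if (*err != EAI_OVERFLOW || cap >= 4096) {  /* arbitrary cap */
            free(buf);
            return NULL;
        }
        cap *= 2;        /* buffer was too small: retry with a larger one */
    }
}

/* Example use: numeric form of the IPv4 loopback address (no DNS). */
char *loopback_numeric(int *err)
{
    struct sockaddr_in sin;

    memset(&sin, 0, sizeof sin);
    sin.sin_family = AF_INET;
    sin.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    return name_for_addr((const struct sockaddr *)&sin,
                         (socklen_t)sizeof sin, NI_NUMERICHOST, err);
}
```

With a loop like this the "Overflow" message never reaches the user,
which is why the string mostly matters for diagnostics and translators.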

> Second, I was asked about the first day of the week. It seems that is a glibc
> extension that can be queried by passing "_NL_TIME_FIRST_WEEKDAY" to
> nl_langinfo[2]. Although this is obviously not in POSIX, I really wonder why
> not, when it's something clearly tied to the system language. Indeed, I dug a
> bit into it, only to realize that it works on my system because the toolkit I
> use the most has an ugly hack[3] working around this limitation. I would be
> quite interested in making sure that the source and binary formats have
> enough flexibility to support such a use case in the future, should POSIX be
> updated to cover it.

The format is certainly flexible enough to be able to support
extensions like this in the future. We'd have to choose a keyword for
it in the source file, but the numeric value of the symbolic constant
_NL_TIME_FIRST_WEEKDAY just becomes the path component in the binary
file.

Unfortunately, the situation on whether these are intended to be
public interfaces even in glibc is kinda unclear. They're prefixed
with _ which suggests they're not intended to be public, and I can't
find any documentation suggesting they intend you to use them except
for the enums in the header.
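
For reference, this is roughly what querying it looks like on glibc.
Treat it as a sketch resting on the assumption that the value comes
back as a single byte in the returned string, which is how the callers
I've seen read it; the function name raw_first_weekday is made up, and
on anything that isn't glibc it just reports -1.

```c
#define _GNU_SOURCE
#include <langinfo.h>

/* Hypothetical helper: raw first-weekday byte from the current locale,
 * or -1 where the glibc-specific item is not available. */
int raw_first_weekday(void)
{
#ifdef __GLIBC__
    /* Underscore-prefixed glibc enum member; not POSIX, and not clearly
     * documented as a public interface. */
    const char *s = nl_langinfo(_NL_TIME_FIRST_WEEKDAY);
    return s ? (unsigned char)s[0] : -1;
#else
    return -1;
#endif
}
```

The byte is an offset relative to a separate week-origin item rather
than an absolute day, so callers have to combine the two, which is part
of why this doesn't look like a finished public interface.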

If we can resolve that, getting consensus with other implementations
on what's meant to be exposed, I'm not opposed to adding some
extensions like this. It would be helpful then to have the assistance
of folks with localization experience to evaluate what makes sense to
support. But since the binary format is sufficiently flexible to
express these if we want to, I don't think we need to treat making any
decisions here as in-scope for the current project. Just confirming
that we have the flexibility to add things like this as-needed
satisfies the goal of having something future-proof.

Rich
