Message-ID: <20140722203540.GA11570@brightrain.aerifal.cx>
Date: Tue, 22 Jul 2014 16:35:40 -0400
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: Re: Locale bikeshed time

On Tue, Jul 22, 2014 at 10:10:08PM +0200, u-igbb@...ey.se wrote:
> A musl-specific variable name would be a better/cleaner choice.

One question is whether this is really musl-specific or specific to a
locale scheme that could be used outside of musl too. However, either
way it's probably appropriate for the variable to be musl-specific.
Having one variable configure multiple things is usually error-prone
and inflexible.

> > The second issue is how locale categories are split up. Glibc has each
> > category in a separate file, except for the "locale-archive" file
> > which stores everything in one file for easy mapping. My leaning so
> 
> By the way, please do not follow the way of a single big file.
> For systems which rely on file boundaries to reflect data clustering
> (i.e. which data is most probable to be used together) it is very useful
> to let the files correspond to the data structure. Otherwise some cheap
> and efficient distributed data access optimizations become impossible.

I hadn't even considered this aspect, but I think the whole concept of
a single big file is undesirable with data that's naturally subject to
change over time, and where the data comes from multiple sources. So I
wasn't really considering that option anyway.

> > far is to put the whole locale -- time format and translations,
> > message translations, ... in a single file. This avoids the need for
> > multiple mappings (and syscall overhead, and vma overhead, ...) if
> > you're using the same value for all categories. But on the other hand,
> > if you wanted to have lots of subtle variants of a locale, you might
> > end up with largely-duplicate files on disk. Fortunately I think
> > they'll all be very small anyway so this may not matter.
> 
> I actually do mix categories from different locales.
> No problem as long as the files are small.

Note that if you're just mixing "ll_TT" and "C", there wouldn't be any
cost anyway, since the C locale (and its aliases) is built in and never
loaded from a file. Where I was thinking you might see duplication is
with things like LC_ALL=ll_TT@...ifier, where the modifier is really just
an alternate for one category (e.g. ISO date format for time, alt
collation order, etc.) but the file ends up storing duplicates of all
the data from the other categories. However, I think the preferred
usage here would instead be to provide a file for just the category
being overridden, one that does not contain the base data, and to
require users to set the individual categories, like what you're doing,
e.g.

LANG=ll_TT LC_TIME=ll_TT@...date

rather than:

LC_ALL=ll_TT@...date
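
To illustrate why the two settings above behave differently, here is a
minimal sketch of the standard POSIX precedence for a single category:
LC_ALL wins if set, then the category's own variable, then LANG, then
the builtin "C" locale. The function names pick_locale and
effective_locale are hypothetical, chosen for this example; this is not
musl's actual lookup code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: choose the locale name for one category from
 * the three candidate values, in POSIX precedence order. NULL or
 * empty strings are treated as unset. */
const char *pick_locale(const char *lc_all, const char *lc_cat,
                        const char *lang)
{
    if (lc_all && *lc_all) return lc_all;  /* overrides everything */
    if (lc_cat && *lc_cat) return lc_cat;  /* e.g. value of LC_TIME */
    if (lang && *lang) return lang;        /* default for all categories */
    return "C";                            /* builtin, no file needed */
}

/* Hypothetical wrapper: resolve a category such as "LC_TIME" from the
 * process environment. */
const char *effective_locale(const char *category_var)
{
    assert(category_var != NULL);
    return pick_locale(getenv("LC_ALL"), getenv(category_var),
                       getenv("LANG"));
}
```

With LANG=ll_TT and LC_TIME set to a variant (say "ll_TT@mod", a
placeholder modifier name), only the time category resolves to the
variant, so a file for that variant would only need time data. With
LC_ALL set to the variant, every category resolves to it, which is the
case where the file would have to duplicate all the base data.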

Rich
