Message-ID: <20140725090649.GN16795@example.net>
Date: Fri, 25 Jul 2014 11:06:49 +0200
From: u-igbb@...ey.se
To: musl@...ts.openwall.com
Subject: Re: Locale bikeshed time
On Thu, Jul 24, 2014 at 06:02:28PM -0400, Rich Felker wrote:
> first you would need an idea of what some "non-language" category
> values might be. I can think of some for LC_COLLATE, though I'm not
> sure how valuable many of them are:
>
> - UCA default tables
> - UTF-16 code unit order
> - Case-insensitive Unicode codepoint order
I can hardly give any opinion on their importance.
> For the other categories, examples seem much harder to find.
> LC_MESSAGES is inherently a language-based category, but perhaps you
> could have a locale that eliminates verbose natural-language messages
> and replaces them with C/POSIX identifiers (e.g. printing ENOENT
> instead of "No such file or directory") conveying the meaning. (Or we
> could be somewhat radical and replace all the internal strerror
> messages like this and require LC_MESSAGES=en to get them back.) I'm
I like this, for clarity, for conciseness and for making it as neutral
as possible (ENOENT of course stems from English, but that is no worse
than the keywords of C itself).
> LC_MONETARY, most if not all of the data really corresponds to a
> political unit context, not a language, so in principle it might make
> sense to have locales just for LC_MONETARY that aren't associated with
> a language, but I can't see that being a convenient or reasonable
> design in practice...
Indeed, LC_MONETARY has basically nothing to do with language.
If I could choose, I would not let LANG imply LC_MONETARY
(in other words, I would skip LC_MONETARY in language-based locale
definitions).
Returning to the naming. As language-based locales are named
after languages, it would be nice to name other kinds of locale
data after their "natural association" too. Then politically-bound
data could be put into the corresponding "territorial" family:
    language        ll[l][_TT]
    territory       TT[_ll[l]]
And if we find something that does not feel reasonable to connect
to either a language or a territory, we can do
    special cases   @<specialcase>
    [or             ZZ@<specialcase>  ("no territory")
     or             zxx@<specialcase> ("no language"),
     but the shorter and simpler form is preferable]
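
A rough sketch of how names under this scheme could be told apart, in
case it helps the discussion. The function name, the regular expressions
and the "unknown" fallback are my own assumptions for illustration; none
of this is existing musl code.

```python
import re

# Hypothetical patterns for the proposed name families.
LANGUAGE = re.compile(r"[a-z]{2,3}(_[A-Z]{2})?")    # ll[l][_TT]
TERRITORY = re.compile(r"[A-Z]{2}(_[a-z]{2,3})?")   # TT[_ll[l]]


def classify_locale_name(name):
    """Return (family, base, modifier) for a locale name.

    family is "language", "territory", "special" or "unknown".
    """
    base, _, modifier = name.partition("@")
    modifier = modifier or None
    # "@x", "ZZ@x" and "zxx@x" are all spellings of a special case.
    if base in ("", "ZZ", "zxx") and modifier:
        return ("special", base, modifier)
    if LANGUAGE.fullmatch(base):
        return ("language", base, modifier)
    if TERRITORY.fullmatch(base):
        return ("territory", base, modifier)
    # Anything else (for example "C") is outside the scheme.
    return ("unknown", base, modifier)
```

With this, "sv" and "sv_SE" come out as language-based, "SE" and "SE_sv"
as territorial, and "@iso8601" as a special case.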
The expected mode of usage would be like

    LANG=de LC_MONETARY=EU
or
    LANG=sv LC_MONETARY=SE
or
    LANG=eo@...8601 LC_MONETARY=US@...4217

which would in every case access two locale data files of different
classes, clearly visible in the naming.
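
To make the "two files" point concrete, here is a small sketch of the
lookup. It uses the usual POSIX precedence (LC_ALL, then the category
variable, then LANG, with empty values treated as unset). The
lang_implies_monetary switch, the fallback to "C" and the idea that "C"
needs no file are assumptions of mine for illustration only.

```python
CATEGORIES = ("LC_COLLATE", "LC_CTYPE", "LC_MESSAGES",
              "LC_MONETARY", "LC_NUMERIC", "LC_TIME")


def resolve_categories(env, lang_implies_monetary=True):
    """Map each locale category to the locale name it would use."""
    resolved = {}
    for cat in CATEGORIES:
        name = env.get("LC_ALL") or env.get(cat)
        if not name:
            if cat == "LC_MONETARY" and not lang_implies_monetary:
                # Assumption: a skipped category falls back to "C".
                name = "C"
            else:
                name = env.get("LANG") or "C"
        resolved[cat] = name
    return resolved


def files_accessed(env, lang_implies_monetary=True):
    """Distinct locale names needing a data file (assumes "C" is built in)."""
    names = set(resolve_categories(env, lang_implies_monetary).values())
    return sorted(names - {"C"})
```

For LANG=de LC_MONETARY=EU this gives the two names "EU" and "de", one
from each family.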
The ISO date format would actually be a good candidate for a standalone
"@iso8601", but it can just as well live inside the C locale.
Then the last example above might look like

    LANG=eo LC_TIME=@...8601 LC_MONETARY=US@...4217

at the expense of a third file to be accessed
or rather

    LANG=eo LC_TIME=C LC_MONETARY=US@...4217

What do you think about such a naming convention and usage mode?
Rune