Message-ID: <20140724220228.GB4038@brightrain.aerifal.cx>
Date: Thu, 24 Jul 2014 18:02:28 -0400
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: Re: Locale bikeshed time

On Thu, Jul 24, 2014 at 10:15:48PM +0200, u-igbb@...ey.se wrote:
> On Thu, Jul 24, 2014 at 12:01:50PM -0400, Rich Felker wrote:
> > I just meant that language-based locales should match the pattern:
> >
> > ^[[:lower:]]{2,3}(_[[:upper:]]{2})?([[:punct:]].*)?$
> >
> > assuming I didn't make any stupid mistakes in writing that regex. And
> > non-language-based locales should not match this pattern.
>
> I feel it would be somewhat more robust if we'd have a positive
> definition for "the second class" of locale data, just in case we one
> day discover that we want to differently handle, say, three classes (?)
>
> A negative definition gives also very little guidance for the actual
> naming and in the worst case may lead to misunderstanding when multiple
> parties are involved.
>
> Why not make such a worst case less probable by a somewhat more strict
> naming rule?
> Possibly also defining "non-language-based" in a positive way?
>
> This is just a thought. I have no actual proposal as I do not have a
> good mental picture of which kinds of "non-language-based" definitions
> exist or should exist and how they are being used or might/should be used.

This is a reasonable sentiment, but do you have a proposal? I think
first you would need an idea of what some "non-language" category
values might be. I can think of some for LC_COLLATE, though I'm not
sure how valuable many of them are:

- UCA default tables
- UTF-16 code unit order
- Case-insensitive Unicode codepoint order

For the other categories, examples seem much harder to find.
LC_MESSAGES is inherently a language-based category, but perhaps you
could have a locale that eliminates verbose natural-language messages
and replaces them with C/POSIX identifiers (e.g. printing ENOENT
instead of "No such file or directory") conveying the meaning. (Or we
could be somewhat radical and replace all the internal strerror
messages like this and require LC_MESSAGES=en to get them back.)

I'm not sure if there would be interesting LC_TIME locales not
associated with a language (since LC_TIME has to offer day/month
names). And for LC_MONETARY, most if not all of the data really
corresponds to a political unit context, not a language, so in
principle it might make sense to have locales just for LC_MONETARY
that aren't associated with a language, but I can't see that being a
convenient or reasonable design in practice...

Rich
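
A minimal sketch of how the pattern quoted above could be applied to
locale names, assuming the POSIX regcomp/regexec interface with
extended syntax. The helper name is_language_locale is hypothetical
and is not part of musl; the character classes are evaluated in the
program's current locale, which is "C" unless setlocale is called.

```c
#include <regex.h>
#include <stddef.h>

/* Pattern quoted in the message above (POSIX extended regex syntax). */
static const char LANG_LOCALE_PATTERN[] =
    "^[[:lower:]]{2,3}(_[[:upper:]]{2})?([[:punct:]].*)?$";

/*
 * Hypothetical helper, not part of musl.
 * Returns 1 if name matches the language-based pattern,
 * 0 if it does not, and -1 if the pattern fails to compile.
 */
int is_language_locale(const char *name)
{
    regex_t re;
    int rc;

    /* REG_NOSUB: only a yes/no answer is needed, no match offsets. */
    if (regcomp(&re, LANG_LOCALE_PATTERN, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;

    rc = regexec(&re, name, 0, NULL, 0);
    regfree(&re);

    return rc == 0 ? 1 : 0;
}
```

With this sketch, names such as "en", "en_US.UTF-8" and "de_DE@euro"
match, while "C", "POSIX" and "english" do not. Because '_' is itself
a member of [[:punct:]], the last optional group also accepts names
like "en_us", and any two- or three-letter lowercase name matches
regardless of whether it denotes a language, so a non-language locale
would have to be named so as to fall outside the pattern.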