Message-ID: <20140727032758.GT4038@brightrain.aerifal.cx>
Date: Sat, 26 Jul 2014 23:27:58 -0400
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: Re: Call for locales maintainer & contributors

On Sat, Jul 26, 2014 at 11:27:38PM +0200, Wermut wrote:
> Hi
>
> I don't like the idea of an entirely new tree of locale data written
> from scratch. Glibc has one (with a lot of unmaintained data) and then
> there is also the CLDR repository which aims to be the central source
> for such data, maintained by unicode. The CLDR data is also used as a
> basis for the Microsoft and Apple locale files and is often maintained
> by national language experts. What I could offer is an effort to write
> some magic code that imports the actual CLDR data and converts the
> relevant information to the musl formatted ones. The CLDR data is
> freely available from: http://cldr.unicode.org/index/downloads

I have no objection to using data from CLDR if there's no restrictive
license, but at first glance it looks like most of the data is outside
the scope of the C/POSIX locale system. What we need is:

1. Weekday and month names (full and abbreviated) - these should
   almost certainly be available from CLDR or other public sources.

2. Time format strings for strftime - unless CLDR has C-oriented data
   like that, these might not be available in a form that's easy to
   automatically adapt. Research on this topic is welcome.

3. Regexes for yes and no responses - seems unlikely to be in CLDR,
   but again I'd be happy for someone to prove me wrong.

4. Translations of the message strings in libc. Note that musl's
   strings already deviate some from the legacy strings used on glibc
   and other systems. For example the strerror strings are adjusted to
   align more closely with the POSIX description and the actual
   situations they arise in than the legacy strings (like "Not a
   typewriter"). I'd like to aim to have our translated strings
   equally modernized. And before really spending a lot of work on
   these we should review the English strings again for possible
   improvements and missing messages (I think some newer error codes
   may be missing).

5. Collation rules - these almost certainly can come from Unicode/CLDR
   but musl does not even support collation yet.

6. Monetary formatting and currency names - these almost certainly can
   come from CLDR or other public sources, but again the code to use
   the data isn't there yet.

> Contribution is not completely open, but you normally interested
> people get access if they want to. I got mine within a week.
>
> This is only a suggestion open to discussion. What do you guys think about it?

Overall I like it. But I think we still need a maintainer to manage
pulling the data, maintaining string translations for messages, etc.

Any comments on my items 1-6 above?

Rich
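
For reference, the data in items 1-3 is what a program reads back
through the POSIX nl_langinfo() interface, so any import from CLDR
would ultimately have to supply these items. Below is a minimal
sketch, assuming a POSIX system with <langinfo.h> and <regex.h>. The
helper names (matches_item, classify_reply, print_time_items) are
hypothetical, made up for illustration, and are not part of musl.

```c
#include <langinfo.h>
#include <regex.h>
#include <stdio.h>

/* Hypothetical helper: does `reply` match the extended regex stored
 * in the given locale item (YESEXPR or NOEXPR)? Returns 1 or 0. */
static int matches_item(nl_item item, const char *reply)
{
    regex_t re;
    int hit;

    if (regcomp(&re, nl_langinfo(item), REG_EXTENDED | REG_NOSUB) != 0)
        return 0; /* treat an uncompilable pattern as "no match" */
    hit = (regexec(&re, reply, 0, NULL, 0) == 0);
    regfree(&re);
    return hit;
}

/* Item 3: classify a response using the current locale's yes/no
 * regexes. Returns 1 for yes, 0 for no, -1 for neither. */
int classify_reply(const char *reply)
{
    if (matches_item(YESEXPR, reply))
        return 1;
    if (matches_item(NOEXPR, reply))
        return 0;
    return -1;
}

/* Items 1 and 2: print a sample of the names and the strftime
 * format strings that the current locale provides. */
void print_time_items(FILE *out)
{
    fprintf(out, "DAY_1=%s ABDAY_1=%s\n",
            nl_langinfo(DAY_1), nl_langinfo(ABDAY_1));
    fprintf(out, "MON_1=%s ABMON_1=%s\n",
            nl_langinfo(MON_1), nl_langinfo(ABMON_1));
    fprintf(out, "D_T_FMT=%s\n", nl_langinfo(D_T_FMT));
    fprintf(out, "D_FMT=%s T_FMT=%s\n",
            nl_langinfo(D_FMT), nl_langinfo(T_FMT));
}
```

A program would normally call setlocale(LC_ALL, "") before using
these helpers; without that call it runs in the "C" locale, where
POSIX specifies the yes/no patterns as "^[yY]" and "^[nN]".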