Message-ID: <20140726060411.GA20089@brightrain.aerifal.cx>
Date: Sat, 26 Jul 2014 02:04:11 -0400
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: nl_langinfo and .mo-based locale files

While working on getting the locale changes ready to commit, I noticed
that some of the strings nl_langinfo uses are "problematic" to pass to
a gettext-style translation function:

static const char c_time[] =
	"Sun\0" "Mon\0" "Tue\0" "Wed\0" "Thu\0" "Fri\0" "Sat\0"
	"Sunday\0" "Monday\0" "Tuesday\0" "Wednesday\0"
	"Thursday\0" "Friday\0" "Saturday\0"
	"Jan\0" "Feb\0" "Mar\0" "Apr\0" "May\0" "Jun\0"
	"Jul\0" "Aug\0" "Sep\0" "Oct\0" "Nov\0" "Dec\0"
	"January\0"   "February\0" "March\0"    "April\0"
	"May\0"       "June\0"     "July\0"     "August\0"
	"September\0" "October\0"  "November\0" "December\0"
	"AM\0" "PM\0"
	"%a %b %e %T %Y\0"
	"%m/%d/%y\0"
	"%H:%M:%S\0"
	"%I:%M:%S %p\0"
	"\0"
	"%m/%d/%y\0"
	"0123456789\0"
	"%a %b %e %T %Y\0"
	"%H:%M:%S";

In particular, LC_TIME has a few duplicates (these could probably be
fixed), and one empty string (particularly ugly to handle).

One idea for dealing with the issue is simply having a single
translation for the whole "multi-string" above. The key could be
either "LC_TIME" or similar, or c_time itself (which, read as an
ordinary C string, reduces to "Sun", so there's no concern that the
key might change). This latter approach would be clean at the source
level since we could just pass c_time directly to the translation
function and get it back to use (with all the multi-string components)
if there's no translation.
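
To illustrate the single-key idea, here is a minimal sketch. The names
c_time_demo, de_time_demo and demo_translate are hypothetical stand-ins,
not musl code, and the "translation" is a hard-coded blob rather than a
real .mo lookup. It only shows that the key seen by a string-keyed lookup
is "Sun", and that the untranslated multi-string comes back intact.

```c
#include <string.h>

/* Shortened stand-in for the c_time multi-string (assumption: the real
 * table is much longer, but the layout is the same). */
static const char c_time_demo[] = "Sun\0" "Mon\0" "Tue";

/* Hypothetical translated blob with components in the same fixed order. */
static const char de_time_demo[] = "So\0" "Mo\0" "Di";

/* Hypothetical gettext-style lookup. Keys are compared as ordinary C
 * strings, so only the bytes up to the first NUL ("Sun") take part.
 * When nothing matches, the msgid pointer itself is returned, so the
 * caller still has the complete original multi-string. */
static const char *demo_translate(const char *msgid, int have_catalog)
{
	if (have_catalog && !strcmp(msgid, "Sun"))
		return de_time_demo;
	return msgid;
}
```

With no catalog, demo_translate(c_time_demo, 0) returns c_time_demo
itself, so every later component is still reachable from the result.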

Does anyone object to that and think it's too ugly/hackish? Keep in
mind the order of the components is fixed (it's part of the ABI from
langinfo.h constants). In some ways it's nice because it keeps all of
the associated data together, but one could say it's ugly because it
exposes the implementation of nl_langinfo (stepping N times through a
"multi-string") to translators.
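
For readers unfamiliar with the "stepping N times" lookup, a sketch of
the technique follows. The helper name nth_component and the shortened
table demo_time are assumptions made for illustration; this is not the
actual nl_langinfo source.

```c
#include <string.h>

/* Shortened stand-in table; the fourth component is an empty string,
 * like the one empty slot mentioned for LC_TIME. */
static const char demo_time[] = "Sun\0" "Mon\0" "Tue\0" "\0" "Wed";

/* Return the idx-th component of a multi-string by skipping idx
 * NUL-terminated strings. The caller must guarantee idx is in range,
 * since the multi-string carries no length or count of its own. */
static const char *nth_component(const char *s, int idx)
{
	for (; idx > 0; idx--)
		s += strlen(s) + 1;
	return s;
}
```

Because components are found purely by counting terminators, a
translated multi-string has to contain exactly the same number of
components in the same order, including the empty one.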

The other alternative I see, which might be better, is simply putting
dummy values (e.g. the strings "ERA", "ERA_D_FMT", etc.) in their
slots in c_time and suppressing them in the output if the C locale is
active for LC_TIME or if there's no translation for them.
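
A rough sketch of how the dummy-value alternative might behave for a
single slot. The function resolve_slot and its parameters are
hypothetical; how a slot is marked as a dummy and how the active locale
is detected are left open here.

```c
/* Hypothetical helper: choose the string to return for one LC_TIME slot.
 *   msgid           - the built-in entry (a dummy key such as "ERA" for
 *                     slots that are empty in the C locale)
 *   translated      - what the translation function returned; assumed to
 *                     be the same pointer as msgid when no translation
 *                     exists
 *   is_dummy        - nonzero if the slot holds a placeholder key
 *   c_locale_active - nonzero if the C locale is active for LC_TIME */
static const char *resolve_slot(const char *msgid, const char *translated,
                                int is_dummy, int c_locale_active)
{
	if (c_locale_active)
		return is_dummy ? "" : msgid;
	if (is_dummy && translated == msgid)
		return "";	/* untranslated placeholder: suppress it */
	return translated;
}
```

The dummy key never reaches the caller: it is either replaced by a
real translation or suppressed to the empty string.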

Any other ideas?

Rich
