Message-ID: <20231216231037.GG4163@brightrain.aerifal.cx>
Date: Sat, 16 Dec 2023 18:10:37 -0500
From: Rich Felker <dalias@...c.org>
To: Pablo Correa Gómez <pabloyoyoista@...tmarketos.org>
Cc: musl@...ts.openwall.com,
    Pablo Correa Gómez <ablocorrea@...mail.com>
Subject: Re: [PATCH 0/2] Support printing localized RADIXCHAR

On Sat, Dec 16, 2023 at 08:36:42PM +0100, Pablo Correa Gómez wrote:
> From: Pablo Correa Gómez <ablocorrea@...mail.com>
>
> Since we've been discussing about translations, I've been looking a bit
> around, and have found some low-hanging fruit, in the form of improving
> printf-family output for localized systems.
>
> I've tried to do the same for strtof family of functions, but I was not
> completely sure on how to approach that. Forcing the radix char there
> has the problem that numeric values as written for programming stop
> being supported, and treating equally a "." and the localized case seems
> to not be supported by POSIX. Does anybody have any thoughts about this?
> Without that, this patch series might be a bit incomplete, since
> certain localized printf outputs would not be possible to ingest in
> strtof. Although I'm also unequally unsure if that's a requirement
>
> Pablo Correa Gómez (2):
>   langinfo: add support for LC_NUMERIC translations
>   printf: translate RADIXCHAR for floating-point numbers
>
>  src/locale/langinfo.c | 2 +-
>  src/stdio/vfprintf.c  | 5 +++--
>  2 files changed, 4 insertions(+), 3 deletions(-)
>
> --
> 2.43.0

This is a topic that's been controversial. I have always been against
having a variable radix character, but I've also been seeking input
from users who want localized output on whether the lack of this
functionality is a serious problem that needs revisiting.

Last time it was discussed, I believe my position was that, if we do
this, it needs to be a 1-bit setting, where a locale necessarily has
either '.' or ',' as the radix.
No other values actually appear in real-world conventions, and on
other implementations such as glibc, the allowance for arbitrary
characters allows doing some ~nasty~ stuff with output and input
processing. For example, you could define the radix character to be
'1' or something that makes conversions fail to round-trip.

As written to support arbitrary radix characters, the patch also fails
to handle the case where the radix character is multi-byte, copying
only a single byte of it and thereby producing broken output. This is
actually a nasty case where printf semantics for field width are not
what the caller is likely to expect, and it breaks our wide printf
implementation, which assumes when it uses byte-based printf for
numbers that the byte count and character count are the same.
Supporting only '.' and ',' avoids all of these issues, too.

Another detail you've overlooked is that scanf/strto{d,ld,f}/atof need
to process the radix point character. This in turn requires making the
_l wrappers for strto{d,ld,f} so that they actually apply the locale
argument rather than ignoring it.

Before proceeding on all of this we should probably try to reach a
decision on whether it's really needed/wanted functionality.

Rich
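
To make the multi-byte concern concrete, here is a minimal sketch. It is
not taken from the patch or from musl: format_radix, utf8_chars and
radix_from_bit are hypothetical helpers, and the two-byte radix in the
usage note is just an arbitrary UTF-8 sequence picked for illustration.
It assumes the program runs in the default "C" locale, so snprintf
itself emits '.'.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not musl code: format v with two decimals in the
 * default "C" locale ('.' radix), then substitute a caller-chosen radix.
 * If whole_radix is 0, only the first byte of radix is copied, mimicking
 * a single-byte copy of a possibly multi-byte radix string.
 * Returns the output length in bytes. */
static size_t format_radix(char *out, size_t outsz, double v,
                           const char *radix, int whole_radix)
{
    char tmp[64];
    int n = snprintf(tmp, sizeof tmp, "%.2f", v);
    assert(n > 0 && (size_t)n < sizeof tmp);

    size_t rlen = whole_radix ? strlen(radix) : 1;
    size_t o = 0;
    for (int i = 0; i < n; i++) {
        if (tmp[i] == '.') {
            assert(o + rlen < outsz);
            memcpy(out + o, radix, rlen);
            o += rlen;
        } else {
            assert(o + 1 < outsz);
            out[o++] = tmp[i];
        }
    }
    out[o] = '\0';
    return o;
}

/* Count UTF-8 characters by skipping continuation bytes (10xxxxxx). */
static size_t utf8_chars(const char *s)
{
    size_t c = 0;
    for (; *s; s++)
        if (((unsigned char)*s & 0xC0) != 0x80)
            c++;
    return c;
}

/* The "1-bit" alternative: a locale can only select '.' or ','. */
static char radix_from_bit(int use_comma)
{
    return use_comma ? ',' : '.';
}
```

With the two-byte radix "\xd9\xab" and the value 3.14, copying the whole
radix yields 5 bytes for 4 characters, so byte count and character count
no longer agree. Copying only the first byte yields 4 bytes with a lone
lead byte 0xD9 in the middle, which is not valid UTF-8. With
radix_from_bit the result is always one byte per character, e.g. "3,14".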