Message-ID: <CAJgzZorMYDML8NT4p7sX5DMMrT5+2=OZ+6soTCt2+=inHC8AOQ@mail.gmail.com>
Date: Fri, 28 Jan 2022 10:33:53 -0800
From: enh <enh@...gle.com>
To: Rich Felker <dalias@...c.org>
Cc: musl@...ts.openwall.com
Subject: Re: A journey of weird file sorting and desktop systems

On Fri, Jan 28, 2022 at 10:01 AM Rich Felker <dalias@...c.org> wrote:
>
> On Fri, Jan 28, 2022 at 08:58:30AM -0800, enh wrote:
> > (Android's libc maintainer here...)
> >
> > i'd argue this isn't a musl bug. on Android we make a clear distinction between:
> >
> > 1. libc's responsibilities which, to paraphrase rich, are basically
> > "be unsurprising because your audience is OS/app developers who don't
> > speak all the languages their users use anyway". that is: "code point
> > order".
>
> That's not what I said. I speculated that part of the difficulty with
> getting people to care is that a large number of users personally
> prefer LC_COLLATE=C. Not that we should punt because of that.
>
> > 2. icu's responsibilities which cover all the user-facing (as opposed
> > to developer-facing) stuff. i18n is *hard* and the C/POSIX APIs are,
> > to be blunt, not fit for *that* purpose. there's a reason why all of
> > Android/macOS/Windows (and all the browsers) ship copies of icu.
>
> ICU is really, *really* bad. I don't want to be encouraging people to
> use it because basic functionality is missing from libc.

human languages are really really messy. a lot of the complexity is inherent. as for the non-inherent, https://github.com/unicode-org/icu4x seems like a good start.

> > the bug here is that a desktop file manager is assuming "i just want
> > telephone book order --- how hard can it be?". the answer turns out to
> > be "hard". especially when you get into fun stuff like users who *do*
> > speak multiple languages and have strong expectations for how they
> > sort. or places where there are multiple sort orders in common use.
>
> Absolutely. That's why I don't want to treat the problem half-assedly,

but that's my point --- it's not the *implementation* that's the issue, it's that the C/POSIX *interfaces* are insufficient. the bar on how good a job you _can_ do within those constraints is horribly low.

> but make sure we design or choose a format for the collation tables
> that's simultaneously (1) efficient, (2) sufficiently expressive to
> give the behaviors users may want, and (3) easy enough to understand
> that users can customize it if needed. The POSIX localedef format (an
> option group musl intentionally does not support) does not have any of
> those properties except maybe #2. The standard Unicode format may
> translate directly into something that can meet all 3; I'm not sure.
>
> Rich
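
[Editor's illustration, not part of the original message.] The gap between "code point order" and what a file-manager user expects can be sketched in a few lines of Python. The function names and the sample file names below are hypothetical, and `rough_user_order` is only a crude stand-in (strip accents, ignore case); it is not the Unicode Collation Algorithm, not ICU, and not anything musl or Android implements.

```python
import unicodedata


def code_point_order(names):
    """Sort by raw Unicode code points (what strcmp() on UTF-8 or LC_COLLATE=C gives)."""
    return sorted(names)


def rough_user_order(names):
    """Crude approximation of 'telephone book order': ignore accents and case.

    Assumption for illustration only; real collation needs per-locale tables.
    """
    def key(s):
        # Decompose accented letters, then drop the combining marks.
        decomposed = unicodedata.normalize("NFKD", s)
        base = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
        # Fall back to the original string so ties are broken deterministically.
        return (base.casefold(), s)

    return sorted(names, key=key)


names = ["apple", "Zebra", "Éclair"]  # hypothetical file names
print(code_point_order(names))   # ['Zebra', 'apple', 'Éclair']
print(rough_user_order(names))   # ['apple', 'Éclair', 'Zebra']
```

Even this toy key bakes in choices (accents ignored, case ignored) that are wrong for some languages, which is roughly the "how hard can it be?" problem the thread describes.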