Message-ID: <20170913181334.GT1627@brightrain.aerifal.cx>
Date: Wed, 13 Sep 2017 14:13:34 -0400
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: Re: Re: [PATCH] towupper/towlower: Update to Unicode 9.0

On Wed, Sep 13, 2017 at 12:05:19PM +0200, Reini Urban wrote:
> Wait a bit with that. I think I found some more Unicode 9.0 issues with the tables,
> and I’ve found a huge performance opportunity by sorting the 3 tables (mostly pairs),
> and break the loops earlier.
> This should come close to glibc table performance then, without the huge memory costs they have.
>
> I’ll write a perl regression testing script not to miss any more mappings, and maybe
> improve the current musl logic. This will need 1-2 days.
> I’ll also use it for cperl then.

Thanks for the update. I still need to publish the table generation
code for all the other tables -- I got it mostly dug up and cleaned up
but got interrupted last time so it's still not posted. With that it
will be possible to update other things too, not just case mappings.

A few of the existing tables are using an older version of the
tabulation code that formats the big arrays differently, so I'll
probably first make a commit to reformat them, so that it's possible
to mechanically check that this commit does not change the generated
.o files, then use the uniform formatting as the basis for the
subsequent update to Unicode 9.0. That should not affect the case
mapping file though since it's not machine-generated.

Rich
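
For readers following the quoted proposal, here is a minimal sketch of what
"sorting the tables and breaking the loops earlier" could look like. It is
not musl's actual code or data: the struct, the function name sketch_tolower
and the four-entry table are hypothetical, chosen only to illustrate the
idea. Because the pairs are sorted by their first member, the scan can stop
as soon as it passes the place where the input would have been.

```c
#include <stddef.h>
#include <wchar.h>

/* Hypothetical pair table, sorted ascending by .upper.
 * Illustration only; this is not musl's real case-mapping data. */
struct case_pair { wchar_t upper, lower; };

static const struct case_pair pairs[] = {
    { 0x0178, 0x00FF },
    { 0x0181, 0x0253 },
    { 0x0186, 0x0254 },
    { 0x0189, 0x0256 },
};

/* Linear scan that exploits the sort order: once an entry is larger
 * than wc, no later entry can match, so the loop stops early. */
static wchar_t sketch_tolower(wchar_t wc)
{
    size_t n = sizeof pairs / sizeof pairs[0];
    for (size_t i = 0; i < n; i++) {
        if (pairs[i].upper == wc) return pairs[i].lower;
        if (pairs[i].upper > wc) break;
    }
    return wc; /* no mapping found: return the input unchanged */
}
```

With a sorted table a binary search would also be possible; the early break
is shown here only because it is what the quoted message describes.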