Message-ID: <530DD6F9.3040301@fairlite.co.uk>
Date: Wed, 26 Feb 2014 11:58:49 +0000
From: Alan Hourihane <alanh@...rlite.co.uk>
To: musl@...ts.openwall.com
Subject: Re: CP850 & IBM850 codepages

On 02/25/14 22:39, Rich Felker wrote:
> On Tue, Feb 25, 2014 at 10:31:46PM +0000, Alan Hourihane wrote:
>>> Adding cp850 and other DOS codepages should not be hard and should not
>>> take up much additional size in iconv, but it's also nontrivial to do
>>> without my tools to generate the tables, which are not published.
>>> Publishing them is something I should really get around to doing,
>>> since their absence affects the ability of others to modify the code
>>> in meaningful ways; I need to apologize for not doing so already.
>>>
>> O.k. that makes sense as I couldn't understand the format. :-)
> The format is basically this: legacy_chars is a table of all
> codepoints that ever appear in a supported legacy codepage, with a
> limit of 1024 total codepoints. The individual codepage tables are 10
> bits per entry and map into this table, and they omit the initial
> subrange that's identical to latin1 (and thus a one-to-one mapping to
> unicode). I have tools that automatically generate these from the
> unicode txt files containing the mappings.
>

Thanks Rich.

I'll keep an eye out for the cp850/ibm850 table to land when you've had
chance with your tools.

Alan.
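
For anyone else reading the archive, the layout Rich describes can be sketched in C roughly as below. This is only an illustration of the idea (a shared codepoint table plus per-codepage tables of packed 10-bit indices), not the actual musl source. Only the name legacy_chars comes from the message above; put_entry, get_entry, decode_byte, the `first` parameter, and the little-endian bit order are assumptions made for the example, and the four table values are just the first four high-half CP850 characters, not the contents of musl's real table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ENTRY_BITS 10
#define ENTRY_MASK 0x3FFu   /* 10 bits: indices 0..1023, hence the 1024 limit */

/* Illustrative shared table of codepoints (not musl's real data):
 * U+00C7, U+00FC, U+00E9, U+00E2, i.e. CP850 bytes 0x80..0x83. */
static const uint16_t legacy_chars[] = { 0x00C7, 0x00FC, 0x00E9, 0x00E2 };

/* Store 10-bit index v as entry i of a packed table.
 * Assumed bit order: little-endian within the byte stream.
 * buf must start zeroed and hold (entries * 10 + 7) / 8 bytes. */
static void put_entry(unsigned char *buf, size_t i, unsigned v)
{
    size_t bit = i * ENTRY_BITS;
    size_t byte = bit / 8;
    unsigned shift = (unsigned)(bit % 8);   /* always 0, 2, 4 or 6 */

    assert(v <= ENTRY_MASK);
    buf[byte]     |= (unsigned char)(v << shift);
    buf[byte + 1] |= (unsigned char)(v >> (8 - shift));
}

/* Read entry i back out; a 10-bit entry always spans exactly two bytes. */
static unsigned get_entry(const unsigned char *buf, size_t i)
{
    size_t bit = i * ENTRY_BITS;
    size_t byte = bit / 8;
    unsigned shift = (unsigned)(bit % 8);
    unsigned two = buf[byte] | ((unsigned)buf[byte + 1] << 8);

    return (two >> shift) & ENTRY_MASK;
}

/* Decode one byte of a legacy codepage to a Unicode codepoint.
 * Bytes below `first` are the omitted initial subrange that matches
 * latin1, so the byte value is already the codepoint. */
static unsigned decode_byte(const unsigned char *map, unsigned first,
                            unsigned char c)
{
    if (c < first)
        return c;
    return legacy_chars[get_entry(map, c - first)];
}
```

With a 5-byte map filled by put_entry(map, i, i) for i = 0..3 and first = 0x80, decode_byte returns the byte itself for anything below 0x80 and looks up legacy_chars for 0x80..0x83. Four entries fit in five bytes, which is where the size saving over 16-bit entries comes from.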