Message-ID: <20140225223939.GI184@brightrain.aerifal.cx>
Date: Tue, 25 Feb 2014 17:39:39 -0500
From: Rich Felker <dalias@...ifal.cx>
To: musl@...ts.openwall.com
Subject: Re: CP850 & IBM850 codepages

On Tue, Feb 25, 2014 at 10:31:46PM +0000, Alan Hourihane wrote:
> >Adding cp850 and other DOS codepages should not be hard and should not
> >take up much additional size in iconv, but it's also nontrivial to do
> >without my tools to generate the tables, which are not published.
> >Publishing them is something I should really get around to doing,
> >since their absence affects the ability of others to modify the code
> >in meaningful ways; I need to apologize for not doing so already.
> >
>
> O.k. that makes sense as I couldn't understand the format. :-)

The format is basically this: legacy_chars is a table of all
codepoints that ever appear in a supported legacy codepage, with a
limit of 1024 total codepoints. The individual codepage tables are 10
bits per entry and map into this table, and they omit the initial
subrange that's identical to latin1 (and thus a one-to-one mapping to
unicode). I have tools that automatically generate these from the
unicode txt files containing the mappings.

Rich
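
To make the layout described above concrete, here is a rough sketch in
Python rather than the C that musl uses. It is not the actual musl code
or the unpublished generator: the bit order (least significant bits
first), the names pack10, get10, decode, LEGACY_CHARS and TOY_PAGE, and
all table contents are assumptions chosen for illustration only.

```python
# Hypothetical shared table of codepoints used by the legacy codepages.
# Real contents would come from the Unicode mapping files; these values
# are placeholders. The 10-bit index width caps it at 1024 entries.
LEGACY_CHARS = [0x20AC, 0x0160, 0x2502, 0x2591]
MAX_LEGACY = 1 << 10


def pack10(indices):
    """Pack 10-bit integers into bytes, least significant bits first."""
    acc = 0          # bit accumulator
    nbits = 0        # number of valid bits currently in acc
    out = bytearray()
    for v in indices:
        if not 0 <= v < MAX_LEGACY:
            raise ValueError("index does not fit in 10 bits")
        acc |= v << nbits
        nbits += 10
        while nbits >= 8:
            out.append(acc & 0xFF)
            acc >>= 8
            nbits -= 8
    if nbits:
        out.append(acc & 0xFF)
    return bytes(out)


def get10(packed, i):
    """Read the i-th 10-bit entry back out of a packed table."""
    bit = i * 10
    byte, shift = divmod(bit, 8)   # shift is always 0, 2, 4 or 6
    # Every 10-bit entry straddles exactly two bytes.
    word = packed[byte] | (packed[byte + 1] << 8)
    return (word >> shift) & 0x3FF


def decode(page, b):
    """Map one input byte to a Unicode codepoint.

    page is (first, packed): bytes below `first` are identical to
    Latin-1 and are therefore not stored in the packed table at all.
    """
    first, packed = page
    if b < first:
        return b                   # Latin-1 maps one-to-one to Unicode
    return LEGACY_CHARS[get10(packed, b - first)]


# Toy codepage (made up): bytes 0x00..0xFB match Latin-1,
# bytes 0xFC..0xFF map to LEGACY_CHARS entries 2, 0, 3, 1.
TOY_PAGE = (0xFC, pack10([2, 0, 3, 1]))
```

With this toy table, decode(TOY_PAGE, 0x41) returns 0x41 unchanged, and
decode(TOY_PAGE, 0xFC) looks up entry 2 of LEGACY_CHARS. Four entries
pack into five bytes, which is where the space saving over 16-bit
entries comes from.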