Message-ID: <20140225223939.GI184@brightrain.aerifal.cx>
Date: Tue, 25 Feb 2014 17:39:39 -0500
From: Rich Felker <dalias@...ifal.cx>
To: musl@...ts.openwall.com
Subject: Re: CP850 & IBM850 codepages

On Tue, Feb 25, 2014 at 10:31:46PM +0000, Alan Hourihane wrote:
> >Adding cp850 and other DOS codepages should not be hard and should not
> >take up much additional size in iconv, but it's also nontrivial to do
> >without my tools to generate the tables, which are not published.
> >Publishing them is something I should really get around to doing,
> >since their absence affects the ability of others to modify the code
> >in meaningful ways; I need to apologize for not doing so already.
> >
> 
> O.k. that makes sense as I couldn't understand the format. :-)

The format is basically this: legacy_chars is a table of all
codepoints that ever appear in a supported legacy codepage, with a
limit of 1024 total codepoints. The individual codepage tables are 10
bits per entry (enough to index any of those 1024 slots) and map into
this table, and they omit the initial subrange that's identical to
Latin-1 (and thus a one-to-one mapping to Unicode). I have tools that
automatically generate these from the Unicode txt files containing the
mappings.
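
To make the layout described above concrete, here is a minimal sketch of
a lookup over 10-bit packed entries. It is not musl's actual code: the
names legacy_chars_demo, demo_codepage, put10, get10 and decode_byte are
hypothetical, the table holds only a few illustrative values, and the
little-endian bit order within the packed array is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ENTRY_BITS 10   /* 10 bits can index up to 1024 shared codepoints */

/* Hypothetical shared table of Unicode codepoints (illustrative values
 * only, not the contents of musl's legacy_chars). */
static const uint16_t legacy_chars_demo[] = {
    0x00C7, /* LATIN CAPITAL LETTER C WITH CEDILLA */
    0x00FC, /* LATIN SMALL LETTER U WITH DIAERESIS */
    0x00E9, /* LATIN SMALL LETTER E WITH ACUTE */
    0x2502, /* BOX DRAWINGS LIGHT VERTICAL */
};

/* Store the 10-bit value v as entry i of a zero-initialized packed
 * array. Assumed bit order: least significant bit first. */
static void put10(unsigned char *packed, size_t i, unsigned v)
{
    assert(v < 1024);
    size_t bit = i * ENTRY_BITS;
    for (int k = 0; k < ENTRY_BITS; k++, bit++)
        if ((v >> k) & 1)
            packed[bit / 8] |= (unsigned char)(1u << (bit % 8));
}

/* Read entry i. An entry starts at bit offset 0, 2, 4 or 6 within a
 * byte, so it always fits within two consecutive bytes. */
static unsigned get10(const unsigned char *packed, size_t i)
{
    size_t bit = i * ENTRY_BITS;
    size_t byte = bit / 8;
    unsigned shift = (unsigned)(bit % 8);
    unsigned v = packed[byte] | ((unsigned)packed[byte + 1] << 8);
    return (v >> shift) & 0x3FF;
}

/* Hypothetical per-codepage descriptor: bytes below `first` are taken
 * to be identical to Latin-1, so no entries are stored for them. */
struct demo_codepage {
    unsigned first;
    const unsigned char *packed;
};

/* Map one byte of the legacy codepage to a Unicode codepoint. */
static uint32_t decode_byte(const struct demo_codepage *cp, unsigned char c)
{
    if (c < cp->first)
        return c; /* Latin-1 value equals the Unicode codepoint */
    return legacy_chars_demo[get10(cp->packed, c - cp->first)];
}
```

With first set to 0x80 and the first three packed entries set to 0, 1
and 2, byte 0x82 decodes to U+00E9 while ASCII bytes pass through
unchanged. Packing at 10 bits costs 5 bytes per 4 entries, compared
with 8 bytes if each entry were a full 16-bit codepoint.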

Rich
