Message-ID: <op.w1fqobr8dyj81a@monster.itedn32a.localdomain>
Date: Wed, 07 Aug 2013 15:20:25 +0800
From: Roy <roytam@...il.com>
To: musl@...ts.openwall.com
Subject: Re: Re: Re: Re: iconv Korean and Traditional Chinese research so far
On Wed, 07 Aug 2013 08:54:35 +0800, Roy <roytam@...il.com> wrote:
[snip]
>
> Big5-HKSCS 2004 map for reference:
> http://moztw.org/docs/big5/table/hkscs2004.txt
> Use sed and awk to create b2u.txt for comparing:
> $ sed -e '/^==/d' -e '1,2d' hkscs2004.txt | awk 'BEGIN{print "# big5 unicode"}{print "0x" $1 " 0x" $4}' > hkscs2004-b2u.txt
> Result:
> http://roy.dnsd.me/hkscs2004-b2u.txt
>
> And finally the diff:
> http://roy.dnsd.me/uao250-hkscs2004.diff
>
> The diff is huge, so a separate table is needed.
I forgot that the HKSCS table is missing the original CP950 entries.
$ cat cp950-b2u.txt hkscs2004-b2u.txt | sed -e '1d' | sort > hkscs2004-big5-b2u.txt
I also wrote a small PHP utility to compare two tables by key (first
column):
http://roy.dnsd.me/tbldiff.phps
$ php tbldiff.php uao250-b2u.txt hkscs2004-big5-b2u.txt > uao250-vs-hkscs2004.txt
http://roy.dnsd.me/uao250-vs-hkscs2004.txt
$ sed -e '/==/d' uao250-vs-hkscs2004.txt > uao250-hkscs2004-diff.txt
http://roy.dnsd.me/uao250-hkscs2004-diff.txt
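For readers who cannot fetch tbldiff.phps, here is a rough Python sketch of the same idea: load two "key value" tables and classify each key by comparing the first column. This is not the PHP utility itself and does not reproduce its output format, which is not shown here. The names load_table, diff_tables and diff_files are made up for illustration, and the two-column whitespace-separated layout with '#' comment lines is an assumption based on the b2u.txt files above.

```python
def load_table(lines):
    """Parse 'key value' lines into a dict.

    Assumption: two whitespace-separated columns, with blank lines and
    lines starting with '#' ignored. Hex case is normalized to lowercase.
    If a key repeats, the last occurrence wins.
    """
    table = {}
    for line in lines:
        fields = line.split()
        if len(fields) < 2 or fields[0].startswith("#"):
            continue
        table[fields[0].lower()] = fields[1].lower()
    return table


def diff_tables(a, b):
    """Classify every key that appears in either table."""
    result = {"same": [], "changed": [], "only_a": [], "only_b": []}
    for key in sorted(a.keys() | b.keys()):
        if key not in b:
            result["only_a"].append(key)
        elif key not in a:
            result["only_b"].append(key)
        elif a[key] == b[key]:
            result["same"].append(key)
        else:
            result["changed"].append(key)
    return result


def diff_files(path_a, path_b):
    """Convenience wrapper (hypothetical): compare two table files on disk."""
    with open(path_a) as fa, open(path_b) as fb:
        return diff_tables(load_table(fa), load_table(fb))
```

Calling diff_files("uao250-b2u.txt", "hkscs2004-big5-b2u.txt") and taking the lengths of the "changed" and "only_a" lists should give counts of the same kind as the ones below, though the exact numbers depend on how tbldiff.php treats duplicates and hex case.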
So 5965 mappings differ, including 1379 mappings that do not exist in
HKSCS2004.
But since HKSCS2001 and HKSCS2004 are used side by side in both local
files and Internet pages, the situation for HKSCS is even worse.
BTW, there is another NLS hack that patches MS-CP932 to support JIS X
0213:2004:
http://www.eonet.ne.jp/~kotobukispace/ddt/jisx0213/jisx0213.html