Message-ID: <20130627025643.242152cf@sibserver.ru>
Date: Thu, 27 Jun 2013 02:56:43 +0800
From: orc <orc@...server.ru>
To: musl@...ts.openwall.com
Subject: Re: Iconv and old codepages

Thanks Rich for your quick answer!

On Wed, 26 Jun 2013 14:34:32 -0400
Rich Felker <dalias@...ifal.cx> wrote:

> On Thu, Jun 27, 2013 at 02:15:39AM +0800, orc wrote:
> > Hi,
> >
> > How many codepages does in-musl iconv support?
> > Currently I'm trying to convert from "utf8" to "cp1251" and iconv()
> > only gives me a number of "*"'s matching the utf8 input. Is this
> > correct behavior and iconv() currently does not support non-UTF
> > legacy codepages? Even so, I still see many of them in
> > src/locale/codepages.h. The (dirty) test program is attached.
> >
> > I also noticed the alternative libs thread and the corresponding
> > wiki page. Does someone know a lightweight iconv replacement as a
> > temporary measure (other than libiconv for example)?
>
> Should be fixed in git. In general, the state of musl's iconv is that
> the following charsets are supported:
>
> utf8
> wchart
> ucs2
> ucs2be
> ucs2le
> utf16
> utf16be
> utf16le
> ucs4
> ucs4be
> utf32
> utf32be
> ucs4le
> utf32le
> ascii
> usascii
> iso646
> iso646us
> eucjp
> shiftjis
> sjis
> gb18030
> gbk
> gb2312
> iso88591
> latin1
> iso88592
> iso88593
> iso88594
> iso88595
> iso88596
> iso88597
> iso88598
> iso88599
> iso885910
> iso885911
> tis620
> iso885913
> iso885914
> iso885915
> latin9
> iso885916
> cp1250
> windows1250
> cp1251
> windows1251
> cp1252
> windows1252
> cp1253
> windows1253
> cp1254
> windows1254
> cp1255
> windows1255
> cp1256
> windows1256
> cp1257
> windows1257
> cp1258
> windows1258
> koi8r
> koi8u

So "most major encodings", yep.
Thanks, it is fixed and works now.

>
> Non-alphanumeric characters are ignored in matching charset names, so
> all combinations of hyphens and underscores are also supported with
> these.
>
> One caveat which should not affect your usage is that the following
> charsets are only supported as the "from" charset, not the "to"
> charset:
>
> eucjp
> shiftjis
> sjis
> gb18030
> gbk
> gb2312
>
> Until the latest commit, the legacy 8bit codepages were also broken as
> the "to" charset, but this breakage was unintentional.

While digging through the code I did not notice that either.

>
>
> Rich
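
The test program attached to the original message is not reproduced in the archive. As a rough illustration of the two points discussed above, here is a minimal C sketch, not the original attachment and not musl's code. `charset_name_eq` and `convert_buf` are hypothetical helper names. `charset_name_eq` compares charset names while skipping non-alphanumeric characters, as described in the quoted reply; folding case as well is an assumption of this sketch. `convert_buf` wraps the standard `iconv_open`/`iconv`/`iconv_close` calls for a one-shot conversion such as "utf8" to "cp1251".

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <iconv.h>
#include <stddef.h>

/* Return the next alphanumeric character of *s, lowercased, advancing *s
 * past it; return 0 at end of string. Case folding is an assumption here. */
static int next_alnum(const char **s)
{
    while (**s && !isalnum((unsigned char)**s))
        (*s)++;
    int c = (unsigned char)**s;
    if (c)
        (*s)++;
    return tolower(c);
}

/* Hypothetical helper: 1 if two charset names match once hyphens,
 * underscores and other non-alphanumerics are ignored, else 0. */
int charset_name_eq(const char *a, const char *b)
{
    for (;;) {
        int ca = next_alnum(&a);
        int cb = next_alnum(&b);
        if (ca != cb)
            return 0;
        if (ca == 0)
            return 1;
    }
}

/* Hypothetical helper: convert inlen bytes of `in` from charset `from`
 * to charset `to`. Returns the number of bytes written to `out`, or -1
 * on error (unsupported pair, invalid input, or output buffer too small),
 * with errno left as set by iconv_open() or iconv(). */
long convert_buf(const char *to, const char *from,
                 const char *in, size_t inlen,
                 char *out, size_t outcap)
{
    assert(to && from && in && out);

    iconv_t cd = iconv_open(to, from);
    if (cd == (iconv_t)-1)
        return -1;              /* this iconv does not offer the pair */

    char *inp = (char *)in;     /* iconv() takes char **; input is not modified */
    char *outp = out;
    size_t inleft = inlen;
    size_t outleft = outcap;

    size_t r = iconv(cd, &inp, &inleft, &outp, &outleft);
    if (r != (size_t)-1)
        r = iconv(cd, NULL, NULL, &outp, &outleft); /* flush any shift state */

    int saved = errno;          /* keep iconv()'s errno across iconv_close() */
    iconv_close(cd);
    errno = saved;

    if (r == (size_t)-1)
        return -1;
    return (long)(outp - out);
}
```

Calling `convert_buf("cp1251", "utf8", text, strlen(text), buf, sizeof buf)` on a fixed musl should yield one output byte per Cyrillic letter. A return of -1 with errno set to EINVAL by `iconv_open` means the pair is not available as a "to"/"from" combination. Note that `iconv()` itself returns the count of non-reversible conversions, so a caller who wants to detect substituted characters (such as the "*" output reported above) would need to expose that value rather than discard it as this sketch does.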