Message-ID: <c23a73f5-34b4-99e8-786f-622ae42d41e8@gmail.com>
Date: Tue, 18 Jul 2017 23:05:29 +0300
From: Mikhail Kremnyov <mkremnyov@...il.com>
To: musl@...ts.openwall.com
Subject: Issues in mbsnrtowcs and wcsnrtombs

Hi,

It looks like there are some bugs in the implementations of mbsnrtowcs
and wcsnrtombs.
E.g. inside mbsnrtowcs there is this code:

    while ( s && wn && ( (n2=n/4)>=wn || n2>32 ) ) {
        if (n2>=wn) n2=wn;
        n -= n2;
        l = mbsrtowcs(ws, &s, n2, st);

Here "n" is the number of source bytes to convert and "n2" is the number
of wide chars that may be written to the destination, so subtracting one
from the other is incorrect: a multibyte character takes more than one
source byte per wide char. Indeed, a simple test shows that the function
misbehaves when a long enough non-ASCII string is passed to it. E.g.:

    const std::string origStr = u8"абвгдеёжзийклмнопрстуфхцчшщъыьэюяАБВГДЕЁЖЗИЙКЛМНОПРСТУФХЦЧШЩЪЫЬЭЮЯ";
    const std::string srcStr = origStr + u8"їґіє";

    std::mbstate_t st = {};
    const char* srcPtr = &srcStr[0];
    std::wstring dest(srcStr.length() + 1, wchar_t(0));

    auto res = mbsnrtowcs(&dest[0], &srcPtr, origStr.length(),
                          dest.length(), &st);

    std::cout << "res = " << res << ", srcPtr = " << (void*)srcPtr
              << std::endl;

And the output is:
    res = 70, srcPtr = 0

Here mbsnrtowcs was told to convert only origStr.length() bytes, which
hold 66 2-byte characters, but it converted 70, stopping only when it
reached the terminating zero char.
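
To make the drift in the byte count concrete, here is a rough sketch that replays only the bookkeeping of the loop quoted above, without calling any libc conversion function. The helper name simulateFastLoop and the Accounting struct are made up for illustration and are not part of musl. It assumes every character is exactly bytesPerChar bytes long, that the destination is a real buffer (so wn shrinks as output is written), and that each mbsrtowcs call converts exactly n2 characters.

```cpp
#include <cassert>
#include <cstddef>

// Result of replaying the fast loop's bookkeeping on an idealized input.
struct Accounting {
    std::size_t believedLeft;   // "n" as the loop tracks it
    std::size_t bytesConsumed;  // source bytes actually read
    std::size_t converted;      // wide chars written
};

// Hypothetical helper, not part of musl. Assumes every character is exactly
// bytesPerChar bytes long, the destination is a real buffer, and each
// mbsrtowcs call converts exactly n2 characters without hitting a terminator.
Accounting simulateFastLoop(std::size_t n, std::size_t wn,
                            std::size_t bytesPerChar)
{
    assert(bytesPerChar >= 1 && bytesPerChar <= 4);
    Accounting a{n, 0, 0};
    std::size_t n2;
    while (wn && ((n2 = a.believedLeft / 4) >= wn || n2 > 32)) {
        if (n2 >= wn) n2 = wn;
        a.believedLeft -= n2;                  // the subtraction from the excerpt
        a.bytesConsumed += n2 * bytesPerChar;  // what was really read
        a.converted += n2;
        wn -= n2;
    }
    return a;
}
```

With the numbers from the test (n = 132, wn = 141, 2 bytes per character) the loop runs once: it converts 33 characters and reads 66 bytes, but leaves the tracked count at 99, although only 66 bytes of the requested 132 remain. That 33-byte surplus is more than the 8 bytes of "їґіє" plus the terminator, which would be consistent with the result of 70 and the null srcPtr seen above. With 1 byte per character the two counts agree, which is presumably why ASCII input does not show the problem.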

A similar problem happens with wcsnrtombs using a slightly longer string:

    std::wstring srcStr = L"абвгдеёжзийклмнопрстуфхцчшщъыьэюяАБВГДЕЁЖЗИЙКЛМНОПРСТУФХЦЧШЩЪЫЬЭЮЯабвгдеёжзийклмнопрстуфхцчшщъыьэюя";

    const wchar_t* srcPtr = &srcStr[0];
    std::mbstate_t st = {};
    std::string dest(srcStr.length() * 4 + 1, char(0));

    auto res = wcsnrtombs(&dest[0], &srcPtr, srcStr.length(),
                          dest.length(), &st);

    std::cout << "res = " << res << ", dest = " << dest << std::endl;

The output:
    res = 98, dest = абвгдеёжзийклмнопрстуфхцчшщъыьэюяАБВГДЕЁЖЗИЙКЛМНО
   
I.e. it converted only 49 characters (98 bytes) instead of 99.
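
For reference, the byte count a full conversion should produce can be worked out without any locale machinery. The sketch below assumes the output encoding is UTF-8; utf8Length is a made-up helper name, and it takes char32_t code points rather than wchar_t so that it does not depend on the platform's wchar_t width.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: number of bytes UTF-8 needs for the given code points.
// Assumes valid Unicode scalar values (no surrogates, nothing above U+10FFFF).
std::size_t utf8Length(const std::u32string& s)
{
    std::size_t bytes = 0;
    for (char32_t c : s) {
        if (c < 0x80) bytes += 1;         // ASCII
        else if (c < 0x800) bytes += 2;   // includes the Cyrillic block
        else if (c < 0x10000) bytes += 3;
        else bytes += 4;
    }
    return bytes;
}
```

For 99 Cyrillic letters (e.g. std::u32string(99, U'\u0430')) this gives 198, so under the UTF-8 assumption the expected return value would be 198 rather than 98.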


Mikhail.

