Date: Tue, 14 Nov 2023 14:22:46 +0100
From: Bruno Haible <bruno@...sp.org>
To: musl@...ts.openwall.com
Subject: *printf %lc of L'\0'

Hi,

On 2023-03-21 I noticed a bug in how most libcs handle %lc with a
null wide character argument:
<https://lists.gnu.org/archive/html/bug-gnulib/2023-03/msg00080.html>.

On 2023-03-28 Eric Blake opened a defect with POSIX, with the intent that
both ISO C and POSIX make the four *printf cases consistent:
<https://austingroupbugs.net/view.php?id=1647>

This issue was then submitted in the ISO C 23 ballot as GB-141, and it
was decided at the meeting held from 2023-06-20 to 2023-06-23:
<https://www.open-std.org/JTC1/sc22/wg14/www/docs/n3167.pdf>
pages 23 and 24. The decision ("option 1") is detailed in
<https://www.open-std.org/JTC1/sc22/wg14/www/docs/n3148.doc>:
  "Option 1 (require a NUL) - change the text to:
   If an l length modifier is present, the wint_t argument is converted
   as if by a call to the wcrtomb function with a pointer to storage of
   at least MB_CUR_MAX bytes, the wint_t argument converted to wchar_t,
   and an initial shift state."
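
To illustrate what the "as if by wcrtomb" wording implies for a null wide
character, here is a minimal sketch. The helper name lc_bytes is
hypothetical (it is not part of any libc); it only mimics the conversion
step described in the quoted text, not a real *printf implementation.

```c
#include <limits.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper, for illustration only: convert WC the way the new
   wording describes for %lc, i.e. as if by wcrtomb starting from an
   initial shift state.  OUT must have room for at least MB_LEN_MAX bytes
   (which is >= MB_CUR_MAX).  Returns the number of bytes stored, or
   (size_t) -1 if WC cannot be converted.  */
static size_t
lc_bytes (wchar_t wc, char *out)
{
  mbstate_t state;
  memset (&state, 0, sizeof state);   /* zeroed mbstate_t = initial state */
  return wcrtomb (out, wc, &state);
}
```

Called with L'\0' and a buffer of MB_LEN_MAX bytes, this returns 1 and
stores a single NUL byte, since wcrtomb stores the null byte itself when
given a null wide character. Under the new wording, that one byte is what
%lc has to emit.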

So, ISO C changed, and POSIX will follow suit.

The behavior I reported as a bug in most libcs is thus now conforming.
musl libc, which was correct under the old wording, now has a bug.

Test case:
===============================================================================
#include <stdio.h>
#include <string.h>
#include <wchar.h>

int main ()
{
  {
    char buf[12] = { 0xDD, 0xDD, 0xDD, 0xDD, 0xDD, 0xDD, 0xDD, 0xDD, 0xDD, 0xDD, 0xDD, 0xDD };
    wchar_t two_nuls[2] = { 0, 0 };
    int ret = snprintf (buf, 12, "a%lsz", two_nuls);
    printf ("ret = %d, buf[0] = 0x%x, buf[1] = 0x%x, buf[2] = 0x%x, buf[3] = 0x%x\n",
            ret,
            (unsigned char) buf[0], (unsigned char) buf[1],
            (unsigned char) buf[2], (unsigned char) buf[3]);
  }
  {
    char buf[12] = { 0xDD, 0xDD, 0xDD, 0xDD, 0xDD, 0xDD, 0xDD, 0xDD, 0xDD, 0xDD, 0xDD, 0xDD };
    int ret = snprintf (buf, 12, "a%lcz", (wint_t) 0);
    printf ("ret = %d, buf[0] = 0x%x, buf[1] = 0x%x, buf[2] = 0x%x, buf[3] = 0x%x\n",
            ret,
            (unsigned char) buf[0], (unsigned char) buf[1],
            (unsigned char) buf[2], (unsigned char) buf[3]);
  }
  return 0;
}
/*
glibc, *BSD, macOS, AIX, Solaris - all correct now:
  ret = 2, buf[0] = 0x61, buf[1] = 0x7a, buf[2] = 0x0, buf[3] = 0xdd
  ret = 3, buf[0] = 0x61, buf[1] = 0x0, buf[2] = 0x7a, buf[3] = 0x0
musl libc - now incorrect:
  ret = 2, buf[0] = 0x61, buf[1] = 0x7a, buf[2] = 0x0, buf[3] = 0xdd
  ret = 2, buf[0] = 0x61, buf[1] = 0x7a, buf[2] = 0x0, buf[3] = 0xdd
*/
===============================================================================

Best regards,

Bruno


