Message-ID: <20250405025003.GP1827@brightrain.aerifal.cx>
Date: Fri, 4 Apr 2025 22:50:03 -0400
From: Rich Felker <dalias@...c.org>
To: Kang-Che Sung <explorer09@...il.com>
Cc: musl@...ts.openwall.com
Subject: Re: wcrtomb in UTF-8 locale should check the multibyte state

On Sat, Apr 05, 2025 at 07:32:37AM +0800, Kang-Che Sung wrote:
> Hi.
>
> On Sat, Apr 5, 2025 at 5:39 AM Thorsten Glaser <tg@...lvis.org> wrote:
> >
> > On Sat, 5 Apr 2025, Kang-Che Sung wrote:
> >
> > >Note: It is _allowed_ in the C standard to reuse an mbstate_t object
> > >across different multibyte conversion functions. It is _not an
> >
> > 7.31.6 begs to differ:
> >
> > | If an mbstate_t object has been altered by any of the functions
> > | described in this subclause, and is then used with a different
> > | multibyte character sequence, or in the other conversion direction, or
> > | with a different LC_CTYPE category setting than on earlier function
> > | calls, the behavior is undefined.414)
> >
>
> I'm aware of that part of the standard paragraph.
> I may have read it wrongly regarding the meaning of the "conversion
> direction", but I still believe that ignoring the mbstate_t object is
> a bad idea.
>
> I need to make a correction on one thing though:
> In macOS, the wcrtomb call in the example code in my last email
> actually sets errno=EINVAL, not EILSEQ.
> I guess some BSD implementations also follow this (I'm not sure).
> POSIX says "EINVAL: ps points to an object that contains an invalid
> conversion state."

"...the behavior is undefined" means (among other things) there is no
obligation to follow any particular error convention.

Rich
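
For illustration only: since reusing an altered mbstate_t in the other
conversion direction is undefined, a caller that wants a predictable error
has to check the state itself before calling wcrtomb. The sketch below is
one way to do that, not musl's or any other libc's behaviour. The names
checked_wcrtomb and demo_partial_state are hypothetical, the choice of
EINVAL merely mirrors the POSIX wording quoted above, and the locale names
"C.UTF-8" and "en_US.UTF-8" are assumptions that may not exist on a given
system. The check only makes sense for stateless encodings such as UTF-8,
where wcrtomb never legitimately leaves the state non-initial.

```c
#include <errno.h>
#include <locale.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical wrapper: refuse to encode if *ps is not in the initial
 * conversion state, instead of relying on what wcrtomb happens to do. */
size_t checked_wcrtomb(char *s, wchar_t wc, mbstate_t *ps)
{
    if (ps && !mbsinit(ps)) {
        errno = EINVAL;          /* assumption: follow the POSIX wording */
        return (size_t)-1;
    }
    return wcrtomb(s, wc, ps);
}

/* Leaves a state mid-character with mbrtowc, then tries to encode with it.
 * Returns 1 if the wrapper rejected the state, 0 if it did not, and -1 if
 * no UTF-8 locale was available to set up the situation. */
int demo_partial_state(void)
{
    mbstate_t st;
    wchar_t wc;
    char buf[16];

    if (!setlocale(LC_CTYPE, "C.UTF-8") &&
        !setlocale(LC_CTYPE, "en_US.UTF-8"))
        return -1;

    memset(&st, 0, sizeof st);
    /* 0xC3 is only the lead byte of a two-byte UTF-8 sequence. */
    if (mbrtowc(&wc, "\xC3", 1, &st) != (size_t)-2)
        return -1;

    errno = 0;
    return checked_wcrtomb(buf, L'A', &st) == (size_t)-1 && errno == EINVAL;
}
```

With a zero-initialized mbstate_t the wrapper simply forwards to wcrtomb,
so ordinary callers see no change; only a state left over from an
incomplete mbrtowc call is refused.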