|
|
Message-ID: <20230602143759.GQ4163@brightrain.aerifal.cx>
Date: Fri, 2 Jun 2023 10:38:00 -0400
From: Rich Felker <dalias@...c.org>
To: Jens Gustedt <Jens.Gustedt@...ia.fr>
Cc: musl@...ts.openwall.com, jens.gustedt@...teo.eu
Subject: Re: [C23 printf 2/3] C23: implement the wN length specifiers
for printf
On Tue, May 30, 2023 at 08:55:27AM +0200, Jens Gustedt wrote:
> These are mandatory for C23 and concern all types for which the
> platform has `int_leastN_t` and `uint_leastN_t`. For musl these types
> always coincide with `intN_t` and `uintN_t` and are always present for
> N equal 8, 16, 32 and 64.
>
> They can be added for general use since all lowercase letters were
> previously reserved.
>
> Nevertheless, users that use these modifiers will see a lot of
> warnings from compilers in the beginning. This is because the
> compilers have not yet integrated this form of a specifier into their
> correponding extensions (gcc attributes). So unfortunately also
> testing this feature may be a bit noisy for the moment.
>
> The only architecture dependend choice is the type for N == 64, which
> may be `long` or `long long`. We just mimick the test that is done in
> other places to compare `UINTPTR_MAX` and `UINT64_MAX` to determine
> that.
> ---
> src/stdio/vfprintf.c | 28 +++++++++++++++++++++++++---
> src/stdio/vfwprintf.c | 28 +++++++++++++++++++++++++---
> 2 files changed, 50 insertions(+), 6 deletions(-)
>
> diff --git a/src/stdio/vfprintf.c b/src/stdio/vfprintf.c
> index cbc79783..265fb7ad 100644
> --- a/src/stdio/vfprintf.c
> +++ b/src/stdio/vfprintf.c
> @@ -33,7 +33,7 @@
>
> enum {
> BARE, LPRE, LLPRE, HPRE, HHPRE, BIGLPRE,
> - ZTPRE, JPRE,
> + ZTPRE, JPRE, WPRE,
> STOP,
> PTR, INT, UINT, ULLONG,
> LONG, ULONG,
> @@ -57,7 +57,7 @@ static const unsigned char states[]['z'-'A'+1] = {
> S('s') = PTR, S('S') = PTR, S('p') = UIPTR, S('n') = PTR,
> S('m') = NOARG,
> S('l') = LPRE, S('h') = HPRE, S('L') = BIGLPRE,
> - S('z') = ZTPRE, S('j') = JPRE, S('t') = ZTPRE,
> + S('z') = ZTPRE, S('j') = JPRE, S('t') = ZTPRE, S('w') = WPRE,
> }, { /* 1: l-prefixed */
> S('b') = ULONG, S('B') = ULONG,
> S('d') = LONG, S('i') = LONG,
> @@ -101,6 +101,12 @@ static const unsigned char states[]['z'-'A'+1] = {
> S('o') = UMAX, S('u') = UMAX,
> S('x') = UMAX, S('X') = UMAX,
> S('n') = PTR,
> + }, { /* 8: w-prefixed */
> + S('b') = UINT, S('B') = UINT,
> + S('d') = INT, S('i') = INT,
> + S('o') = UINT, S('u') = UINT,
> + S('x') = UINT, S('X') = UINT,
> + S('n') = PTR,
> }
> };
>
> @@ -447,7 +453,7 @@ static int printf_core(FILE *f, const char *fmt, va_list *ap, union arg *nl_arg,
> int w, p, xp;
> union arg arg;
> int argpos;
> - unsigned st, ps;
> + unsigned st, ps, width=0;
> int cnt=0, l=0;
> size_t i;
> char buf[sizeof(uintmax_t)*CHAR_BIT+3+LDBL_MANT_DIG/4];
> @@ -527,9 +533,25 @@ static int printf_core(FILE *f, const char *fmt, va_list *ap, union arg *nl_arg,
> if (OOB(*s)) goto inval;
> ps=st;
> st=states[st]S(*s++);
> + if (st == WPRE) {
> + if (*s == '0') goto inval;
> + width = getint(&s);
> + }
> } while (st-1<STOP);
> if (!st) goto inval;
>
> + if (ps == WPRE) switch (width) {
> + case 8: ps = HHPRE; st = (st == UINT) ? UCHAR : ((st == INT) ? CHAR : PTR); break;
> + case 16: ps = HPRE; st = (st == UINT) ? USHORT : ((st == INT) ? SHORT : PTR); break;
> + case 32: ps = BARE; break;
> +#if UINTPTR_MAX >= UINT64_MAX
> + case 64: ps = LPRE; st = (st == UINT) ? ULONG : ((st == INT) ? LONG : PTR); break;
> +#else
> + case 64: ps = LLPRE; st = (st == UINT) ? ULLONG : ((st == INT) ? LLONG : PTR); break;
> +#endif
The logic here for whether w64 is LPRE or LLPRE is wrong. The
condition for whether [u]int64_t is long or long long is not whether
uintptr_t or long long would be too wide (they never are, in our ABIs)
but whether long is long enough. All our [u]intN_t are the lowest-rank
type of the desired size, and this seems to be consistent with what
other implementations like glibc do.
Short of a compelling reason not to, though, I plan to just go with
the fix-up of my v2 patch for this functionality. Its ordering avoids
duplication of logic (the ?: branches in the cases above) and the need
for maintaining extra side state (the width variable) thru the state
machine.
Rich
Powered by blists - more mailing lists
Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.