Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20230515162234.GL4163@brightrain.aerifal.cx>
Date: Mon, 15 May 2023 12:22:34 -0400
From: Rich Felker <dalias@...c.org>
To: 847567161 <847567161@...com>
Cc: musl <musl@...ts.openwall.com>
Subject: Re: Re: Re: Re: [Patch 1/1] vfprintf: optimize vfprintf
 performance

On Mon, May 15, 2023 at 06:04:48PM +0800, 847567161 wrote:
> Hi,
> 
>   I'm sorry for the incorrect format diff. I updated it.

It's still misformatted, with a mix of HTML entities in the body. Can
you please send it as an attachment instead of inline in the email if
your mail software can't handle plain text non-destructively?

> As we discussed earlier, I made a patch to get nl_arg when we see
> '$' ranther than visit the format anyway.

Interesting -- so you're lazily constructing the nl_arg data when you
first discover the need for it, rather than unconditionally doing the
first pass to look for it. Do you measure a relevant performance
difference doing this?

> Could you please help me review it?

A couple things I noticed:

> ---------------------------------
> diff --git a/src/stdio/vfprintf.c b/src/stdio/vfprintf.c
> old mode 100644
> new mode 100755
> index 9b961e7f..3294e23b
> --- a/src/stdio/vfprintf.c
> +++ b/src/stdio/vfprintf.c
> @@ -427,7 +427,7 @@ static int getint(char **s) {
>  	return i;
>  }
>  
> -static int printf_core(FILE *f, const char *fmt, va_list *ap, union arg *nl_arg, int *nl_type)
> +static int printf_core(FILE *f, const char *fmt, va_list *ap, union arg *nl_arg, int *nl_type, union arg* nl_arg_ptr)
>  {
>  	char *a, *z, *s=(char *)fmt;
>  	unsigned l10n=0, fl;
> @@ -462,6 +462,12 @@ static int printf_core(FILE *f, const char *fmt, va_list *ap, union arg *nl_arg,
>  		if (l) continue;
>  
>  		if (isdigit(s[1]) &amp;&amp; s[2]=='$') {
> +			if (nl_arg_ptr == NULL) {
> +				nl_arg_ptr = nl_arg;
> +				if (printf_core(0, fmt, ap, nl_arg, nl_type, nl_arg_ptr) &lt; 0) {
> +					return -1;
> +				}
> +			}
>  			l10n=1;
>  			argpos = s[1]-'0';
>  			s+=3;

No check has been made that you haven't already consumed arguments via
non-positional-form format specifiers. This will make things blow up
much worse if the caller wrongly mixes them, consuming wrong-type args
from the wrong positions rather than erroring out (as we do now) or
trapping (probably ideal). Right now we don't actually catch all forms
of mixing either, but we do necessarily consume all positional args in
order before attempting to consume any erroneously-specified
non-positional ones. Your patch changes this to happen in the order
things were encountered. So I think, as a first step before even doing
this patch, we should update the l10n flag to a tri-state thing where
it can be yes/no/indeterminate, to allow erroring out on any specifier
that mismatches the existing yes/no.

Stylistically, I like that the patch is minimally invasive and does
not make complex changes (like what I suggested above) together with
the intended functional change. I don't really like building up extra
"special meaning" state like whether nl_arg_ptr is null on top of
whether f is null to represent what mode the printf_core is operating
in, but this can/should probably be cleaned up as a separate patch to
keep the functional change itself simple, like what you've done.

On a technical note, some compilers are bad about hoisting large local
objects out of their scopes when inlining. This results in the large
floating point buffer from fmt_fp getting allocated at the top of
printf_core, and presumably causes a recursive call to printf_core to
double the stack requirements here, which is problematic. I'm not sure
if we should try to avoid recursion (it should be fairly
straightforward to do so) or just try to fix the unwanted hoisting to
begin with (something I've been wanting to do for a while).


> @@ -477,6 +483,12 @@ static int printf_core(FILE *f, const char *fmt, va_list *ap, union arg *nl_arg,
>  		/* Read field width */
>  		if (*s=='*') {
>  			if (isdigit(s[1]) &amp;&amp; s[2]=='$') {
> +				if (nl_arg_ptr == NULL) {
> +					nl_arg_ptr = nl_arg;
> +					if (printf_core(0, fmt, ap, nl_arg, nl_type, nl_arg_ptr) &lt; 0) {
> +						return -1;
> +					}
> +				}
>  				l10n=1;
>  				nl_type[s[1]-'0'] = INT;
>  				w = nl_arg[s[1]-'0'].i;
> @@ -491,6 +503,12 @@ static int printf_core(FILE *f, const char *fmt, va_list *ap, union arg *nl_arg,
>  		/* Read precision */
>  		if (*s=='.' &amp;&amp; s[1]=='*') {
>  			if (isdigit(s[2]) &amp;&amp; s[3]=='$') {
> +				if (nl_arg_ptr == NULL) {
> +					nl_arg_ptr = nl_arg;
> +					if (printf_core(0, fmt, ap, nl_arg, nl_type, nl_arg_ptr) &lt; 0) {
> +						return -1;
> +					}
> +				}
>  				nl_type[s[2]-'0'] = INT;
>  				p = nl_arg[s[2]-'0'].i;
>  				s+=4;

It looks like these two hunks are almost not needed, since it's not
allowed to mix positional and non-positional forms. However, %m does
not take an argument, but might have width/precision in * form. It'd
be nice if there were some way to avoid this duplication of logic.

> @@ -659,16 +677,13 @@ int vfprintf(FILE *restrict f, const char *restrict fmt, va_list ap)
>  	va_list ap2;
>  	int nl_type[NL_ARGMAX+1] = {0};
>  	union arg nl_arg[NL_ARGMAX+1];
> +	union arg* nl_arg_ptr = NULL;

Useless variable. You could just...

>  	unsigned char internal_buf[80], *saved_buf = 0;
>  	int olderr;
>  	int ret;
>  
>  	/* the copy allows passing va_list* even if va_list is an array */
>  	va_copy(ap2, ap);
> -	if (printf_core(0, fmt, &amp;ap2, nl_arg, nl_type) &lt; 0) {
> -		va_end(ap2);
> -		return -1;
> -	}
>  
>  	FLOCK(f);
>  	olderr = f-&gt;flags &amp; F_ERR;
> @@ -680,7 +695,7 @@ int vfprintf(FILE *restrict f, const char *restrict fmt, va_list ap)
>  		f-&gt;wpos = f-&gt;wbase = f-&gt;wend = 0;
>  	}
>  	if (!f-&gt;wend &amp;&amp; __towrite(f)) ret = -1;
> -	else ret = printf_core(f, fmt, &amp;ap2, nl_arg, nl_type);
> +	else ret = printf_core(f, fmt, &amp;ap2, nl_arg, nl_type, nl_arg_ptr);

...pass a null pointer constant (0) here.

Overall, I think I'm very open to making this change, especially if it
helps performance in some measurable way. With some more changes on
top of it, I think it could simplify and clean up the printf logic
too.

Rich

Powered by blists - more mailing lists

Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.