Message-ID: <tencent_26B8E38FF26E364CE34C825585599E65E009@qq.com>
Date: Tue, 16 May 2023 17:32:51 +0800
From: "847567161" <847567161@...com>
To: "musl" <musl@...ts.openwall.com>
Subject: Re:Re: Re: Re: Re: [Patch 1/1] vfprintf: optimize vfprintf performance

> Interesting -- so you're lazily constructing the nl_arg data when you
> first discover the need for it, rather than unconditionally doing the
> first pass to look for it. Do you measure a relevant performance
> difference doing this?

Yes, the benchmark shows about a 30% improvement for a commonly used format string:

BM_stdio_printf_s: 160 ns (before opt), 110 ns (after opt)

static void BM_stdio_printf_s(benchmark::State& state) {
  while (state.KeepRunning()) {
    char buf[BUFSIZ];
    snprintf(buf, sizeof(buf), "this is a more typical error message with detail: %s",
             "No such file or directory");
  }
}
MUSL_BENCHMARK(BM_stdio_printf_s);


> Stylistically, I like that the patch is minimally invasive and does
> not make complex changes (like what I suggested above) together with
> the intended functional change. I don't really like building up extra
> "special meaning" state like whether nl_arg_ptr is null on top of
> whether f is null to represent what mode the printf_core is operating
> in, but this can/should probably be cleaned up as a separate patch to
> keep the functional change itself simple, like what you've done.

I added nl_arg_filled to indicate whether the positional arguments have already been collected.
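
To illustrate the idea of a "filled" flag, here is a minimal sketch, not the code from the patch. It assumes a hypothetical struct fmt_state and a stand-in collect_positional_args() that only counts how often the expensive pre-pass would run; real code would read the va_list there. Only the flag name nl_arg_filled comes from the message above.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical state; only the name nl_arg_filled comes from the mail. */
struct fmt_state {
    int nl_arg_filled;  /* set once the positional arguments were collected */
    int prepass_runs;   /* how many times the expensive pre-pass ran */
};

/* Stand-in for the pre-pass that would fill nl_arg[] from the va_list. */
static void collect_positional_args(struct fmt_state *st)
{
    st->prepass_runs++;
    st->nl_arg_filled = 1;
}

/* Returns 1 if the text after a '%' starts with "<digits>$". */
static int is_positional(const char *s)
{
    if (!isdigit((unsigned char)*s)) return 0;
    while (isdigit((unsigned char)*s)) s++;
    return *s == '$';
}

/* Walks fmt once and returns the number of conversion specifications seen.
 * The pre-pass is triggered only by the first positional specifier. */
static int walk_format(struct fmt_state *st, const char *fmt)
{
    int specs = 0;
    for (const char *p = fmt; *p; p++) {
        if (*p != '%') continue;
        if (p[1] == '%') { p++; continue; }  /* literal percent sign */
        if (p[1] == '\0') break;             /* dangling '%', stop here */
        specs++;
        if (!st->nl_arg_filled && is_positional(p + 1))
            collect_positional_args(st);
    }
    return specs;
}
```

With a format such as "detail: %s" the pre-pass never runs, which is the case the benchmark above exercises.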

> On a technical note, some compilers are bad about hoisting large local
> objects out of their scopes when inlining. This results in the large
> floating point buffer from fmt_fp getting allocated at the top of
> printf_core, and presumably causes a recursive call to printf_core to
> double the stack requirements here, which is problematic. I'm not sure
> if we should try to avoid recursion (it should be fairly
> straightforward to do so) or just try to fix the unwanted hoisting to
> begin with (something I've been wanting to do for a while).

I changed buf so that it is declared where it is used rather than at the top of printf_core.
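
For the hoisting concern in the quoted text, one common way to keep a large buffer out of a recursive function's frame is to confine it to a leaf helper that the compiler is asked not to inline. This is a different technique from the declaration move described above, shown only as a sketch: format_leaf, format_core, the 512-byte size and the NOINLINE macro are all made up for illustration, and the attribute is a GCC/Clang extension.

```c
#include <stdio.h>
#include <string.h>

#if defined(__GNUC__)
#define NOINLINE __attribute__((noinline))  /* GCC/Clang extension */
#else
#define NOINLINE
#endif

/* Leaf helper: the only frame that contains the large buffer. */
static NOINLINE int format_leaf(char *out, size_t n, double v)
{
    char big[512];  /* hypothetical size, stands in for the fmt_fp buffer */
    int len = snprintf(big, sizeof big, "%.3f", v);
    if (len < 0 || (size_t)len >= sizeof big || (size_t)len >= n)
        return -1;
    memcpy(out, big, (size_t)len + 1);  /* copy including the terminator */
    return len;
}

/* Recursive driver: big[] is not part of this frame as long as
 * format_leaf is not inlined into it. */
static int format_core(char *out, size_t n, double v, int depth)
{
    if (depth > 0)
        return format_core(out, n, v, depth - 1);
    return format_leaf(out, n, v);
}
```

Whether this actually reduces stack use depends on the compiler and its options, so it would need to be checked on the generated code.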

> It looks like these two hunks are almost not needed, since it's not
> allowed to mix positional and non-positional forms.

Thanks, I have changed it in the patch.
So musl doesn't support these format strings, right?
printf("yc test: %*2$d \n", 2, 10);
printf("yc test: %.*2$f \n", 1.123456, 10);

Do "%n$d", "%*n$d" and "%.*n$d" have to be used together?
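
For comparison, as far as I read POSIX, a format string uses either numbered forms ("%n$" for the value, "*m$" for a width or precision) or unnumbered forms ("%" and "*"), but not both. A small sketch of the fully numbered spelling follows; the function names are only for illustration.

```c
#include <stdio.h>

/* Width comes from argument 1, the value from argument 2. */
static int demo_width(char *buf, size_t n)
{
    return snprintf(buf, n, "%2$*1$d", 5, 42);
}

/* The value comes from argument 1, the precision from argument 2. */
static int demo_precision(char *buf, size_t n)
{
    return snprintf(buf, n, "%1$.*2$f", 1.123456, 3);
}
```

Here demo_width() should produce "42" right-aligned in a field of 5, and demo_precision() should produce "1.123".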

> However, %m does
> not take an argument, but might have width/precision in * form. It'd
> be nice if there were some way to avoid this duplication of logic.

Do you mean "%*m$" or "%.*m$"?
What do you mean by "duplication of logic"? I don't follow.

> No check has been made that you haven't already consumed arguments via
> non-positional-form format specifiers. This will make things blow up
> much worse if the caller wrongly mixes them, consuming wrong-type args
> from the wrong positions rather than erroring out (as we do now) or
> trapping (probably ideal). Right now we don't actually catch all forms
> of mixing either, but we do necessarily consume all positional args in
> order before attempting to consume any erroneously-specified
> non-positional ones. Your patch changes this to happen in the order
> things were encountered. So I think, as a first step before even doing
> this patch, we should update the l10n flag to a tri-state thing where
> it can be yes/no/indeterminate, to allow erroring out on any specifier
> that mismatches the existing yes/no.

I remember you mentioned "calling printf with an invalid format
string has undefined behavior" here, so maybe it's also ok.

https://www.openwall.com/lists/musl/2023/05/07/1
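
The tri-state check suggested in the quoted review could look roughly like the sketch below. It is a simplified illustration, not musl code: enum arg_style and check_arg_style() are hypothetical names, and the scan only classifies the main "%n$" part of each specifier, ignoring "*" widths and the special case of %m.

```c
#include <ctype.h>

/* Hypothetical tri-state: nothing seen yet, unnumbered, or numbered. */
enum arg_style { STYLE_UNKNOWN, STYLE_PLAIN, STYLE_POSITIONAL };

/* Returns 0 if all specifiers use one style, -1 as soon as one specifier
 * disagrees with the style established by an earlier one. */
static int check_arg_style(const char *fmt)
{
    enum arg_style seen = STYLE_UNKNOWN;
    for (const char *p = fmt; *p; p++) {
        if (*p != '%') continue;
        p++;                       /* first character after '%' */
        if (*p == '%') continue;   /* literal percent sign */
        if (*p == '\0') break;     /* dangling '%', stop scanning */

        const char *q = p;
        while (isdigit((unsigned char)*q)) q++;
        enum arg_style cur =
            (q > p && *q == '$') ? STYLE_POSITIONAL : STYLE_PLAIN;

        if (seen == STYLE_UNKNOWN)
            seen = cur;            /* first specifier decides the style */
        else if (seen != cur)
            return -1;             /* mixed numbered and unnumbered forms */
    }
    return 0;
}
```

A caller could run such a check, or fold the same state into the main parsing loop, and fail early instead of consuming arguments in the wrong order.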


Chuang Yin
Download attachment "optimize_printf_core.diff" of type "application/octet-stream" (1970 bytes)
