Message-ID: <20230530014822.GW4163@brightrain.aerifal.cx>
Date: Mon, 29 May 2023 21:48:22 -0400
From: Rich Felker <dalias@...c.org>
To: Jₑₙₛ Gustedt <jens.gustedt@...ia.fr>
Cc: musl@...ts.openwall.com
Subject: Re: [C23 printf 2/3] C23: implement the wN length specifiers
 for printf

On Mon, May 29, 2023 at 09:21:55PM +0200, Jₑₙₛ Gustedt wrote:
> Rich,
> 
> on Mon, 29 May 2023 11:46:40 -0400 you (Rich Felker <dalias@...c.org>)
> wrote:
> 
> > On Mon, May 29, 2023 at 09:14:13AM +0200, Jₑₙₛ Gustedt wrote:
> > > Rich,
> > > 
> > > on Fri, 26 May 2023 17:03:58 -0400 you (Rich Felker
> > > <dalias@...c.org>) wrote:
> > >   
> > > > I think you need an extra state that's "plain but not bare" that
> > > > duplicates only the integer transitions out of it, like the l, ll,
> > > > etc. prefix states do.  
> > > 
> > > Hm, the problem is that for the other prefixes the table entries
> > > then encode the concrete type that is to be expected. We could not
> > > do this here because the type depends on the requested width. So we
> > > would then need to "repair" that type after the loop. A `switch` to
> > > do that would look substantially similar to what is there, now. Do
> > > you think that would be better?  
> > 
> > OK I think I can communicate better with code than natural language
> > text, so here's a diff, completely untested, of what I had in mind.
> 
> that's ... ugh ... not so pretty, I think
> 
> In my current version I track the desired width, if there is a w
> specifier, and then do the adjustments after the loop. That indeed
> takes care of undefined character sequences.
> 
> I find that much more readable, and also easier to extend (later
> there comes the `wf` case and the `128`, and perhaps some day `256`)

It sounds like the core issue is that you don't like the state machine
approach to how musl's printf processes format specifiers. Personally,
I like it because there's an obvious structured way to validate that
it's accepting exactly the right things and nothing else, vs an
approach like what you tried where you ended up accepting a lot of
bogus specifiers.

One alternative I would consider is doing something like what you did,
but moving it outside of/before the state machine loop, so it's not
mixing the w* processing with the state machine. This avoids accepting
bogus repeated w32 prefixes and similar (because there is no loop) and
lets you get by with just adding one PLAIN state to have it start in
(rather than BARE) after w32. I expect the overall size would be
similar. Concept attached.
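
To make the shape of that alternative concrete, here is a minimal,
self-contained sketch. It is not the attached diff and not musl's actual
code: the state names other than BARE and PLAIN, the toy transition
function step(), and the helper parse_spec() are all hypothetical, and
only a handful of conversions are modelled. The point is just that the
wN prefix is consumed once, before the loop, and the loop then starts in
PLAIN, which has only the integer transitions out of it.

```c
/* Toy model only: names and tables are illustrative assumptions,
 * not musl's real printf state machine. */
enum state { BARE, PLAIN, LPRE, LLPRE, STOP_INT, STOP_OTHER, BAD };

/* One transition of the toy state machine. */
static enum state step(enum state st, char c)
{
    switch (c) {
    case 'd': case 'i': case 'o': case 'u': case 'x': case 'X':
        return STOP_INT;            /* integer conversions: every state */
    case 's': case 'c':
        /* non-integer conversions: not reachable from PLAIN or LLPRE */
        return (st == BARE || st == LPRE) ? STOP_OTHER : BAD;
    case 'l':
        if (st == BARE) return LPRE;
        if (st == LPRE) return LLPRE;
        return BAD;                 /* e.g. "w32l" or "lll" */
    default:
        return BAD;                 /* includes a second 'w' and '\0' */
    }
}

/* s points just past '%' (flags, width, precision omitted here).
 * Returns 1 if the specifier is accepted, 0 otherwise; *width gets
 * the N of a wN prefix, or 0 if there was none. */
static int parse_spec(const char *s, int *width)
{
    enum state st = BARE;
    *width = 0;

    /* Optional wN prefix, handled exactly once, outside the loop. */
    if (*s == 'w') {
        int n = 0;
        s++;
        if (*s < '1' || *s > '9') return 0;   /* no digits, or leading 0 */
        while (*s >= '0' && *s <= '9' && n < 1000)
            n = n * 10 + (*s++ - '0');
        if (n != 8 && n != 16 && n != 32 && n != 64) return 0;
        *width = n;
        st = PLAIN;                 /* "plain but not bare" */
    }

    /* Ordinary state machine loop, unaware of wN. */
    while (st < STOP_INT)
        st = step(st, *s++);
    return st != BAD;
}
```

With this toy, "w32d" is accepted with a width of 32, while "w32w32d",
"w32ld" and "w32s" are all rejected, since PLAIN has no transition for
'w', 'l' or 's'. The set of accepted widths (8, 16, 32, 64) is likewise
an assumption of the sketch.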

Rich

View attachment "printf-wprefix-alt.diff" of type "text/plain" (1830 bytes)
