Message-ID: <c2c7d1ce-1e0d-a504-f8be-313fe7385240@gmail.com>
Date: Mon, 10 Jul 2017 10:22:37 +0200
From: Bartosz Brachaczek <b.brachaczek@...il.com>
To: musl@...ts.openwall.com
Subject: Re: [PATCH] handle whitespace before %% in scanf

Hello,

On 7/10/2017 4:00 AM, Rich Felker wrote:
> On Sun, Jul 09, 2017 at 11:00:18PM +0200, Bartosz Brachaczek wrote:
>> this is mandated by C and POSIX standards and is in accordance with
>  ^^^^
>> glibc behavior.
>
> Can you explain exactly what "this" refers to?

Ah, poor wording choice on my part. Yes, I meant that %% consumes
whitespace. Shall I resend the patch with restated commit message if you
think it's otherwise good?

> It looks like you're claiming %% consumes space, which I can't find
> any support for in the C standard. Has this topic been discussed
> somewhere I should see?

Sorry, I didn't think this would be controversial. No prior discussion.
Let me present my reasoning below.

The following paragraph in the description of the fscanf function in the
C11 standard, §7.21.6.2, establishes that '%%' is a "conversion
specification", where '%' is the "conversion specifier":

> The format shall be a multibyte character sequence, beginning and
> ending in its initial shift state. The format is composed of zero or
> more directives: one or more white-space characters, an ordinary
> multibyte character (neither '%' nor a white-space character), or a
> conversion specification. Each conversion specification is introduced
> by the character '%'. After the '%', the following appear in sequence:
>
> -- . . .
>
> -- A "conversion specifier" character that specifies the type of
>    conversion to be applied.

That '%' is a valid conversion specifier is established a few paragraphs
below:

> The conversion specifiers and their meanings are:
>
> . . .
>
> '%' Matches a single '%' character; no conversion or assignment
>     occurs. The complete conversion specification shall be '%%'.

Between the above paragraphs, there is a definition of how a conversion
specification is executed:

> A directive that is a conversion specification defines a set of matching
> input sequences, as described below for each specifier. A conversion
> specification is executed in the following steps:
>
> Input white-space characters (as specified by the 'isspace' function)
> are skipped, unless the specification includes a '[', 'c', or 'n'
> specifier.
>
> . . .

From the above I conclude that all conversion specifications, except
'%[', '%c', and '%n', consume whitespace. This includes the '%%'
conversion specification.

The above can be applied just as well to C99. However, C11 added a new
example (still in §7.21.6.2) that seems to confirm my reading of the
normative text:

> EXAMPLE 5 The call:
>
>     #include <stdio.h>
>     /* ... */
>     int n, i;
>     n = sscanf("foo % bar 42", "foo%%bar%d", &i);
>
> will assign to 'n' the value 1 and to 'i' the value 42 because input
> white-space characters are skipped for both the '%' and 'd' conversion
> specifiers.

Now, the code in the example is clearly broken, as either the format
string should be "foo%% bar%d" or the input string should be
"foo %bar 42", but the explanation does imply that '%%' consumes
whitespace.

Bartosz

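
For illustration, the reading argued above can be sketched in a few
lines of C. This is only a sketch of the rule quoted from §7.21.6.2, not
the code of the patch, of musl, or of glibc; the helper names
skips_leading_space and match_percent are hypothetical and exist only
for this example.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: does a conversion specifier skip leading input
 * white space? Under the quoted rule, every specifier does except
 * '[', 'c', and 'n'. */
static int skips_leading_space(char specifier)
{
    return specifier != '[' && specifier != 'c' && specifier != 'n';
}

/* Hypothetical helper: execute a "%%" directive against `input`.
 * Returns a pointer just past the matched '%', or NULL on a
 * matching failure. */
static const char *match_percent(const char *input)
{
    assert(input != NULL);

    /* '%' is not one of '[', 'c', 'n', so white space is skipped first. */
    if (skips_leading_space('%')) {
        while (isspace((unsigned char)*input))
            input++;
    }

    /* After the skip, the next input character must be a literal '%'. */
    return *input == '%' ? input + 1 : NULL;
}
```

Under this reading, match_percent(" %bar") succeeds and returns a
pointer to "bar", while match_percent("x%") is a matching failure
because 'x' is neither white space nor '%'.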