Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Thu, 31 May 2018 19:44:36 -0400
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: Re: Re: [PATCH] scanf: handle the L modifier for integers

On Thu, May 31, 2018 at 10:44:42PM +0200, Natanael Copa wrote:
> On Thu, 31 May 2018 12:00:22 -0700
> Andrei Vagin <avagin@...tuozzo.com> wrote:
> 
> > >>Without this patch, ret will be 1 and mask will be 0. It is obviously
> > >>incorrect. According to the man page, L should work like ll:
> > >>
> > >>L Indicates that the conversion will be either e, f, or g and the
> > >>   next pointer is a pointer to long double or the conversion will
> > >>   be d, i, o, u, or x and the next pointer is a pointer to long
> > >>   long.  
> > >
> > >  This is a GNU extension. POSIX states that L is only valid before
> > >a floating-point conversion specifier:
> > >
> > >L
> > >     Specifies that a following a, A, e, E, f, F, g, or G conversion 
> > >specifier
> > >     applies to an argument with type pointer to long double.
> > >
> > >  from 
> > >http://pubs.opengroup.org/onlinepubs/9699919799/functions/scanf.html
> > >
> > >  So, it is valid for musl not to accept %Lx.
> > >  Now, the argument that it's a good idea to align musl's behaviour to
> > >glibc's whenever possible is a sensible one. But it's a decision for
> > >the musl authors to make, and the pros and cons need to be carefully
> > >balanced; musl's current behaviour is not _incorrect_.  
> > 
> > It is incorrect, because scanf() has to return 0, or it has to handle the
> > L modifier. Currently it doesn't handle L and return 1, so the
> > application can't detect this issue.
> 
> That sounds like a bug in musl libc.
>  
> > I would prefer a case when musl works like glibc, if there are not any
> > reason to not to do that. For example,  now Alpine Linux is very popular
> > and there are a lot of packages. In many cases, a maintainer, who adds a
> > new package, fixes compile-time errors and doesn't run any tests.
> > A target application can work differently with musl comparing with glibc
> > due to this sort of issues.
> 
> FreeBSD man page says:
> 
>      L	      Indicates	that the conversion will be one	of a, e, f, or g and
> 	      the next pointer is a pointer to long double.
> 
> NetBSD man page says:
> 
>      L       Indicates that the conversion will be efg and the next pointer is
>              a pointer to long double.
> 
> OpenBSD man page says:
>      
> L
>     Indicates that the conversion will be one of efg and the next pointer is a pointer to long double.
> 
> So the application will break on most (every) system that is not GNU
> libc. It would be better to fix the application in this case:
> 
> 
>    char str[] = "sigmask: 0x200";
>    long long mask = 0;
>    int ret;
> 
> #if defined(__GLIBC__)
>    ret = sscanf(str, "sigmask: %Lx", &mask));
> #else
>    ret = sscanf(str, "sigmask: %llx", &mask));
> #endif
>    printf("%d %llx\n", ret, mask);
> 
> 
> 
> Or just use %llx which is POSIX and should work everywhere.

Indeed, there is no reason to use %Lx anywhere. It's simply wrong.

> That said, those things are tricky to detect at compile time as you
> mentioned and they are tricky to detect with configure scripts that
> works with cross compilation.

If gcc does not catch this with -Wformat, it's a gcc bug that we
should report and try to get fixed. It's possible that they're making
an exception for the invalidity of L with integer formats since some
libcs support that, but I don't see any good reason for this; gcc
should still be warning about the incorrect and nonportable usage. I
can't imagine they'd be opposed to a patch to fix it.

> Also many developers seems to think that
> Linux == glibc so they only read the GNU manuals, so yeah, implement
> glibc behavior here seems like a good idea, unless someone else has a
> brilliant idea how to catch this at compile time.

Aside from fixing gcc at compile time, this has come up before (with
regard to printf, not scanf), and my leaning then and now was to
detect the UB at runtime by crashing rather than reporting an error as
we do now, since (1) it's UB, so an application can't reasonably
expect an error, and (2) applications seem to be ignoring errors
anyway.

We should also get the man page fixed. The printf man page is clear
that L with integer specifiers is a nonstandard extension and should
not be used (they're not documented under L, only as a note at the
end) but it seems whoever fixed this overlooked changing scanf at the
same time.

Rich

Powered by blists - more mailing lists

Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.