Message-ID: <20230529155929.GV4163@brightrain.aerifal.cx>
Date: Mon, 29 May 2023 11:59:29 -0400
From: Rich Felker <dalias@...c.org>
To: Jₑₙₛ Gustedt <jens.gustedt@...ia.fr>
Cc: musl@...ts.openwall.com
Subject: Re: changes for scanf in C23

On Mon, May 29, 2023 at 12:32:02PM +0200, Jₑₙₛ Gustedt wrote:
> Hi,
> we already discussed this but it doesn't seem that we have come to a
> conclusion.
> 
> The problem is that in C23 the semantics of several string-to-integer
> conversion functions change: a 'b' or 'B' that previously was the stop
> condition for integer parsing may become part of the integer
> string. This concerns all `scanf` and `strto` derivatives.
> 
> This is probably not a problem for most applications that parse
> strings to integers, but it could be in some situations, and in
> particular it could open vulnerabilities. E.g. network addresses that
> are read with base `0` (musl does this at some point to allow
> decimal or hex strings) could be open to attacks, once people start
> using binary encodings for integers more often. Another scenario where
> this could lead to harm is automatically produced output that is
> automatically scanned, and where nobody previously took care of proper
> word boundaries.
> 
> My current idea is to have two sets of these functions, one that has
> the old semantics and one that has the new.

This was rejected already in the first proposal (thread here):

Message-ID: <20230503000045.GU4163@...ghtrain.aerifal.cx>
https://www.openwall.com/lists/musl/2023/05/03/1

    "There are not going to be different versions of scanf/strto*
    because there's just no way to do that in a conforming way..."

There are other reasons for this too that basically amount to not
repeating glibc mistakes.

At some point I proposed a way that we could do C-version-specific
behavior via branching on an extern defined by linking in c23+ mode,
if this is really necessary. This probably needs more thought to flesh
out a design that's robust and has the right properties, and to make
sure we don't do anything that locks us into future trouble.

However, as I've said before, C users have survived multiple
incompatible changes of this form, including the same thing happening
with hex floats. Moreover, strto* are already permitted to accept
arbitrary additional implementation-defined sequences except in the C
locale, so the change only affects behavior in the C locale. My
leaning, if the committee is going to make these kinds of incompatible
changes, is to say that applications just have to be prepared to deal
with them and do any additional validation they deem necessary for
their use cases.
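
As a rough illustration of the kind of application-side validation
meant here, the sketch below wraps strtol() so that "0b"/"0B" is never
taken as a binary prefix, giving the same result with pre-C23 and C23
library semantics. The name strtol_no_binary is hypothetical and is not
part of musl or any libc; this is one possible approach under those
assumptions, not a recommended or official interface.

```c
#include <ctype.h>
#include <stdlib.h>

/* Hypothetical helper (an assumption for illustration, not a libc
 * function): behaves like strtol(), except that a "0b"/"0B" prefix is
 * never consumed, matching the pre-C23 stop condition. */
static long strtol_no_binary(const char *s, char **end, int base)
{
    const char *p = s;

    /* Skip the leading whitespace and optional sign strtol() accepts. */
    while (isspace((unsigned char)*p)) p++;
    if (*p == '+' || *p == '-') p++;

    /* Only bases 0 and 2 can interpret "0b" as a prefix in C23. */
    if ((base == 0 || base == 2) &&
        p[0] == '0' && (p[1] == 'b' || p[1] == 'B')) {
        /* Pre-C23 result: the subject sequence is just "0" and
         * parsing stops at the 'b'. */
        if (end) *end = (char *)(p + 1);
        return 0;
    }

    /* Everything else is unaffected by the C23 change. */
    return strtol(s, end, base);
}
```

A caller that wants to reject such input outright, rather than stop at
the 'b', can check whether *end points at the end of the string (or at
an expected delimiter) after the call.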

Rich
