Message-ID:
 <AS4PR83MB0546C4287E4459E2C0EFDD2BCB792@AS4PR83MB0546.EURPRD83.prod.outlook.com>
Date: Fri, 26 Jan 2024 17:13:13 +0000
From: Andy Caldwell <andycaldwell@...rosoft.com>
To: Rich Felker <dalias@...c.org>
CC: "musl@...ts.openwall.com" <musl@...ts.openwall.com>
Subject: RE: RE: [EXTERNAL] Re: [PATCH] fix avoidable segfault
 in catclose

> > > And it has been musl policy to crash on invalid args since the beginning.
> >
> > The current implementation doesn't (necessarily) crash/trap on an
> > invalid argument, instead it invokes (C-language spec-defined) UB
> > itself (it dereferences `(uint32_t*)((char*)cat) + 8)`, which, in the
> > case of the `-1` handle is the address 0x7, which in turn, not being a
> > valid address, is UB to dereference). If you're lucky (or are
> > compiling without optimizations/inlining) the compiler will emit a MOV
> > that will trigger an access violation and hence a SEGV, if
> 
> In general, it's impossible to test for "is this pointer valid?"
> 
> There are certain special cases we could test for, but unless there is a particularly
> convincing reason that they could lead to runaway wrong
> execution/vulnerabilities prior to naturally trapping, we have not considered
> littering the code with these kinds of checks to be a worthwhile trade-off.
>
> > you're unlucky the compiler will make wild assumptions about the value
> > of the variable passed as the arg (and for example in your first code
> > snippet, simply delete the `if` statement, meaning `use_cat` gets
> > called even when `catopen` fails potentially corrupting user
> > data/state).
> 
> I have no idea what you're talking about there. The compiler cannot make that
> kind of transformation (lifting code that could produce undefined behavior, side
> effects, etc. out of a conditional).

It's a hypothetical, but something like the following is valid for the compiler to do:

* inline `catclose` (e.g. in LTO for a static link)
* consider the `if` statement and ask "what if `cat` is `-1`?"
* look forward to the pointer dereference (confirming that `cat` can't change in the interim)
* realise that `0x7` is not a valid pointer on the target platform so UB is inevitable if `cat` is `-1`
* optimize out the comparison since UB frees the compiler of any responsibilities
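
A minimal sketch of the code shape those steps refer to, assuming a toy close routine that reads a 32-bit field at byte offset 8. The names `cat_t`, `CAT_FAILED`, `use_cat`, `toy_catclose` and `use_then_close` are hypothetical stand-ins, not musl's code, and whether a given compiler actually performs the transformation is not something this sketch demonstrates:

```c
#include <stdint.h>

/* Hypothetical stand-ins for illustration; none of these names come from musl. */
typedef void *cat_t;
#define CAT_FAILED ((cat_t)-1)

static int use_count;  /* how many times use_cat() has run */

static void use_cat(cat_t cat)
{
    (void)cat;
    use_count++;
}

/* Toy close routine: unconditionally reads a 32-bit field at byte offset 8. */
static uint32_t toy_catclose(cat_t cat)
{
    return *(uint32_t *)((char *)cat + 8);
}

/* The caller shape under discussion: the use is guarded, the close is not. */
static uint32_t use_then_close(cat_t cat)
{
    if (cat != CAT_FAILED)
        use_cat(cat);
    /* With cat == CAT_FAILED the read below is at address 0x7, which is
     * undefined behaviour. Once toy_catclose() is inlined here, that read
     * is visible to the optimizer in the same function as the test above. */
    return toy_catclose(cat);
}
```

Calling `use_then_close` with a pointer to a valid, suitably aligned buffer of at least 12 bytes is well defined; only the `CAT_FAILED` path reaches the problematic read.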

As an example of exactly this kind of UB-at-a-distance happening, see https://lwn.net/Articles/342330/. As compilers/optimizers get better, the scope of fallout from UB keeps growing.

> > Crashing loudly (which requires _not_
> > invoking UB) on known bad inputs (a test against `-1` isn't exactly
> > expensive) feels like it meets the "musl policy" better than the
> > current code.
> 
> Letting the caller-directed UB "propagate through" to corresponding UB inside the
> implementation gives maximum debugging visibility of the root cause of the crash,
> and lets whoever's building link up their preferred form of instrumentation (e.g.
> -fsanitize=undefined).
> 
> Did you read the linked text
> https://sourceware.org/glibc/wiki/Style_and_Conventions#Bugs_in_the_user_program
> ?
> 
> Yes that is the glibc wiki, but I'm the original author of the text that was based on,
> which was in turn based on existing practice in musl. As written it's about NULL,
> but the same applies to (cat_t)-1, MAP_FAILED, and invalid pointers in general.

I did read that post (in fact, that's what prompted my comment), and I agree with its sentiment. Unfortunately for that post's goals, UB is non-local in the face of compiler optimizations, which makes trapping reliably when functions are misused near impossible (though I acknowledge the point about `ubsan`, which I'd not thought of). If `libc` functions invoke UB then all bets are off, and it's near impossible for them to validate their arguments in order to explicitly `abort` or similar (e.g. "a non-freed pointer" is not detectable in any feasible way). Maybe the intended policy is to order the code of each function so that UB "propagates" through as early as possible, in the hope that this causes a fault that can be debugged (or that the user is using `ubsan`).
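
For comparison, a minimal sketch of the "test against `-1`, then `abort`" alternative, under the same assumption of a toy close routine that reads a field at byte offset 8. `cat_t`, `CAT_FAILED`, `cat_is_known_bad` and `checked_catclose` are hypothetical names, not a proposed musl patch. It can only recognise the one known sentinel; as noted above, an arbitrary invalid or already-freed pointer is not detectable this way:

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-ins for illustration; none of these names come from musl. */
typedef void *cat_t;
#define CAT_FAILED ((cat_t)-1)

/* Recognises only the documented failure sentinel, nothing else. */
static int cat_is_known_bad(cat_t cat)
{
    return cat == CAT_FAILED;
}

/* Toy close routine that stops with abort() before any dereference
 * when handed the sentinel, so that path never reaches the read. */
static uint32_t checked_catclose(cat_t cat)
{
    if (cat_is_known_bad(cat))
        abort();
    return *(uint32_t *)((char *)cat + 8);
}
```

The comparison itself never dereferences `cat`, so the sentinel path ends in `abort()` rather than relying on the invalid access to fault.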

A
