Message-ID: <20240126195701.GO4163@brightrain.aerifal.cx>
Date: Fri, 26 Jan 2024 14:57:01 -0500
From: Rich Felker <dalias@...c.org>
To: Andy Caldwell <andycaldwell@...rosoft.com>
Cc: "musl@...ts.openwall.com" <musl@...ts.openwall.com>
Subject: Re: RE: [EXTERNAL] Re: [PATCH] fix avoidable segfault in catclose

On Fri, Jan 26, 2024 at 07:12:59PM +0000, Andy Caldwell wrote:
> > > > > > And it has been musl policy to crash on invalid args since the beginning.
> > > > >
> > > > > The current implementation doesn't (necessarily) crash/trap on an
> > > > > invalid argument, instead it invokes (C-language spec-defined) UB
> > > > > itself (it dereferences `(uint32_t*)((char*)cat) + 8)`, which, in
> > > > > the case of the `-1` handle is the address 0x7, which in turn, not
> > > > > being a valid address, is UB to dereference). If you're lucky (or
> > > > > are compiling without optimizations/inlining) the compiler will
> > > > > emit a MOV that will trigger an access violation and hence a SEGV,
> > > > > if
> > > >
> > > > In general, it's impossible to test for "is this pointer valid?"
> > > >
> > > > There are certain special cases we could test for, but unless there
> > > > is a particularly convincing reason that they could lead to runaway
> > > > wrong execution/vulnerabilities prior to naturally trapping, we have
> > > > not considered littering the code with these kinds of checks to be a
> > > > worthwhile trade-off.
> > > >
> > > > > you're unlucky the compiler will make wild assumptions about the
> > > > > value of the variable passed as the arg (and for example in your
> > > > > first code snippet, simply delete the `if` statement, meaning
> > > > > `use_cat` gets called even when `catopen` fails potentially
> > > > > corrupting user data/state).
> > > >
> > > > I have no idea what you're talking about there. The compiler cannot
> > > > make that kind of transformation (lifting code that could produce
> > > > undefined behavior, side effects, etc. out of a conditional).
> > >
> > > It's a hypothetical, but something like the following is valid for
> > > the compiler to do:
> > >
> > > * inline the catclose (e.g. in LTO for a static link)
> > > * consider the `if` statement and ask "what if `cat` is `-1`
> > > * look forward to the pointer dereference (confirming that `cat` can't
> > >   change in the interim)
> > > * realise that `0x7` is not a valid pointer on the target platform so
> > >   UB is inevitable if `cat` is `-1`
> > > * optimize out the comparison since UB frees the compiler of any
> > >   responsibilities
> >
> > You have the logic backwards. In the case where cat==(cat_t)-1, catclose is not
> > called on the abstract machine, so no conclusions can be drawn from anything
> > catclose would do.
>
> The original code I was working from was:
>
> ```
> nl_catd cat = catopen(...);
> if (cat != (nl_catd)-1) {
>   use_cat(cat);
> }
> catclose(cat);
> ```
>
> (i.e. an incorrect use of the APIs, but not UB in a "C99 spec"
> sense). In that code the `catclose` call is provably inevitable,
> allowing the compiler to infer properties of `cat` from it.

Ah, okay, at least now that makes sense. But indeed it is undefined:

    "Each of the following statements shall apply to all functions
    unless explicitly stated otherwise in the detailed descriptions
    that follow:

    1. If an argument to a function has an invalid value (such as a
       value outside the domain of the function, or a pointer outside
       the address space of the program, or a null pointer), the
       behavior is undefined.

    ..."

https://pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html#tag_15_01

So I guess what you're saying is that, in the case where an erroneous
program like the above has undefined behavior, the compiler could make
a transformation such that the effect of the UB is seen at a point
different from where it logically occurs. (This is the norm for UB.)
In particular, despite cat being -1 from a failed catopen, you might
see use_cat being called with a seemingly impossible argument.

Exacerbating the degree to which UB can become non-localized is one of
the expected effects of LTO, and arguably a good reason not to use LTO
for debugging. I don't see a lot of value in trying to prevent this in
isolated cases when it's going to happen all over the place anyway for
other reasons.

Rich
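
For reference, a minimal sketch of the well-defined counterpart to the
snippet quoted above: catclose is only reached when catopen succeeded,
so no invalid descriptor is ever passed to it. The helper name
with_catalog, the set and message numbers, and the fallback string are
hypothetical, chosen only for illustration.

```c
#include <nl_types.h>
#include <stdio.h>

/* Hypothetical helper: opens the named catalog, prints one message,
 * and closes it. Returns 0 on success, -1 if catopen failed. */
int with_catalog(const char *name)
{
	nl_catd cat = catopen(name, 0);
	if (cat == (nl_catd)-1) {
		/* catopen failed: cat is not a valid descriptor, so it
		 * must not be passed to catgets or catclose. */
		return -1;
	}
	/* Set 1, message 1 are assumed IDs; the last argument is the
	 * fallback string returned when the message is not found. */
	puts(catgets(cat, 1, 1, "fallback message"));
	catclose(cat);
	return 0;
}
```

Keeping the catclose call inside the success path means the question
of what the compiler may infer from an inevitable call with an invalid
argument does not arise in the first place.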