Date: Fri, 26 Jan 2024 14:57:01 -0500
From: Rich Felker <dalias@...c.org>
To: Andy Caldwell <andycaldwell@...rosoft.com>
Cc: "musl@...ts.openwall.com" <musl@...ts.openwall.com>
Subject: Re: RE: [EXTERNAL] Re: [PATCH] fix avoidable segfault
 in catclose

On Fri, Jan 26, 2024 at 07:12:59PM +0000, Andy Caldwell wrote:
> > > > > > And it has been musl policy to crash on invalid args since the beginning.
> > > > >
> > > > > The current implementation doesn't (necessarily) crash/trap on an
> > > > > invalid argument, instead it invokes (C-language spec-defined) UB
> > > > > > itself (it dereferences `(uint32_t*)((char*)cat + 8)`, which, in
> > > > > the case of the `-1` handle is the address 0x7, which in turn, not
> > > > > being a valid address, is UB to dereference). If you're lucky (or
> > > > > are compiling without optimizations/inlining) the compiler will
> > > > > emit a MOV that will trigger an access violation and hence a SEGV,
> > > > > if
> > > >
> > > > In general, it's impossible to test for "is this pointer valid?"
> > > >
> > > > There are certain special cases we could test for, but unless there
> > > > is a particularly convincing reason that they could lead to runaway
> > > > wrong execution/vulnerabilities prior to naturally trapping, we have
> > > > > not considered littering the code with these kinds of checks to be a
> > > > > worthwhile trade-off.
> > > >
> > > > > you're unlucky the compiler will make wild assumptions about the
> > > > > value of the variable passed as the arg (and for example in your
> > > > > first code snippet, simply delete the `if` statement, meaning
> > > > > `use_cat` gets called even when `catopen` fails potentially
> > > > > corrupting user data/state).
> > > >
> > > > I have no idea what you're talking about there. The compiler cannot
> > > > make that kind of transformation (lifting code that could produce
> > > > undefined behavior, side effects, etc. out of a conditional).
> > >
> > > > It's a hypothetical, but something like the following is valid for the
> > > > compiler to do:
> > >
> > > * inline the catclose (e.g. in LTO for a static link)
> > > > * consider the `if` statement and ask "what if `cat` is `-1`?"
> > > * look forward to the pointer dereference (confirming that `cat` can't
> > > change in the interim)
> > > * realise that `0x7` is not a valid pointer on the target platform so
> > > UB is inevitable if `cat` is `-1`
> > > * optimize out the comparison since UB frees the compiler of any
> > > responsibilities
> > 
> > You have the logic backwards. In the case where cat==(cat_t)-1, catclose is not
> > called on the abstract machine, so no conclusions can be drawn from anything
> > catclose would do.
> 
> The original code I was working from was:
> 
> ```
> nl_catd cat = catopen(...);
> if (cat != (nl_catd)-1) {
>     use_cat(cat);
> }
> catclose(cat);
> ```
> 
> (i.e. an incorrect use of the APIs, but not UB in a "C99 spec"
> sense). In that code the `catclose` call is provably inevitable,
> allowing the compiler to infer properties of `cat` from it.

Ah, okay, at least now that makes sense. But indeed it is undefined:

  "Each of the following statements shall apply to all functions
   unless explicitly stated otherwise in the detailed descriptions
   that follow:

   1. If an argument to a function has an invalid value (such as a
      value outside the domain of the function, or a pointer outside
      the address space of the program, or a null pointer), the
      behavior is undefined.

   ..."

https://pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html#tag_15_01
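
Read against the snippet quoted above, that rule puts the burden on the
caller: catclose may only be reached with a descriptor that catopen
actually returned. A minimal sketch of a conforming caller follows.
The names with_catalog and use_cat are hypothetical (use_cat stands in
for the function in the quoted code), and the set/message numbers
passed to catgets are arbitrary.

```c
#include <nl_types.h>
#include <stdio.h>

/* Hypothetical stand-in for use_cat() in the quoted snippet. */
static void use_cat(nl_catd cat)
{
    /* catgets returns the supplied default if the message is missing. */
    puts(catgets(cat, 1, 1, "default message"));
}

/* Returns -1 if catopen failed, otherwise the result of catclose. */
static int with_catalog(const char *name)
{
    nl_catd cat = catopen(name, 0);
    if (cat == (nl_catd)-1)
        return -1;          /* nothing was opened, so nothing to close */
    use_cat(cat);
    return catclose(cat);   /* only reached with a valid descriptor */
}
```

With this shape, catclose is never called with (nl_catd)-1, so the
question of what it does with that value does not arise.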

So I guess what you're saying is that, in the case where an erroneous
program like the above has undefined behavior, the compiler could make
a transformation such that the effect of the UB is seen at a point
different from where it logically occurs. (This is the norm for UB.)
In particular, despite cat being -1 from a failed catopen, you might
see use_cat being called with a seemingly impossible argument.
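
The same shape can be sketched without nl_types.h at all. Everything
below is a hypothetical analog, not musl code: read_size plays the role
of catclose (it reads through its argument unconditionally) and the
store to *used plays the role of use_cat. Whether any particular
compiler actually performs the transformation described in the
comments is not something this sketch demonstrates; it only shows why
the transformation would be permitted.

```c
#include <stddef.h>

/* Hypothetical header layout, for illustration only. */
struct cat_hdr {
    unsigned magic;
    unsigned size;
};

/* Analog of catclose: dereferences its argument without checking it. */
static unsigned read_size(const struct cat_hdr *h)
{
    return h->size;
}

static int process(const struct cat_hdr *h, int *used)
{
    if (h != NULL)
        *used = 1;          /* analog of use_cat(cat) */

    /*
     * This dereference happens on every path. If h were NULL the
     * behavior would be undefined, so once read_size is inlined the
     * compiler may assume h != NULL and is then free to treat the
     * check above as always true.
     */
    return (int)read_size(h);
}
```

Called with a valid pointer, process is well-defined and sets *used.
Called with NULL, the visible symptom could be the store to *used
rather than a fault at the dereference.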

Exacerbating the degree to which UB can become non-localized is one of
the expected effects of LTO, and arguably a good reason not to use LTO
for debugging. I don't see a lot of value in trying to prevent this in
isolated cases when it's going to happen all over the place anyway for
other reasons.

Rich
