musl - Re: Word-sized reads access memory past the bound of objects

Message-ID: <20130430154045.GF20323@brightrain.aerifal.cx>
Date: Tue, 30 Apr 2013 11:40:45 -0400
From: Rich Felker <dalias@...ifal.cx>
To: musl@...ts.openwall.com
Subject: Re: Word-sized reads access memory past the bound of objects

On Tue, Apr 30, 2013 at 05:11:14PM +0200, Jonas Wagner wrote:
> Hi,
> 
> I'm currently experimenting with MUSL and automated bug finding tools. One
> issue I'm facing is that the tool reports several errors in functions such
> as strlen, that perform word-size accesses. What happens is that strlen
> reads a word at a time, then checks whether there is a zero in there. If
> the zero happens to be in the first byte, it thus reads three bytes past
> the end of the string.
> 
> In principle, the tool is correct and MUSL does cause undefined behavior

Yes and no. The "underlying freestanding implementation" that musl
assumes and is built on represents all of mapped memory as arrays in
page-size units, with mapping properties/permissions at page
granularity. However, testing and analysis tools might offer a more
restrictive underlying model.

> here. In practice, I don't see a way how MUSL's behavior could cause any
> damage...

Read-only accesses aligned to the size of the access, and where the
initial byte is accessible, can never fault under the assumed memory
model.
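
A minimal sketch of why that holds, assuming a 4096-byte page size and
that the word size divides the page size. The names ASSUMED_PAGE_SIZE,
has_zero_byte, page_of and aligned_word_stays_in_page are made up for
illustration and are not taken from musl's sources. The first helper
is the usual bit trick for spotting a zero byte in a word; the second
checks that an aligned word-sized read starts and ends in the same
page, so it cannot touch a page other than the one holding its first
byte.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed page size, for illustration only; real systems vary. */
#define ASSUMED_PAGE_SIZE 4096u

/* Words with the lowest / highest bit of every byte set. */
#define ONES  ((size_t)-1 / UCHAR_MAX)
#define HIGHS (ONES * (UCHAR_MAX / 2 + 1))

/* Nonzero if and only if at least one byte of the word x is zero. */
static int has_zero_byte(size_t x)
{
    return ((x - ONES) & ~x & HIGHS) != 0;
}

/* Index of the (assumed-size) page that contains address addr. */
static uintptr_t page_of(uintptr_t addr)
{
    return addr / ASSUMED_PAGE_SIZE;
}

/* True if a word-sized read at the word-aligned address addr
 * begins and ends inside the same page. */
static int aligned_word_stays_in_page(uintptr_t addr)
{
    assert(addr % sizeof(size_t) == 0); /* precondition: aligned */
    return page_of(addr) == page_of(addr + sizeof(size_t) - 1);
}
```

This only illustrates the page-granularity argument. It says nothing
about what a stricter model, such as the one an analysis tool
enforces, will accept.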

> My questions are:
> - How prevalent is such code in MUSL?

Not very. Probably src/string and src/multibyte are the only places.

> - Would there be an easy way to find all these places and change them?

The tool you're using is probably the best way. Or, any static
analysis that can detect conversions (even indirect) from character
pointer types to pointers to non-character types.

> - Are there other types of "soft" undefined behavior that MUSL exploits?

I don't think so. The closest things I can think of:

- UTF-8 code depends on sign-extending right-shift. This could be
  easily fixed if it can be verified that the standard trick to work
  around it generates the same (or equally efficient) code. Note this
  is implementation-defined, not undefined.

- Floating point conversion to/from strings depends on IEEE arithmetic
  properties and on long double being an IEEE conforming type. (x87
  ld80 is fine, so is IEEE quad, but IBM double-double will not work;
  systems that typically use IBM double-double should have their
  compiler configured for 64-bit long double instead.)

- calloc assumes its own implementation of malloc. Compilers and
  analysis tools which assume negative offsets from the pointer
  returned by malloc are invalid will falsely detect problems and/or
  miscompile calloc.c. This issue affected old versions of clang.

- The dynamic linker also makes some assumptions about the
  implementation of malloc and passes pointers not obtained by malloc
  to free, as part of its mechanism to reclaim wasted slack space in
  shared libraries due to page alignment.

- POSIX timers with SIGEV_THREAD perform a longjmp out of a
  cancellation handler to intercept cancellation/exit so the same
  physical thread can be kept to handle the next timer expiration. For
  an application to do this would be UB (at the POSIX level, not the C
  level) but since they're both part of the same implementation they
  can assume things about each other.
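
For the sign-extending right shift in the first item above, here is
one commonly used workaround, sketched for 32-bit values. Whether this
is the exact trick meant above is an assumption, and the name sar32 is
made up for illustration. It only ever shifts non-negative values, so
the result does not depend on implementation-defined behavior.

```c
#include <assert.h>
#include <stdint.h>

/* Arithmetic (sign-extending) right shift of x by n bits that never
 * applies >> to a negative value. */
static int32_t sar32(int32_t x, int n)
{
    assert(n >= 0 && n < 32); /* shift count must be in range */
    /* For negative x, ~x is non-negative, so its shift is fully
     * defined; complementing again yields the sign-extended result. */
    return x < 0 ? ~(~x >> n) : x >> n;
}
```

For example, sar32(-8, 1) yields -4 and sar32(-1, 5) yields -1.
Whether a compiler turns this into a single shift instruction is the
part that would still need to be verified.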

That's all that comes to mind right now. Thanks for bringing up this
question, because it's something that should be documented in case
people want to reuse parts of musl in contexts where some of the
assumptions may no longer be valid.

> I guess changing MUSL would lose a lot of performance... so maybe
> I'll adapt the bug finding tool instead...

Maybe. With a compiler that can do vectorization and a machine with
vector instructions, the "naive" versions of these functions can be
just as fast in practice, and perhaps even faster in theory. The big
problem is that gcc won't vectorize four single-byte accesses into one
32-bit word in a normal 32-bit register, even though it could... Maybe
in the long term this won't matter if we have asm for the important
archs without vector ops...?
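
For reference, a "naive" version in the sense used above could look
like the sketch below. The name naive_strlen is made up for
illustration. It reads one byte at a time and never reads past the
terminating null byte, so it stays within the bounds of the string
object. Whether a given compiler vectorizes this loop is not something
the sketch demonstrates.

```c
#include <assert.h>
#include <stddef.h>

/* Byte-at-a-time string length: stops at the first null byte and
 * never reads beyond it. */
static size_t naive_strlen(const char *s)
{
    assert(s != NULL); /* precondition: valid string pointer */
    const char *p = s;
    while (*p)
        p++;
    return (size_t)(p - s);
}
```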

Rich