|
Message-ID: <20201208225717.GK534@brightrain.aerifal.cx> Date: Tue, 8 Dec 2020 17:57:18 -0500 From: Rich Felker <dalias@...c.org> To: Brooks Davis <brooks@...ebsd.org> Cc: Brooks Davis <brooks@...-eyed-alien.net>, musl@...ts.openwall.com Subject: Re: out-of-bounds reads in strstr On Tue, Dec 08, 2020 at 10:44:54PM +0000, Brooks Davis wrote: > On Tue, Dec 08, 2020 at 02:53:28PM -0500, Rich Felker wrote: > > On Tue, Dec 08, 2020 at 07:39:19PM +0000, Brooks Davis wrote: > > > The strstr implementation contains the following snippet which results > > > in out-of-bounds reads in memchr (we detect them on CHERI because we > > > have byte-granularity bounds of small buffers): > > > > > > /* Fast estimate for MIN(l,63) */ > > > size_t grow = l | 63; > > > const unsigned char *z2 = memchr(z, 0, grow); > > > > > > The use of `|` means this is very much not an approximation of > > > `MIN(l,63)`. What is actually intended here? For CheriBSD (via FreeBSD) > > > I need a way to avoid out-of-bounds reads entirely (`MIN(l,63)` does seem > > > to work in simple system-level testing, but given the mismatch it's > > > unclear that's what was intended). > > > > There is no OOB read in strstr here. The overread is in the > > implementation of memchr, which (if you're using the musl version) is > > relying on the ability to overread as long as the address is aligned > > (assuming protection is at boundaries larger than word size). If this > > is a problem, you should use a version of memchr that does not > > overread. Note that, per 7.24.5.1 The memchr function, ??2: > > > > "The implementation shall behave as if it reads the characters > > sequentially and stops as soon as a matching character is found." > > > > so strstr's use of memchr here is perfectly correct/valid without any > > assumption of implementation details, i.e. [at least this part of] > > strstr is portable C. > > Ok, I see that. I'll adjust our (musl-derived) memchr to work > correctly in this case. > > That being said. I'm still confused by the comment in strstr. `l | 63` > is closer to `MAX(l,63)` than `MIN(l,63)`. Yes, the comment is wrong. The point is just to scan at least l bytes forward for the end of the haystack (since we'll need that many immediately) and at least some decent minimum to avoid doing it over and over if the needle is short. But there's no need for it to be precise. Rich
Powered by blists - more mailing lists
Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.