Message-ID: <20240205172345.GA27288@openwall.com>
Date: Mon, 5 Feb 2024 18:23:45 +0100
From: Solar Designer <solar@...nwall.com>
To: Qualys Security Advisory <qsa@...lys.com>
Cc: oss-security@...ts.openwall.com,
	Adhemerval Zanella <adhemerval.zanella@...aro.org>
Subject: Re: Out-of-bounds read & write in the glibc's qsort()

On Mon, Feb 05, 2024 at 03:56:41PM +0000, Qualys Security Advisory wrote:
> On Sun, Feb 04, 2024 at 05:35:20PM +0100, Solar Designer wrote:
> > It's so invasive I cannot easily tell whether qsort() remained robust
> > after it or not.  There's no longer a "tmp_ptr != base_ptr &&" check.
> > So, lacking known-working tests in glibc tree, we don't know about glibc
> > 2.39's status with respect to this issue.
> 
> The "tmp_ptr != base_ptr" bounds check was originally added to the
> _quicksort() function, but is not needed anymore in glibc 2.39 because
> the old fallback to quick sort (the _quicksort() function) has been
> completely removed and replaced by a fallback to heap sort.
> 
> Note, just in case: we have not reviewed the implementation of this new
> fallback to heap sort.

Oh, I should have spent a bit more time looking at the latest glibc
before posting.  I just did.  So it indeed did not reintroduce this same
issue.  That's great.

Regarding the tests, I now see that one of them explicitly calls
heapsort_r(), so it tests that fallback code directly.  However, the
rest simply call qsort() or qsort_r(), so they only test non-fallback
code.  It'd improve code coverage if these tests first did what they do
now, and then repeated the same after setting RLIMIT_AS to 0.
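
A minimal sketch of that idea follows.  It is not the actual glibc test
code.  The helper name sort_and_check() and its parameters are
hypothetical.  It lowers only the soft RLIMIT_AS to 0 around the qsort()
call, so that a large temporary allocation inside qsort() should fail,
then restores the limit and verifies the order.  Whether this actually
reaches a fallback path depends on the glibc version and on the array
being large enough that the temporary buffer cannot be served from
already-mapped memory; both are assumptions here.

```c
#include <stdlib.h>
#include <sys/resource.h>

/* Consistent (transitive) int comparison.  Note that the original bug
   needed a nontransitive comparison function to trigger. */
static int cmp_int(const void *a, const void *b)
{
    int x = *(const int *)a, y = *(const int *)b;
    return (x > y) - (x < y);
}

/* Hypothetical helper: fill an array of n ints with pseudo-random
   values, sort it with qsort(), and verify the result.
   If no_memory is nonzero, the soft RLIMIT_AS is set to 0 just around
   the qsort() call, so new address space cannot be obtained there.
   Returns 0 if sorted, 1 if out of order, -1 on setup failure. */
int sort_and_check(size_t n, int no_memory)
{
    struct rlimit saved, tight;
    unsigned int seed = 12345;
    int result = 0;
    size_t i;
    int *a;

    if (n == 0)
        return -1;
    a = malloc(n * sizeof(*a));
    if (!a)
        return -1;

    /* Simple LCG so the sketch does not depend on rand() state */
    for (i = 0; i < n; i++) {
        seed = seed * 1103515245u + 12345u;
        a[i] = (int)((seed >> 16) & 0x7fff);
    }

    if (no_memory) {
        if (getrlimit(RLIMIT_AS, &saved) != 0) {
            free(a);
            return -1;
        }
        tight = saved;
        tight.rlim_cur = 0; /* soft limit only, so it can be restored */
        if (setrlimit(RLIMIT_AS, &tight) != 0) {
            free(a);
            return -1;
        }
    }

    qsort(a, n, sizeof(*a), cmp_int);

    if (no_memory)
        setrlimit(RLIMIT_AS, &saved); /* restore before further work */

    for (i = 1; i < n; i++)
        if (a[i - 1] > a[i])
            result = 1;

    free(a);
    return result;
}
```

A test would call sort_and_check(n, 0) first and then
sort_and_check(n, 1) with the same n, expecting 0 from both.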

On Mon, Feb 05, 2024 at 05:02:52PM +0800, Alexander E. Patrakov wrote:
> On Mon, Feb 5, 2024 at 4:45 PM Alexander E. Patrakov <patrakov@...il.com> wrote:
> > On Mon, Feb 5, 2024 at 4:40 PM Alexander E. Patrakov <patrakov@...il.com> wrote:
> > > On Mon, Feb 5, 2024 at 12:36 AM Solar Designer <solar@...nwall.com> wrote:
> > > > I don't have a glibc 2.39 build handy.  Perhaps someone on a distro that
> > > > has already updated can run the attached test program and let us know?
> > >
> > > Here you go: no output on Arch Linux.
> > >
> > > [aep@...-haswell tmp]$ gcc ./glibc-qualys-rocky-qsort-test.c
> > > [aep@...-haswell tmp]$ ./a.out
> > > [aep@...-haswell tmp]$ /lib64/libc.so.6
> > > GNU C Library (GNU libc) stable release version 2.39.

> > Sorry, I should have followed the instructions.
> >
> > [aep@...-haswell tmp]$ while true; do n=$((RANDOM*64+RANDOM+1));
> > prlimit --as=$((n*4/2*3)) ./a.out $n; done
> >
> > This results in a mix of these outputs:
> >
> > PASSED
> > ./a.out: error while loading shared libraries: libc.so.6: failed to
> > map segment from shared object
> > Segmentation fault

> Upon investigation, I have to add: the segmentation faults come from
> code that runs before main(), so they do not indicate a problem in
> qsort().

Sorry, I should have included usage instructions.  It's like this:

gcc glibc-qualys-rocky-qsort-test.c -o glibc-qualys-rocky-qsort-test -O2
while true; do n=$((RANDOM*64+RANDOM+1)); echo $n; ./glibc-qualys-rocky-qsort-test $n; done

In other words, almost the same as Qualys', but with prlimit omitted
because the program itself now takes care of it.  With our current
patched glibc in Rocky Linux SIG/Security, the output is like this:

396121
PASSED
77207
PASSED
683895
PASSED
1402983
PASSED

and so on.  No crashes anymore.  Before the one-line patch, it would hit
the test program's abort() within seconds, like Qualys had observed:

153916
PASSED
990497
PASSED
1501673
PASSED
1344354
PASSED
176197
PASSED
326004
Aborted (core dumped)
1892398
Aborted (core dumped)
834837
PASSED
2066676
PASSED
589237
Aborted (core dumped)

As to the occasional segfaults when you do use prlimit, I also saw them
on Rocky Linux 9.  They appeared to come from the kernel right after
execve() fails and kind of returns control back to prlimit.  I think
they're a symptom of execve() concluding it ran out of memory too late
for it to allow the original program to continue running.  As I recall
from patching this code in the kernel many years ago, such conditions
did and probably still do exist.  That's kind of fine.

Alexander
