|
Message-ID: <6658f4b9-7b5b-4c1b-8fc7-e9f2801fe3b0@catalyst.net.nz> Date: Tue, 25 Jun 2024 11:49:40 +1200 From: Douglas Bagnall <douglas.bagnall@...alyst.net.nz> To: oss-security@...ts.openwall.com, Qualys Security Advisory <qsa@...lys.com> Subject: Re: Out-of-bounds read & write in the glibc's qsort() [ For newer subscribers, I'll mention that this is in reply to https://www.openwall.com/lists/oss-security/2024/01/30/7 ] On 31/01/24 07:39, Qualys Security Advisory wrote: > We discovered a memory corruption in the glibc's qsort() function, due > to a missing bounds check. To be vulnerable, a program must call qsort() > with a nontransitive comparison function (a function cmp(int a, int b) > that returns (a - b), for example) and with a large number of attacker- > controlled elements (to cause a malloc() failure inside qsort()). We > have not tried to find such a vulnerable program in the real world. > > All glibc versions from at least September 1992 (glibc 1.04) to the > current release (glibc 2.38) are affected, but the glibc's developers > have independently discovered and patched this memory corruption in the > master branch (commit b9390ba, "stdlib: Fix array bounds protection in > insertion sort phase of qsort") during a recent refactoring of qsort(). In or before 2006 [1], the Samba project forked and modified a copy of glibc _quicksort() to create the qsort_r-like ldb_qsort(). In those days qsort_r() was rare (glibc got it in 2008), and Samba wanted comparison semantics to follow changes in the LDAP schema, for which the qsort_r context blob was thought useful. [1] https://gitlab.com/samba-team/samba/-/commit/de296b65136c3d7b6170db099b626d94fa190306 The code we copied was not the try-mergesort-first qsort() glibc uses (and used then) -- it was the quicksort-of-last-resort that Qualys reports on. I guess we were aiming for simplicity. At times we have discussed switching to qsort_r(), but that always founders because each libc does it slightly differently (and qsort_s does not solve every problem). So Samba was very susceptible to this bug. The good news is ldb_qsort() is not used in very many places, and some of those places already used transitive comparison functions. The other good news is that *now* it is patched with the "tmp_ptr > base_ptr &&" fix. And the third good news is that the comparison functions are fixed. That's in 4.19.7 and 4.20.2. We also had quite a large number of non-transitive comparisons using the system qsort(). These are vulnerable with glibc under memory pressure, as described by Qualys, who also note: > Remotely, forcing this malloc() failure is harder: either allocate a > large amount of memory (e.g., a memory leak) in the network service that > is being targeted, or in another network service on the same machine. Samba servers are quite complex, and it would be easy for the attacker to miss the right allocation and induce a different failure. There are almost certainly other more appealing and less noticeable attacks. Nevertheless we have been auditing and fixing all our compare functions. The lack of a clear attack helped us choose do that without an embargo shroud and the associated fuss. In any case, I fear there is greater danger for a project like Samba from a bad sort that *doesn't* crash the server (more on this later). The thing I found interesting, and which led me to write, is that we knew all the pieces of this, but never put them together. For a start, we were well aware that glibc qsort() uses mergesort by default, because (as the glibc people discovered when they tried to change it) we had inadvertently come to rely on it. This caused bugs on other platforms [2] and even led us to write our own stable sort for when we really need it. [2] https://lists.samba.org/archive/samba-technical/2018-March/126023.html We have also been fixing bugs in ldb_qsort() comparison functions without fully realising that we were dealing with a class of problem. For example, in the bug we called CVE-2019-14861 [3], we have comments like "This looks like a bug in ldb_qsort()" and "when I patch ldb_qsort() for qsort_r() the problem goes away". But when we find the problem in the compare function we say "I'm assuming that the issue is due to the unstable sort in dns_name_compare()", and sort of forget that ldb_qsort() is *also* the problem. [3] https://bugzilla.samba.org/show_bug.cgi?id=14138, fixed in https://gitlab.com/samba-team/samba/-/commit/defb23732515e3c638d0081f5e4043fbb35d303c The problem we fixed in dns_name_compare() was this: /* '@' record and the search_name record gets preference */ if (name1[0] == '@') { return -1; } if (search_name && strcasecmp(name1, search_name) == 0) { return -1; } which tries to unconditionally squish names starting with "@" and whatever search_name is to the start of the array (which is the problem end for these buggy quicksorts). When the array has both search_name and an @-name, or multiple @-names, the comparison was non-transitive. Git says I reviewed the CVE-2019-14861 patches, though I don't remember anything (I mean, CVSS 5.3, authenticated client can crash the server, who cares). But mistrust of ldb_qsort() lingered, and I saw what the Qualys report meant for it. The CVE-2019-14861 patch fixed the explicit attempt to push multiple values into one spot, but it left this in the same function: if (name1 == NULL || name2 == NULL) { return 0; } which is non-transitive, given we otherwise compare name1 and name2. For example, consider names {"a", NULL, "b"}. Then we have "a" < "b" "a" == NULL "b" == NULL meaning the order in which we compare things will affect how they end up. Perhaps this was never enough to go out of bounds, and perhaps there was never going to be a NULL in this list. In any case, now we go: if (name1 == name2) { /* this includes the both NULL case */ return 0; } if (name1 == NULL) { return -1; } if (name2 == NULL) { return 1; } which is transitive, sorting NULLs to the beginning of the list for easy discovery (we check in the next loop). I found a 2006 bug report [4] with the name "SEGV crash due to random libads/dns.c qsort() comparison function". [4] https://bugzilla.samba.org/show_bug.cgi?id=3959 fixed in https://gitlab.com/samba-team/samba/-/commit/18feaab9d556c4bd41a19823b3982e2bc30a2055 To quote from the bug: The qsort() used in libads/dns.c is regularly trashing the list of Domain Controllers. This is because the comparison function dnssrvcmp() is using rand(). AFAIK, that's not recommended -- repeated calls to the qsort() comparison function with the same args _must_ return the same, consistent results. [....] Hence this: int bad_compare(const void *l, const void *r) { return (rand() % 3) - 1; } is a terrible function to use, and causes real qsort()s to run off into the weeds. We weren't doing that, exactly, FWIW. It is good to be reminded that in 2006 it was normal for a qsort to "run off into the weeds". The problem for projects like Samba is it is easy for us to pick up little pieces of 2006 and carry them along with as we go. We also have several commits like this one from 2009 [5]: librpc: fixed the GUID_compare() function When comparing two unsigned values you can't just subtract them. Imagine you are comparing: "uint32_t u1" and "uint32_t u2". If you use "u1 - u2" and u2 is zero, then the signed integer result will depend on the top bit of u1. This error occurs in a few places in Samba. For DRS replication it resulted in corrupt uptodateness vectors. [5] https://gitlab.com/samba-team/samba/-/commit/a106fefcfb0cb60ce439884d8cd0c920d2fb193a What I am getting at is we have been all over this class of bug, but in a shamefully ad-hoc and folkloric way -- "you can't use rand()", "you can't subtract unsigned values", "you can't push two things into the same place", "you can't subtract" -- all of which are incomplete (and/or overreaching; you can safely subtract char, for example). Somehow we never got to the obvious position that all comparisons have to be transitive. Thanks to Qualys for the clarity. Thanks I guess also to C for being so unsafe and report-worthy. What worries me more about a bad sort in Samba AD is the ability to hide things without crashing. Attackers are likely to prefer a persistent unseen presence over bringing down a server process. Being in the wrong place in a sorted list might not seem like great hiding, but we do more complicated things with the sorted arrays, including binary search. The comparison in the binary search will visit a subset of values in a different order that the sort, so will likely step over or be thrown awry by the missorted element. That means the bad comparison could make something appear to vanish in some circumstances and not others. (I didn't look for actual instances of this, I just fixed the comparison functions). As I mentioned, ldb_qsort() and most comparison functions are fixed in 4.19.7 and 4.20.2 (both released earlier this month). One particularly complicated comparison function required an API change in the public libldb, so it will appear in 4.21 (probably September). That one isn't used with ldb_qsort(), so it is likely safe or safe-ish, depending on your libc. Douglas
Powered by blists - more mailing lists
Please check out the Open Source Software Security Wiki, which is counterpart to this mailing list.
Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.