Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [day] [month] [year] [list]
Message-Id: <20210309034401.22201-1-ericonr@disroot.org>
Date: Tue,  9 Mar 2021 00:44:01 -0300
From: Érico Nogueira <ericonr@...root.org>
To: musl@...ts.openwall.com
Cc: Érico Nogueira <ericonr@...root.org>
Subject: [PATCH] add qsort_r.

since most discussion around the addition of this function has centered
around the possible code duplication it requires or that qsort would
become much slower if implemented as a wrapper around qsort_r, this
commit uses an impl file that's included with the appropriate
definitions in order to generate qsort_r or the old qsort. this avoids
code duplication and the efficiency loss, at the cost of greater binary
size - size(1) reports 602282 vs 600673.

this commit requires that the extra argument used by qsort_r always be
called 'arg', and it should be the first argument of the function, which
allows the usage of variadic macros to call the implementation internal
functions, instead of requiring the addition of multiple macros with
hard to read definitions. it also requires the comparison function
always be called 'cmp'.

some of the other functions which don't deal with 'cmp' or 'arg' can
probably be factored out into a separate file, allowing for possibly
smaller code size, but that hasn't been tried yet.
---

The commit diff is, however, rather large.

Unfortunately, I'm not sure the status around qsort_r has changed much.
The points raised in [1] are mostly still valid and the linked bugs are
still open, including Leah's summary in [2]. The FreeBSD bug [3] seems
to be stalled since the end of 2018.

The Austin Group bug [4] mentions another implementation possibility,
where qsort would become a qsort_r wrapper:

	void qsort(void *base, size_t nel, size_t width,
				  int (*compar)(const void *, const void *))
	{
	  qsort_r(base, nel, width,
				 (int (*)(const void*, const void*, void*))compar, NULL);
	}

but it would be undefined, per Rich's comment in [5].

[1] https://inbox.vuxu.org/musl/20180903205705.GA7639@localhost/
[2] https://inbox.vuxu.org/musl/87lg8h8dw1.fsf@vuxu.org/
[3] https://reviews.freebsd.org/D17083
[4] https://www.austingroupbugs.net/view.php?id=900
[5] https://inbox.vuxu.org/musl/20181216015704.GD23599@brightrain.aerifal.cx/

Despite all these negatives, my hope is that musl implementing the
function with the glibc signature (which seems to be the agreed upon way
forward) might move the bureacracy some. For now, musl distros have
managed to deal with the lack of qsort_r reasonably fine, but I don't
know what the silent cost (as in software that couldn't be easily
packaged or used and no one complained) has been. For the "louder" cost
that I know of:

- util-linux loses some small functionality in libfdisk if qsort_r isn't
  available
- elfutils required a patch that removed qsort_r usage up until very
  recently, and might in the future come to require it again, if the
  place where it's used is made threaded (for better or for worse, using
  qsort_r is prettier than using a __thread global :)

 include/stdlib.h        |   1 +
 src/stdlib/qsort.c      | 219 +------------------------------------
 src/stdlib/qsort_impl.h | 232 ++++++++++++++++++++++++++++++++++++++++
 src/stdlib/qsort_r.c    |   4 +
 4 files changed, 239 insertions(+), 217 deletions(-)
 create mode 100644 src/stdlib/qsort_impl.h
 create mode 100644 src/stdlib/qsort_r.c

diff --git a/include/stdlib.h b/include/stdlib.h
index b54a051f..0c0ced5f 100644
--- a/include/stdlib.h
+++ b/include/stdlib.h
@@ -158,6 +158,7 @@ struct __locale_struct;
 float strtof_l(const char *__restrict, char **__restrict, struct __locale_struct *);
 double strtod_l(const char *__restrict, char **__restrict, struct __locale_struct *);
 long double strtold_l(const char *__restrict, char **__restrict, struct __locale_struct *);
+void qsort_r (void *, size_t, size_t, int (*)(const void *, const void *, void *), void *);
 #endif
 
 #if defined(_LARGEFILE64_SOURCE) || defined(_GNU_SOURCE)
diff --git a/src/stdlib/qsort.c b/src/stdlib/qsort.c
index da58fd31..2f9e024f 100644
--- a/src/stdlib/qsort.c
+++ b/src/stdlib/qsort.c
@@ -1,218 +1,3 @@
-/* Copyright (C) 2011 by Valentin Ochs
- *
- * Permission is hereby granted, free of charge, to any person obtaining a copy
- * of this software and associated documentation files (the "Software"), to
- * deal in the Software without restriction, including without limitation the
- * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
- * sell copies of the Software, and to permit persons to whom the Software is
- * furnished to do so, subject to the following conditions:
- *
- * The above copyright notice and this permission notice shall be included in
- * all copies or substantial portions of the Software.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
- * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
- * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
- * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
- * IN THE SOFTWARE.
- */
-
-/* Minor changes by Rich Felker for integration in musl, 2011-04-27. */
-
-/* Smoothsort, an adaptive variant of Heapsort.  Memory usage: O(1).
-   Run time: Worst case O(n log n), close to O(n) in the mostly-sorted case. */
-
-#include <stdint.h>
-#include <stdlib.h>
-#include <string.h>
-
-#include "atomic.h"
-#define ntz(x) a_ctz_l((x))
-
 typedef int (*cmpfun)(const void *, const void *);
-
-static inline int pntz(size_t p[2]) {
-	int r = ntz(p[0] - 1);
-	if(r != 0 || (r = 8*sizeof(size_t) + ntz(p[1])) != 8*sizeof(size_t)) {
-		return r;
-	}
-	return 0;
-}
-
-static void cycle(size_t width, unsigned char* ar[], int n)
-{
-	unsigned char tmp[256];
-	size_t l;
-	int i;
-
-	if(n < 2) {
-		return;
-	}
-
-	ar[n] = tmp;
-	while(width) {
-		l = sizeof(tmp) < width ? sizeof(tmp) : width;
-		memcpy(ar[n], ar[0], l);
-		for(i = 0; i < n; i++) {
-			memcpy(ar[i], ar[i + 1], l);
-			ar[i] += l;
-		}
-		width -= l;
-	}
-}
-
-/* shl() and shr() need n > 0 */
-static inline void shl(size_t p[2], int n)
-{
-	if(n >= 8 * sizeof(size_t)) {
-		n -= 8 * sizeof(size_t);
-		p[1] = p[0];
-		p[0] = 0;
-	}
-	p[1] <<= n;
-	p[1] |= p[0] >> (sizeof(size_t) * 8 - n);
-	p[0] <<= n;
-}
-
-static inline void shr(size_t p[2], int n)
-{
-	if(n >= 8 * sizeof(size_t)) {
-		n -= 8 * sizeof(size_t);
-		p[0] = p[1];
-		p[1] = 0;
-	}
-	p[0] >>= n;
-	p[0] |= p[1] << (sizeof(size_t) * 8 - n);
-	p[1] >>= n;
-}
-
-static void sift(unsigned char *head, size_t width, cmpfun cmp, int pshift, size_t lp[])
-{
-	unsigned char *rt, *lf;
-	unsigned char *ar[14 * sizeof(size_t) + 1];
-	int i = 1;
-
-	ar[0] = head;
-	while(pshift > 1) {
-		rt = head - width;
-		lf = head - width - lp[pshift - 2];
-
-		if((*cmp)(ar[0], lf) >= 0 && (*cmp)(ar[0], rt) >= 0) {
-			break;
-		}
-		if((*cmp)(lf, rt) >= 0) {
-			ar[i++] = lf;
-			head = lf;
-			pshift -= 1;
-		} else {
-			ar[i++] = rt;
-			head = rt;
-			pshift -= 2;
-		}
-	}
-	cycle(width, ar, i);
-}
-
-static void trinkle(unsigned char *head, size_t width, cmpfun cmp, size_t pp[2], int pshift, int trusty, size_t lp[])
-{
-	unsigned char *stepson,
-	              *rt, *lf;
-	size_t p[2];
-	unsigned char *ar[14 * sizeof(size_t) + 1];
-	int i = 1;
-	int trail;
-
-	p[0] = pp[0];
-	p[1] = pp[1];
-
-	ar[0] = head;
-	while(p[0] != 1 || p[1] != 0) {
-		stepson = head - lp[pshift];
-		if((*cmp)(stepson, ar[0]) <= 0) {
-			break;
-		}
-		if(!trusty && pshift > 1) {
-			rt = head - width;
-			lf = head - width - lp[pshift - 2];
-			if((*cmp)(rt, stepson) >= 0 || (*cmp)(lf, stepson) >= 0) {
-				break;
-			}
-		}
-
-		ar[i++] = stepson;
-		head = stepson;
-		trail = pntz(p);
-		shr(p, trail);
-		pshift += trail;
-		trusty = 0;
-	}
-	if(!trusty) {
-		cycle(width, ar, i);
-		sift(head, width, cmp, pshift, lp);
-	}
-}
-
-void qsort(void *base, size_t nel, size_t width, cmpfun cmp)
-{
-	size_t lp[12*sizeof(size_t)];
-	size_t i, size = width * nel;
-	unsigned char *head, *high;
-	size_t p[2] = {1, 0};
-	int pshift = 1;
-	int trail;
-
-	if (!size) return;
-
-	head = base;
-	high = head + size - width;
-
-	/* Precompute Leonardo numbers, scaled by element width */
-	for(lp[0]=lp[1]=width, i=2; (lp[i]=lp[i-2]+lp[i-1]+width) < size; i++);
-
-	while(head < high) {
-		if((p[0] & 3) == 3) {
-			sift(head, width, cmp, pshift, lp);
-			shr(p, 2);
-			pshift += 2;
-		} else {
-			if(lp[pshift - 1] >= high - head) {
-				trinkle(head, width, cmp, p, pshift, 0, lp);
-			} else {
-				sift(head, width, cmp, pshift, lp);
-			}
-			
-			if(pshift == 1) {
-				shl(p, 1);
-				pshift = 0;
-			} else {
-				shl(p, pshift - 1);
-				pshift = 1;
-			}
-		}
-		
-		p[0] |= 1;
-		head += width;
-	}
-
-	trinkle(head, width, cmp, p, pshift, 0, lp);
-
-	while(pshift != 1 || p[0] != 1 || p[1] != 0) {
-		if(pshift <= 1) {
-			trail = pntz(p);
-			shr(p, trail);
-			pshift += trail;
-		} else {
-			shl(p, 2);
-			pshift -= 2;
-			p[0] ^= 7;
-			shr(p, 1);
-			trinkle(head - lp[pshift] - width, width, cmp, p, pshift + 1, 1, lp);
-			shl(p, 1);
-			p[0] |= 1;
-			trinkle(head - width, width, cmp, p, pshift, 1, lp);
-		}
-		head -= width;
-	}
-}
+#define call_cmp(v1,v2) ((*cmp)(v1,v2))
+#include "qsort_impl.h"
diff --git a/src/stdlib/qsort_impl.h b/src/stdlib/qsort_impl.h
new file mode 100644
index 00000000..c8d76e50
--- /dev/null
+++ b/src/stdlib/qsort_impl.h
@@ -0,0 +1,232 @@
+/* Copyright (C) 2011 by Valentin Ochs
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to
+ * deal in the Software without restriction, including without limitation the
+ * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
+ * sell copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ */
+
+/* Minor changes by Rich Felker for integration in musl, 2011-04-27. */
+
+/* Smoothsort, an adaptive variant of Heapsort.  Memory usage: O(1).
+   Run time: Worst case O(n log n), close to O(n) in the mostly-sorted case. */
+
+#include <stdint.h>
+#include <stdlib.h>
+#include <string.h>
+
+#include "atomic.h"
+#define ntz(x) a_ctz_l((x))
+
+static inline int pntz(size_t p[2]) {
+	int r = ntz(p[0] - 1);
+	if(r != 0 || (r = 8*sizeof(size_t) + ntz(p[1])) != 8*sizeof(size_t)) {
+		return r;
+	}
+	return 0;
+}
+
+static void cycle(size_t width, unsigned char* ar[], int n)
+{
+	unsigned char tmp[256];
+	size_t l;
+	int i;
+
+	if(n < 2) {
+		return;
+	}
+
+	ar[n] = tmp;
+	while(width) {
+		l = sizeof(tmp) < width ? sizeof(tmp) : width;
+		memcpy(ar[n], ar[0], l);
+		for(i = 0; i < n; i++) {
+			memcpy(ar[i], ar[i + 1], l);
+			ar[i] += l;
+		}
+		width -= l;
+	}
+}
+
+/* shl() and shr() need n > 0 */
+static inline void shl(size_t p[2], int n)
+{
+	if(n >= 8 * sizeof(size_t)) {
+		n -= 8 * sizeof(size_t);
+		p[1] = p[0];
+		p[0] = 0;
+	}
+	p[1] <<= n;
+	p[1] |= p[0] >> (sizeof(size_t) * 8 - n);
+	p[0] <<= n;
+}
+
+static inline void shr(size_t p[2], int n)
+{
+	if(n >= 8 * sizeof(size_t)) {
+		n -= 8 * sizeof(size_t);
+		p[0] = p[1];
+		p[1] = 0;
+	}
+	p[0] >>= n;
+	p[0] |= p[1] << (sizeof(size_t) * 8 - n);
+	p[1] >>= n;
+}
+
+#ifdef QSORT_R
+# define sift(...) sift_impl(arg, __VA_ARGS__)
+static void sift_impl(void *arg, unsigned char *head, size_t width, cmpfun cmp, int pshift, size_t lp[])
+#else
+# define sift sift_impl
+static void sift_impl(unsigned char *head, size_t width, cmpfun cmp, int pshift, size_t lp[])
+#endif
+{
+	unsigned char *rt, *lf;
+	unsigned char *ar[14 * sizeof(size_t) + 1];
+	int i = 1;
+
+	ar[0] = head;
+	while(pshift > 1) {
+		rt = head - width;
+		lf = head - width - lp[pshift - 2];
+
+		if(call_cmp(ar[0], lf) >= 0 && call_cmp(ar[0], rt) >= 0) {
+			break;
+		}
+		if(call_cmp(lf, rt) >= 0) {
+			ar[i++] = lf;
+			head = lf;
+			pshift -= 1;
+		} else {
+			ar[i++] = rt;
+			head = rt;
+			pshift -= 2;
+		}
+	}
+	cycle(width, ar, i);
+}
+
+#ifdef QSORT_R
+# define trinkle(...) trinkle_impl(arg, __VA_ARGS__)
+static void trinkle_impl(void *arg, unsigned char *head, size_t width, cmpfun cmp, size_t pp[2], int pshift, int trusty, size_t lp[])
+#else
+# define trinkle trinkle_impl
+static void trinkle_impl(unsigned char *head, size_t width, cmpfun cmp, size_t pp[2], int pshift, int trusty, size_t lp[])
+#endif
+{
+	unsigned char *stepson,
+	              *rt, *lf;
+	size_t p[2];
+	unsigned char *ar[14 * sizeof(size_t) + 1];
+	int i = 1;
+	int trail;
+
+	p[0] = pp[0];
+	p[1] = pp[1];
+
+	ar[0] = head;
+	while(p[0] != 1 || p[1] != 0) {
+		stepson = head - lp[pshift];
+		if(call_cmp(stepson, ar[0]) <= 0) {
+			break;
+		}
+		if(!trusty && pshift > 1) {
+			rt = head - width;
+			lf = head - width - lp[pshift - 2];
+			if(call_cmp(rt, stepson) >= 0 || call_cmp(lf, stepson) >= 0) {
+				break;
+			}
+		}
+
+		ar[i++] = stepson;
+		head = stepson;
+		trail = pntz(p);
+		shr(p, trail);
+		pshift += trail;
+		trusty = 0;
+	}
+	if(!trusty) {
+		cycle(width, ar, i);
+		sift(head, width, cmp, pshift, lp);
+	}
+}
+
+#ifdef QSORT_R
+void qsort_r(void *base, size_t nel, size_t width, cmpfun cmp, void *arg)
+#else
+void qsort(void *base, size_t nel, size_t width, cmpfun cmp)
+#endif
+{
+	size_t lp[12*sizeof(size_t)];
+	size_t i, size = width * nel;
+	unsigned char *head, *high;
+	size_t p[2] = {1, 0};
+	int pshift = 1;
+	int trail;
+
+	if (!size) return;
+
+	head = base;
+	high = head + size - width;
+
+	/* Precompute Leonardo numbers, scaled by element width */
+	for(lp[0]=lp[1]=width, i=2; (lp[i]=lp[i-2]+lp[i-1]+width) < size; i++);
+
+	while(head < high) {
+		if((p[0] & 3) == 3) {
+			sift(head, width, cmp, pshift, lp);
+			shr(p, 2);
+			pshift += 2;
+		} else {
+			if(lp[pshift - 1] >= high - head) {
+				trinkle(head, width, cmp, p, pshift, 0, lp);
+			} else {
+				sift(head, width, cmp, pshift, lp);
+			}
+
+			if(pshift == 1) {
+				shl(p, 1);
+				pshift = 0;
+			} else {
+				shl(p, pshift - 1);
+				pshift = 1;
+			}
+		}
+
+		p[0] |= 1;
+		head += width;
+	}
+
+	trinkle(head, width, cmp, p, pshift, 0, lp);
+
+	while(pshift != 1 || p[0] != 1 || p[1] != 0) {
+		if(pshift <= 1) {
+			trail = pntz(p);
+			shr(p, trail);
+			pshift += trail;
+		} else {
+			shl(p, 2);
+			pshift -= 2;
+			p[0] ^= 7;
+			shr(p, 1);
+			trinkle(head - lp[pshift] - width, width, cmp, p, pshift + 1, 1, lp);
+			shl(p, 1);
+			p[0] |= 1;
+			trinkle(head - width, width, cmp, p, pshift, 1, lp);
+		}
+		head -= width;
+	}
+}
diff --git a/src/stdlib/qsort_r.c b/src/stdlib/qsort_r.c
new file mode 100644
index 00000000..e0500dcb
--- /dev/null
+++ b/src/stdlib/qsort_r.c
@@ -0,0 +1,4 @@
+typedef int (*cmpfun)(const void *, const void *, void *);
+#define call_cmp(v1,v2) ((*cmp)(v1,v2,arg))
+#define QSORT_R
+#include "qsort_impl.h"
-- 
2.30.1

Powered by blists - more mailing lists

Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.