|
Message-ID: <20230201180115.GB2626@voyager> Date: Wed, 1 Feb 2023 19:01:15 +0100 From: Markus Wichmann <nullplan@....net> To: musl@...ts.openwall.com Subject: Re: Re:Re: qsort On Mon, Jan 30, 2023 at 06:04:39PM +0800, David Wang wrote: > WIth glibc, average report is "qsort:3351271634 vs sort:4358412048" > But on alpine, it is "qsort:9585927992 vs sort:4354856330" > > On average, qsort with musl is significantly slower than c++ sort, > while with glibc the average performance of qsort is bettern than c++ > sort. > It seems to me as though smoothsort does not optimize for the number of comparisons. > Is there any story about the implementation of qsort in musl? I feel > it focused on performance improvement for some special kind of domain, > but not clear what it is. > Smoothsort has the desirable property of being adaptive. Its runtime gets closer to O(n) the more sorted the input already is. glibc uses mergesort or quicksort (the latter as fallback) and neither of them has that property. Plus, mergesort requires scratch storage and has a significantly harder time sorting arrays with large elements, because you end up constantly copying stuff. glibc tries to mitigate this by indirectly sorting once the elements go above 32 bytes in size. Basically, glibc is optimized for the comparisons, musl more for the number of swaps. Although we really shouldn't loose sight of the compares entirely, since those are indirect function calls, and current processors seem to dislike those. Anyway, there is an inefficiency in musl's smoothsort implementation, namely in the sift() function: | if(cmp(ar[0], lf, arg) >= 0 && cmp(ar[0], rt, arg) >= 0) { | break; | } | if(cmp(lf, rt, arg) >= 0) { | ar[i++] = lf; | head = lf; | pshift -= 1; | } else { | ar[i++] = rt; | head = rt; | pshift -= 2; | } As you can see, this does two or three comparisions, but actually needs to do only two in any case: If we detect ar[0] >= lf && ar[0] < rt, then by transitivity we have also detected rt > lf, so there is no need to compare lf and rt directly anymore. I have not really found a good way to formulate code that takes advantage of that, maybe something like this? if(cmp(ar[0], lf, arg) < 0) { if(cmp(lf, rt, arg) >= 0) { ar[i++] = lf; head = lf; pshift -= 1; } else { go_right: ar[i++] = rt; head = rt; pshift -= 2; } } else if (cmp(ar[0], rt, arg) < 0) { goto go_right; } else { break; } Yeah, not great. But it does use only two compares ever. This ought to lower the number of compares in your test somewhat. HTH, Markus
Powered by blists - more mailing lists
Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.