|
|
Message-ID: <20180920012411.GZ17995@brightrain.aerifal.cx>
Date: Wed, 19 Sep 2018 21:24:11 -0400
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: Re: [PATCH] new tsearch implementation
On Wed, Sep 19, 2018 at 02:02:47AM +0200, Szabolcs Nagy wrote:
> new code that's a bit faster and smaller.
>
> with simple benchmark of randomly adding/deleting nodes
> new code is about 1.5s vs old 2s on my laptop (many
> tsearch/tdelete operations on a tree size around 1000
> nodes and depth around 10).
> From 57e62eb255d835ba6801f704c4c1aabd28d34804 Mon Sep 17 00:00:00 2001
> From: Szabolcs Nagy <nsz@...t70.net>
> Date: Sun, 21 Aug 2016 20:06:56 +0000
> Subject: [PATCH] new tsearch implementation
>
> Rewrote the AVL tree implementation:
>
> - It is now non-recursive with fixed stack usage (large enough for
> worst case tree height). twalk and tdestroy are still recursive as
> that's smaller/simpler.
>
> - Moved unrelated interfaces into separate translation units.
>
> - The node structure is changed to use indexed children instead of
> left/right pointers, this simplifies the balancing logic.
>
> - Using void * pointers instead of struct node * in various places,
> because this better fits the api (node address is passed in a void**
> argument, so it is tempting to incorrectly cast it to struct node **).
>
> - As a further performance improvement the rebalancing now stops
> when it is not needed (subtree height is unchanged). Otherwise
> the behaviour should be the same as before (checked over generated
> random inputs that the resulting tree shape is equivalent).
>
> - Removed the old copyright notice (including prng related one: it
> should be licensed under the same terms as the rest of the project).
>
> .text size of pic tsearch + tfind + tdelete + twalk:
>
> x86_64 i386 aarch64 arm mips powerpc ppc64le sh4 m68k s390x
> old 941 899 1220 1068 1852 1400 1600 1008 1008 1488
> new 857 881 1040 976 1564 1192 1360 736 820 1408
Nice! Given that you seem to have done heavy testing I didn't read the
code with a lot of scrutiny, but it all looks good from a casual
reading.
> ---
> COPYRIGHT | 6 --
> src/search/tdelete.c | 49 ++++++++++
> src/search/tdestroy.c | 13 +--
> src/search/tfind.c | 20 ++++
> src/search/tsearch.c | 88 +++++++++++++++++
> src/search/tsearch.h | 13 +++
> src/search/tsearch_avl.c | 204 ---------------------------------------
> src/search/twalk.c | 22 +++++
> 8 files changed, 196 insertions(+), 219 deletions(-)
> create mode 100644 src/search/tdelete.c
> create mode 100644 src/search/tfind.c
> create mode 100644 src/search/tsearch.c
> create mode 100644 src/search/tsearch.h
> delete mode 100644 src/search/tsearch_avl.c
> create mode 100644 src/search/twalk.c
>
> diff --git a/COPYRIGHT b/COPYRIGHT
> index 8c3a6d19..b8a76045 100644
> --- a/COPYRIGHT
> +++ b/COPYRIGHT
> @@ -126,12 +126,6 @@ in jurisdictions that may not recognize the public domain.
> The smoothsort implementation (src/stdlib/qsort.c) is Copyright © 2011
> Valentin Ochs and is licensed under an MIT-style license.
>
> -The BSD PRNG implementation (src/prng/random.c) and XSI search API
> -(src/search/*.c) functions are Copyright © 2011 Szabolcs Nagy and
> -licensed under following terms: "Permission to use, copy, modify,
> -and/or distribute this code for any purpose with or without fee is
> -hereby granted. There is no warranty."
> -
> The x86_64 port was written by Nicholas J. Kain and is licensed under
> the standard MIT terms.
This part fails to apply because your MUA reencoded the attachment to
latin1. Not a big problem (piping thru iconv -f latin1 fixed it) but
it's something you might want to be aware of. Ideally mutt should have
a way to tell it not to reencode attachments, ever, but this generally
works to avoid the problem assuming your data is UTF-8:
set send_charset="us-ascii:utf-8"
> diff --git a/src/search/tsearch.c b/src/search/tsearch.c
> new file mode 100644
> index 00000000..54f9f699
> --- /dev/null
> +++ b/src/search/tsearch.c
> @@ -0,0 +1,88 @@
> +#include <stdlib.h>
> +#include <search.h>
> +#include "tsearch.h"
> +
> +static inline int height(struct node *n) { return n ? n->h : 0; }
> +
> +static int rot(void **p, struct node *x, int dir /* deeper side */)
> +{
> + struct node *y = x->a[dir];
> + struct node *z = y->a[!dir];
> + int hx = x->h;
> + int hz = height(z);
> + if (hz > height(y->a[dir])) {
> + // x
> + // / \ dir z
> + // A y / \
> + // / \ --> x y
> + // z D /| |\
> + // / \ A B C D
> + // B C
> + x->a[dir] = z->a[!dir];
> + y->a[!dir] = z->a[dir];
> + z->a[!dir] = x;
> + z->a[dir] = y;
> + x->h = hz;
> + y->h = hz;
> + z->h = hz+1;
> + } else {
> + // x y
> + // / \ / \
> + // A y --> x D
> + // / \ / \
> + // z D A z
These comments trigger -Wcomment since they have lines ending in \ in
a //-form comment. I think the easiest solution is converting them to
the style of /**/ comment used elsewhere, with *'s lined up in each
line.
Rich
Powered by blists - more mailing lists
Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.