Message-ID: <20151207130344.GZ23362@port70.net>
Date: Mon, 7 Dec 2015 14:03:44 +0100
From: Szabolcs Nagy <nsz@...t70.net>
To: Ed Schouten <ed@...i.nl>
Cc: musl@...ts.openwall.com
Subject: Re: AVL tree: storing balances instead of heights

* Ed Schouten <ed@...i.nl> [2015-12-07 09:46:54 +0100]:
> Hi Szabolcs,
> 
> Thanks again for your quick response to the bug in tdelete() that I
> reported! I ran into this issue because I was comparing
> implementations of tsearch() and tdelete() across different operating
> systems. I was doing this in preparation for adding these functions to
> CloudABI's C library.
> 
> I noticed that musl's implementation explicitly stores the height of
> the elements to compute the balance factor. What I read the other day
> is that this is not strictly necessary. It turns out that just storing
> the balances is sufficient. Though this required me to spend some time
> making notes to distinguish all of the states individually, it looks
> like the implementation itself becomes a lot simpler. This approach
> also makes it possible to more easily stop rebalancing as soon as a
> subtree has become balanced again.
> 
> Below are links to my implementation of these functions:
> 
> https://github.com/NuxiNL/cloudlibc/blob/master/src/libc/search/tsearch.c
> https://github.com/NuxiNL/cloudlibc/blob/master/src/libc/search/tdelete.c
> https://github.com/NuxiNL/cloudlibc/blob/master/src/libc/search/search_impl.h
> 
> If you like them, feel free to include them in musl as well. They are
> currently 2-clause BSD licensed, but I don't mind making them
> available under the MIT license as well if this is more practical for
> you folks.
> 

adding musl@ as it should be discussed there.

yes it is enough to only store the height difference.
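
a minimal sketch of insertion that keeps only a balance field per node.
this is not the musl or cloudlibc code: the node layout and the names
rotate, fix_heavy, insert and check are made up for illustration, and
balance is assumed to mean height(right) - height(left).

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical node: balance = height(right) - height(left), always -1, 0 or +1. */
struct node {
	int key;
	struct node *child[2]; /* 0 = left, 1 = right */
	signed char balance;
};

/* Make child[dir] of *p the new subtree root; balances are fixed by the caller. */
static void rotate(struct node **p, int dir)
{
	struct node *n = *p, *c = n->child[dir];
	n->child[dir] = c->child[!dir];
	c->child[!dir] = n;
	*p = c;
}

/* *p was already heavy on side dir and that side grew by one: rebalance it. */
static void fix_heavy(struct node **p, int dir)
{
	struct node *n = *p, *c = n->child[dir];
	int sign = dir ? 1 : -1;
	assert(c->balance != 0); /* holds after an insertion, not after a deletion */
	if (c->balance == sign) {
		/* single rotation */
		rotate(p, dir);
		n->balance = c->balance = 0;
	} else {
		/* double rotation: the grandchild g becomes the subtree root */
		struct node *g = c->child[!dir];
		rotate(&n->child[dir], !dir);
		rotate(p, dir);
		n->balance = g->balance == sign ? -sign : 0;
		c->balance = g->balance == -sign ? sign : 0;
		g->balance = 0;
	}
}

/* Insert key; returns 1 if the height of the subtree at *p grew, else 0.
 * Allocation failure is silently reported as "did not grow" in this sketch. */
static int insert(struct node **p, int key)
{
	struct node *n = *p;
	if (!n) {
		n = calloc(1, sizeof *n);
		if (!n) return 0;
		n->key = key;
		*p = n;
		return 1;
	}
	if (key == n->key) return 0;
	int dir = key > n->key;
	if (!insert(&n->child[dir], key)) return 0; /* nothing changed below */
	int sign = dir ? 1 : -1;
	if (n->balance == sign) {
		fix_heavy(p, dir);
		return 0; /* the rotation restores the old height */
	}
	n->balance += sign;
	return n->balance != 0; /* 0: the shorter side caught up, stop here */
}

/* Height of t, or -1 if some stored balance disagrees with the real heights. */
static int check(const struct node *t)
{
	if (!t) return 0;
	int l = check(t->child[0]), r = check(t->child[1]);
	if (l < 0 || r < 0 || r - l != t->balance) return -1;
	if (t->balance < -1 || t->balance > 1) return -1;
	return 1 + (l > r ? l : r);
}
```

the return value of insert is what lets the caller stop updating
ancestors as soon as a subtree keeps its old height.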

note that this code was supposed to be size optimized in musl:
i think the tsearch api is not well designed and thus rarely
useful.

your balancing approach is better (i wouldn't call it a lot
simpler, but it is more efficient).

musl should probably move tfind and twalk into separate
translation units (but tsearch and tdelete need the same
balancing logic so i'd keep those together).

i think the macro definitions for inlining twalk and tfind
are not justified.

you got the rootp==0 case wrong too (posix requires returning
a null pointer when rootp is null).
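
a sketch of the guard, written as a wrapper around the libc tdelete
so it stays short. tdelete_checked and cmp_int are hypothetical names;
a real implementation would do the check at the top of tdelete itself,
before rootp is dereferenced.

```c
#include <search.h>
#include <stddef.h>

/* Hypothetical wrapper: a null rootp yields a null pointer without
 * ever being dereferenced; everything else goes to the libc tdelete. */
static void *tdelete_checked(const void *restrict key, void **restrict rootp,
                             int (*compar)(const void *, const void *))
{
	if (!rootp)
		return NULL;
	return tdelete(key, rootp, compar);
}

/* Example comparison function for int keys. */
static int cmp_int(const void *a, const void *b)
{
	const int *x = a, *y = b;
	return (*x > *y) - (*x < *y);
}
```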

and the (T**) cast is invalid in

	void *tdelete(const void *restrict key, void **restrict rootp,
	              int (*compar)(const void *, const void *)) {
	  void *result = (void *)1;
	  tdelete_recurse(key, (struct __tnode **)rootp, compar, &result);
	  return result;
	}
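
my reading of the problem: rootp points to an object of type void *,
and after the cast it gets accessed as a struct __tnode *, which is a
different, incompatible pointer type. one way to avoid that is to
convert the pointer value rather than the pointer to it. the sketch
below uses a made-up node type and a made-up operation (take_min,
pop_min), not the actual tdelete logic.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node type, standing in for struct __tnode. */
struct node {
	const void *key;
	struct node *left, *right;
};

/* Typed helper: unlink and return the leftmost node of a non-empty tree. */
static struct node *pop_min(struct node **p)
{
	assert(*p);
	while ((*p)->left)
		p = &(*p)->left;
	struct node *min = *p;
	*p = min->right;
	min->right = NULL;
	return min;
}

/* Entry point with a void ** root, like the tsearch family. */
static void *take_min(void **rootp)
{
	if (!rootp || !*rootp)
		return NULL;
	struct node *root = *rootp;        /* converts the value: void * to struct node * */
	struct node *min = pop_min(&root); /* &root really is a struct node ** */
	*rootp = root;                     /* store back through the void * object */
	return min;
}
```

the helper only ever sees a genuine struct node * object, so no cast
of rootp is needed.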

posix specifies to return a pointer to a node, not to an element
pointer, i think that's a bug in posix (otherwise the api would
be useless).
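
a sketch of how callers use the return value in practice: they treat
it as a pointer to the stored key pointer. this relies on the node
starting with the key pointer, which holds for musl and glibc but is
an assumption here, not something the posix text quoted above
promises. intern and cmp_str are hypothetical names.

```c
#include <search.h>
#include <stddef.h>
#include <string.h>

/* Comparison function for string keys. */
static int cmp_str(const void *a, const void *b)
{
	return strcmp(a, b);
}

/* Insert key if no equal key is present; return the key pointer that
 * the tree holds, or NULL if tsearch fails. */
static const char *intern(void **rootp, const char *key)
{
	void *node = tsearch(key, rootp, cmp_str);
	if (!node)
		return NULL;
	/* Assumption: the node begins with the stored key pointer. */
	return *(const void *const *)node;
}
```

calling intern twice with equal strings at different addresses returns
the address that was inserted first.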
