Message-ID: <20130613014314.GC29800@brightrain.aerifal.cx>
Date: Wed, 12 Jun 2013 21:43:14 -0400
From: Rich Felker <dalias@...ifal.cx>
To: musl@...ts.openwall.com
Subject: Re: Thinking about release

On Thu, Jun 13, 2013 at 01:33:16PM +1200, Andre Renaud wrote:
> Hi Rich,
> 
> > Most of the other major items left on the agenda since the last
> > release are probably not going to happen right away unless there's a
> > volunteer to do them (zoneinfo, cpuset/affinity, string functions
> > cleanup, C++ ABI matching, ARM-optimized memcpy) and one, the ld.so
> > symlink direction issue, still requires some serious discussion and
> > decision-making.
> 
> Regarding the ARM-optimisations - I am happy to have a go at providing
> a cleaned up implementation, although I can't recall what the final
> consensus was on how this should be implemented. A simple ARMv4

I think the first step should be benchmarking on real machines.
Somebody tried the asm that was posted and claimed it was no faster
than musl's C code. I don't know what hardware they were using, and I
don't recall offhand who made the claim or where it was reported.
Before we start writing or importing code, we need a good idea of how
the current C code compares in performance to other "optimized"
implementations.
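
As a rough illustration of the kind of comparison meant here, the
following is a minimal sketch of a throughput harness that times any
memcpy-compatible function. The names bench_memcpy and memcpy_bytewise
are hypothetical and are not taken from musl or libc-bench; the byte
loop is only a stand-in for "plain C code", and the empty asm statement
is a GCC/Clang extension used as a compiler barrier.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Any function with memcpy's signature can be benchmarked. */
typedef void *(*memcpy_fn)(void *, const void *, size_t);

/* Hypothetical stand-in for a plain C implementation: a byte loop. */
static void *memcpy_bytewise(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    while (n--) *d++ = *s++;
    return dst;
}

/* Copies `size` bytes `iters` times with `fn` and returns the measured
 * throughput in bytes per second of CPU time, or -1.0 if allocation
 * fails, the copy is wrong, or the run was too short to measure. */
static double bench_memcpy(memcpy_fn fn, size_t size, unsigned iters)
{
    unsigned char *src = malloc(size);
    unsigned char *dst = malloc(size);
    if (!src || !dst) {
        free(src);
        free(dst);
        return -1.0;
    }
    memset(src, 0xA5, size);
    memset(dst, 0, size);

    clock_t t0 = clock();
    for (unsigned i = 0; i < iters; i++) {
        fn(dst, src, size);
        /* Compiler barrier (GCC/Clang extension) so the copies are
         * not optimized away or merged across iterations. */
        __asm__ volatile("" : : "r"(dst) : "memory");
    }
    clock_t t1 = clock();

    int ok = memcmp(dst, src, size) == 0;
    free(src);
    free(dst);

    double secs = (double)(t1 - t0) / CLOCKS_PER_SEC;
    if (!ok || secs <= 0.0) return -1.0;
    return (double)size * iters / secs;
}
```

Calling bench_memcpy(memcpy, n, iters) and
bench_memcpy(memcpy_bytewise, n, iters) over a range of sizes and
alignments would give a first comparison; a single size says little,
since small and large copies tend to behave very differently.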

> implementation would cover all the bases, providing near universal
> support, although would obviously not support the more modern
> platforms. Is there any intention to move the base level support up to
> ARMv5? I would consider that reasonable, given the age of ARMv4.
> Alternatively, should we have multiple implementations
> (ARMv4/ARMv5/ARMv7), and choose between them either at compile or
> run-time?

It's possible to branch based on __hwcap at runtime, if this would
really help.
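
To make the idea concrete, here is a minimal sketch of selecting a copy
routine from a hardware-capability bitmask. It is not musl's code: the
names select_memcpy, memcpy_generic and memcpy_wide_stub are
hypothetical, and DEMO_HWCAP_NEON is an illustrative bit that is
assumed here to mirror the ARM Linux NEON flag. The mask is passed in
as a parameter so the sketch does not depend on the host; on Linux it
could come from getauxval(AT_HWCAP) in <sys/auxv.h>, or from __hwcap
inside libc.

```c
#include <stddef.h>
#include <string.h>

/* Illustrative feature bit (assumption); real code on ARM Linux would
 * take the HWCAP_* constants from <asm/hwcap.h>. */
#define DEMO_HWCAP_NEON (1UL << 12)

typedef void *(*memcpy_fn)(void *, const void *, size_t);

/* Baseline routine that runs on every supported CPU. */
static void *memcpy_generic(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    while (n--) *d++ = *s++;
    return dst;
}

/* Placeholder for a routine tuned for newer cores. It simply forwards
 * to the C library here; a real version would be hand-written asm. */
static void *memcpy_wide_stub(void *dst, const void *src, size_t n)
{
    return memcpy(dst, src, n);
}

/* Picks an implementation from the capability mask. Doing this once
 * and caching the pointer keeps the test out of the hot path. */
static memcpy_fn select_memcpy(unsigned long hwcap)
{
    return (hwcap & DEMO_HWCAP_NEON) ? memcpy_wide_stub : memcpy_generic;
}
```

Whether the indirect call costs more than the tuned routine saves,
especially for small copies, is something only measurement can settle.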

> Obviously this stuff is probably not destined for the immediate
> release, but more likely for the one after that.

Yes, this looks like a process that will take some time: first sorting
out the facts, then tuning the code.

For what it's worth, I just did my first runs of libc-bench on real
ARM hardware (well, an FPGA-based ARM). memset is half the speed of
glibc's, but strchr and strlen are about 40% faster than glibc's. I
don't think libc-bench is really a good benchmark as of yet, so we
should probably develop more detailed tests.

Rich
