|
|
Message-ID: <CAPfzE3b3w+wS2RCE_nncAp6c2_TN6Qh9JZWe-At26mqSd+YLjQ@mail.gmail.com>
Date: Wed, 24 Jul 2013 16:40:16 +1200
From: Andre Renaud <andre@...ewatersys.com>
To: musl@...ts.openwall.com
Subject: Re: Thinking about release
Hi Rich,
> It looks buggy as-is; as far as I can tell, it will crash if src/dest
> are aligned with respect to each other but not aligned mod 4, i.e. the
> code starts out copying word-at-a-time rather than byte-at-a-time.
Yes, you are correct, I'd messed that up while looking at the cache
alignment stuff (along with anoter small size related bug). Fixing it
is relatively straight forward though:
#define SS sizeof(size_t)
#define ALIGN (SS - 1)
void * noinline my_asm_memcpy(void * restrict dest, const void *
restrict src, size_t n)
{
unsigned char *d = dest;
const unsigned char *s = src;
if (((uintptr_t)d & ALIGN) != ((uintptr_t)s & ALIGN))
goto misaligned;
/* Get them word aligned */
for (; ((uintptr_t)d & ALIGN) && n; n--) *d++ = *s++;
/* ARM has 32-byte cache lines, so align to that for performance */
for (; ((uintptr_t)d & ((8 * SS) - 1)) && n >= SS; n-=SS) {
*(size_t *)d = *(size_t *)s;
d += SS;
s += SS;
}
/* Do full cache line read/writes */
for (; n>=(8 * SS); n-= (8 * SS))
__asm__ __volatile__(
"ldmia %1!,{a4,v1,v2,v3,v4,v5,v6,v7}\n\t"
"ldrhi r12, [%1]\n"
"stmia %0!,{a4,v1,v2,v3,v4,v5,v6,v7}\n\t"
: "+r"(d), "+r"(s) :
: "a4", "v1", "v2", "v3", "v4", "v5",
"v6", "v7", "r12", "memory");
misaligned:
for (; n; n--) *d++ = *s++;
return dest;
}
> I think the C version would be acceptable if we get the bugs fixed and
> test it well, but I'd also like to still keep the asm under
> consideration. There are lots of cases not covered by the C version,
> like misaligned copies (important for strings, not for much else). Do
> you think these cases are important?
At the moment the mis-aligned copies perform terribly (18MB/s vs glibc
@ 100MB/s). However the existing C implementation in musl is no
different, so we're not degrading the current system.
We're essentially missing the non-congruent copying stuff from the asm
code. I'll have a look at this and see if I can write a similar C
version.
Regards,
Andre
Powered by blists - more mailing lists
Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.