Message-ID: <08fd37cf-971c-2e6f-1b45-1442566b3416@redhat.com>
Date: Mon, 12 Oct 2020 14:18:54 +0200
From: Denys Vlasenko <dvlasenk@...hat.com>
To: Rich Felker <dalias@...c.org>, Denys Vlasenko <vda.linux@...glemail.com>
Cc: musl@...ts.openwall.com
Subject: Re: [PATCH] x86/memset: avoid performing final store twice

On 10/11/20 2:25 AM, Rich Felker wrote:
> On Sun, Oct 04, 2020 at 12:32:09AM +0200, Denys Vlasenko wrote:
>> From: Denys Vlasenko <dvlasenk@...hat.com>
>>
>> For not very short NBYTES case:
>>
>> To handle the tail alignment, the code performs a potentially
>> misaligned word store to fill the final 8 bytes of the buffer.
>> This is done even if the buffer's end is aligned.
>>
>> Eventually code fills the rest of the buffer, which is a multiple
>> of 8 bytes now, with NBYTES / 8 aligned word stores.
>>
>> However, this means that if NBYTES *was* divisible by 8,
>> we store last word too, again.
>>
>> This patch decrements byte count before dividing it by 8,
>> making one less store in "NBYTES is divisible by 8" case,
>> and not changing anything in all other cases.
>> ...
>> --- a/src/string/x86_64/memset.s
>> +++ b/src/string/x86_64/memset.s
>> @@ -53,7 +53,7 @@ memset:
>>  2:	test $15,%edi
>>  	mov %rdi,%r8
>>  	mov %rax,-8(%rdi,%rdx)
>> -	mov %rdx,%rcx
>> +	lea -1(%rdx),%rcx
>>  	jnz 2f
>>
>>  1:	shr $3,%rcx
>> --
>> 2.25.0
>
> Does this have measurably better performance on a system you've tested
> it on?

I did not test performance, I predict it will hardly be detectable.
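For readers who want to check the arithmetic in the patch description, here is a minimal C sketch of the store count before and after the change. It is not the musl code. It models only the path where the start of the buffer is already aligned (the `jnz 2f` branch is ignored), and the helper names and the lower bound of 16 bytes used for "not very short" are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Word stores issued by the fill loop before the patch: NBYTES / 8
 * (models "mov %rdx,%rcx" followed by "shr $3,%rcx"). */
size_t word_stores_before(size_t nbytes)
{
    return nbytes >> 3;
}

/* Word stores issued after the patch: (NBYTES - 1) / 8
 * (models "lea -1(%rdx),%rcx" followed by "shr $3,%rcx"). */
size_t word_stores_after(size_t nbytes)
{
    return (nbytes - 1) >> 3;
}

/* The tail store covers bytes [nbytes - 8, nbytes); the fill loop covers
 * bytes [0, 8 * count). Returns 1 if the two together cover the buffer. */
int buffer_fully_covered(size_t nbytes, size_t count)
{
    return 8 * count + 8 >= nbytes;
}

/* Hypothetical helper: checks the claim for every size in [lo, hi].
 * Assumes lo >= 8 so that the tail store fits inside the buffer. */
void check_sizes(size_t lo, size_t hi)
{
    for (size_t n = lo; n <= hi; n++) {
        size_t before = word_stores_before(n);
        size_t after = word_stores_after(n);

        /* The patched count must still leave no byte unwritten. */
        assert(buffer_fully_covered(n, after));

        /* One store fewer only when n is a multiple of 8. */
        assert(after == (n % 8 == 0 ? before - 1 : before));
    }
}
```

Calling `check_sizes(16, 4096)` exercises both cases: for NBYTES = 128 the count drops from 16 to 15, while for NBYTES = 130 it stays at 16, and in both cases the final 8 bytes are already written by the separate tail store.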