Message-ID: <20150210205458.GL23507@brightrain.aerifal.cx>
Date: Tue, 10 Feb 2015 15:54:58 -0500
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: Re: Re: [PATCH] x86_64/memset: simple optimizations

On Tue, Feb 10, 2015 at 09:52:54PM +0100, Denys Vlasenko wrote:
> On Tue, Feb 10, 2015 at 9:43 PM, Rich Felker <dalias@...ifal.cx> wrote:
> > On Tue, Feb 10, 2015 at 09:27:17PM +0100, Denys Vlasenko wrote:
> >> On Sat, Feb 7, 2015 at 2:06 PM, Rich Felker <dalias@...ifal.cx> wrote:
> >> /* libc has incredibly messy way of doing this,
> >>  * typically requiring -lrt. We just skip all this mess */
> >> static void get_mono(struct timespec *ts)
> >> {
> >>         syscall(__NR_clock_gettime, CLOCK_MONOTONIC, ts);
> >> }
> >
> > FWIW, this is a bad idea; you get syscall overhead in your
> > measurements. If you just use clock_gettime (the function) you'll get
> > vdso results (no syscall).
>
> I repeat memset 32 times between reading timestamps.
> Thus, even with "small" 20kb memset test
> there are 640kb of writes to L1. This is big enough
> to make overhead insignificant.

Yes, I agree it's probably okay the way you've structured the test
here; that's why I mentioned it as a "FWIW" rather than an objection
to the results. It was more an aside remark about how this technique
could be problematic in the future. Sorry for not being clear.

Rich
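
For reference, the approach discussed above (reading the clock through the clock_gettime function rather than a raw syscall, and repeating memset between the two reads so the clock overhead is amortized) might look roughly like the sketch below. The names mono_ns and time_memset_ns are hypothetical and do not come from the patch or the original test program; the empty asm statement is a GCC/Clang extension assumed here only to keep the compiler from dropping the stores.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <time.h>

/* Read CLOCK_MONOTONIC via the libc function. On Linux this is
 * typically served from the vDSO, so no kernel entry is needed. */
static int64_t mono_ns(void)
{
    struct timespec ts;
    int rc = clock_gettime(CLOCK_MONOTONIC, &ts);
    assert(rc == 0);
    (void)rc; /* silence unused warning when NDEBUG is defined */
    return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/* Hypothetical helper: run `reps` back-to-back memset calls over
 * `len` bytes between two clock reads and return the elapsed time
 * in nanoseconds. */
static int64_t time_memset_ns(void *buf, size_t len, int reps)
{
    int64_t start = mono_ns();
    for (int i = 0; i < reps; i++) {
        memset(buf, i, len);
        /* Compiler barrier (GCC/Clang extension, an assumption of
         * this sketch) so the stores are not optimized away. */
        __asm__ volatile("" : : "r"(buf) : "memory");
    }
    return mono_ns() - start;
}
```

Calling time_memset_ns(buf, 20 * 1024, 32) on a 20 KiB buffer corresponds to the 640kb of writes per measurement mentioned in the quoted reply.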