Message-ID: <87mted0yge.ffs@tglx>
Date: Thu, 16 Jun 2022 11:06:25 +0200
From: Thomas Gleixner <tglx@...utronix.de>
To: Arnd Bergmann <arnd@...nel.org>, musl@...ts.openwall.com
Cc: John Stultz <jstultz@...gle.com>, Stephen Boyd <sboyd@...nel.org>,
    Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
    Adhemerval Zanella <adhemerval.zanella@...aro.org>
Subject: Re: Question about musl's time() implementation in time.c

On Wed, Jun 15 2022 at 14:09, Arnd Bergmann wrote:
> On Wed, Jun 15, 2022 at 1:28 AM Rich Felker <dalias@...c.org> wrote:
> Adding the kernel timekeeping maintainers to Cc. I think this is a
> reasonable argument, but it goes against the current behavior.
>
> We have four implementations of the time() syscall that one would
> commonly encounter:
>
> - The kernel syscall, using (effectively) CLOCK_REALTIME_COARSE
> - The kernel vdso, using (effectively) CLOCK_REALTIME_COARSE
> - The glibc interface, calling __clock_gettime64(CLOCK_REALTIME_COARSE, ...)
> - The musl interface, calling __clock_gettime64(CLOCK_REALTIME, ...)
>
> So even if everyone agrees that the musl implementation is the
> correct one, I think both linux and glibc are more likely to stick with
> the traditional behavior to avoid breaking user space code such as the
> libc-test case that Zev brought up initially. At least Adhemerval's
> time() implementation in glibc[1] appears to have done this intentionally,
> while the Linux implementation has simply never changed this in an
> incompatible way since Linux-0.01 added time() and 0.99.13k added
> the high-resolution gettimeofday().

That's correct. Assume this call order:

       clock_gettime(REALTIME, &tr);
       clock_gettime(REALTIME_COARSE, &tc);
       tt = time();

You can observe

       tr->sec > tc->sec
       tr->sec > tt

but you can never observe

       tc->sec > tt

The reason for this is historical and time() has a distinct performance
advantage as it boils down to a single read and does not require the
sequence count (at least on 64bit).
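
The call order described above can be checked from user space. What follows is a minimal sketch, not code from this thread: it assumes Linux (CLOCK_REALTIME_COARSE is Linux-specific), and the names `struct ordering_counts` and `sample_ordering()` are hypothetical helpers invented for illustration. It samples the three interfaces in the stated order and counts how often each second-level comparison shows up.

```c
#define _GNU_SOURCE   /* make sure the Linux-specific CLOCK_REALTIME_COARSE is exposed */
#include <assert.h>
#include <time.h>

/* Hypothetical helper type for this sketch; not part of any libc. */
struct ordering_counts {
    long samples;         /* iterations completed */
    long real_gt_coarse;  /* tr.tv_sec > tc.tv_sec was observed */
    long real_gt_time;    /* tr.tv_sec > tt was observed */
    long coarse_gt_time;  /* tc.tv_sec > tt; expected to stay 0 */
};

/*
 * Read CLOCK_REALTIME, CLOCK_REALTIME_COARSE and time() in that order,
 * `samples` times, and count the second-level comparisons.
 * Returns 0 on success, -1 if any clock read fails.
 */
static int sample_ordering(long samples, struct ordering_counts *out)
{
    struct timespec tr, tc;
    time_t tt;

    assert(samples > 0 && out != NULL);
    *out = (struct ordering_counts){0};

    for (long i = 0; i < samples; i++) {
        if (clock_gettime(CLOCK_REALTIME, &tr) != 0)
            return -1;
        if (clock_gettime(CLOCK_REALTIME_COARSE, &tc) != 0)
            return -1;
        tt = time(NULL);
        if (tt == (time_t)-1)
            return -1;

        out->real_gt_coarse += (tr.tv_sec > tc.tv_sec);
        out->real_gt_time   += (tr.tv_sec > tt);
        out->coarse_gt_time += (tc.tv_sec > tt);
        out->samples++;
    }
    return 0;
}
```

Calling `sample_ordering()` from a small main() and printing the fields should show `coarse_gt_time` staying at 0, assuming the wall clock is not stepped backwards during the run. The other two counters only become non-zero when a sample straddles a second boundary inside the coarse clock's update interval, so short runs will often report 0 for them as well.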
Coarse REALTIME requires the seqcount, but avoids the hardware read and
the larger math.

The costly part is the hardware read. Before TSC became usable, the
hardware read was a matter of microseconds, so avoiding it was a
significant performance gain.

With a loop of 1e9 reads (including the loop overhead) as measured with
perf on a halfway recent SKL the average per invocation is:

time()                             7 cycles
clock_gettime(REAL_COARSE)        21 cycles
clock_gettime(REAL) TSC           60 cycles
clock_gettime(REAL) HPET        6092 cycles (~2000 cycles syscall overhead)
clock_gettime(REAL) ACPI_PM     4096 cycles (~2000 cycles syscall overhead)

So at the very end it boils down to performance and expectations. File
systems have chosen their granularity and the underlying mechanism to
get the timestamp according to that.

It's clearly not well documented, but I doubt that we can change the
implementation without running into measurable performance regressions.

VDSO based time() vs. clock_gettime(REAL) TSC is almost an order of
magnitude...

Thanks,

        tglx