Message-ID: <CAK8P3a3WZB81QqAJF1zi0Lp0n2vKhjKhiCS4vcOoVi8jt-Y3aA@mail.gmail.com>
Date: Wed, 15 Jun 2022 14:09:16 +0200
From: Arnd Bergmann <arnd@...nel.org>
To: musl@...ts.openwall.com
Cc: John Stultz <jstultz@...gle.com>, Stephen Boyd <sboyd@...nel.org>, 
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>, Thomas Gleixner <tglx@...utronix.de>, 
	Adhemerval Zanella <adhemerval.zanella@...aro.org>
Subject: Re: Question about musl's time() implementation in time.c

On Wed, Jun 15, 2022 at 1:28 AM Rich Felker <dalias@...c.org> wrote:
> On Tue, Jun 14, 2022 at 11:11:32PM +0200, Arnd Bergmann wrote:
> >
> > The thing is that a lot of file systems would still behave the same way
> > because they round times down to a filesystem specific resolution,
> > often one microsecond or one second, while the kernel time accounting
> > is in nanoseconds. There have been discussions about an interface
> > to find out what the actual resolution on a given mount point is (similar
> > to clock_getres), but that never made it in. The guarantees that you
> > get from file systems at the moment are:
>
> It's normal that they may be rounded down to the filesystem timestamp
> granularity. I thought what was going on here was worse.

It gets rounded down twice: first down to the start of the current
timer tick, which is at an arbitrary nanosecond value in the past 10ms,
and then to the resolution of the file system. The result is that the
file timestamp can point to a slightly earlier value, up to roughly
max(timer tick cycle, fs resolution) before the actual nanosecond value
(strictly, less than the sum of the two). We don't advertise the
granularity of the file system though, so I would expect this to be
within the expected behavior.
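
As a rough illustration of the double rounding described above, here is
a small model in C. It is a sketch, not kernel code: the function names
and the tick_phase_ns parameter (the arbitrary offset at which ticks
start) are made up for this example, and all values are nanoseconds.
In this model the stored timestamp lags the true time by less than
tick_ns + fs_res_ns.

```c
#include <assert.h>
#include <stdint.h>

/* Round t down to a multiple of granule (both in nanoseconds). */
static uint64_t round_down(uint64_t t, uint64_t granule)
{
    assert(granule > 0);
    return t - (t % granule);
}

/*
 * Start of the timer tick that contains now_ns. Ticks are tick_ns apart
 * and begin at an arbitrary offset tick_phase_ns (hypothetical parameter).
 */
static uint64_t tick_start(uint64_t now_ns, uint64_t tick_phase_ns,
                           uint64_t tick_ns)
{
    assert(tick_ns > 0 && now_ns >= tick_phase_ns);
    return now_ns - ((now_ns - tick_phase_ns) % tick_ns);
}

/* Model of a file timestamp: first the tick start, then fs granularity. */
static uint64_t fs_timestamp(uint64_t now_ns, uint64_t tick_phase_ns,
                             uint64_t tick_ns, uint64_t fs_res_ns)
{
    return round_down(tick_start(now_ns, tick_phase_ns, tick_ns), fs_res_ns);
}
```

For example, with a 10ms tick at phase 3456789 and now_ns = 2000001000,
the tick start is 1993456789. A file system with one-microsecond
resolution would store 1993456000, and one with one-second resolution
would store 1000000000, a full second behind the tv_sec that a
CLOCK_REALTIME reading would give at the same instant.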

> OK, the time syscall doing the wrong thing here (using a different
> clock that's not correctly ordered with respect to CLOCK_REALTIME)
> seems to be the worst problem here -- if I'm understanding it right.
> The filesystem issue might be a non-issue if it's truly equivalent to
> just having coarser fs timestamp granularity, which is allowed.

Adding the kernel timekeeping maintainers to Cc. I think this is a
reasonable argument, but it goes against the current behavior.

We have four implementations of time() that one would commonly
encounter:

- The kernel syscall, using (effectively) CLOCK_REALTIME_COARSE
- The kernel vdso, using (effectively) CLOCK_REALTIME_COARSE
- The glibc interface, calling __clock_gettime64(CLOCK_REALTIME_COARSE, ...)
- The musl interface, calling __clock_gettime64(CLOCK_REALTIME, ...)
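
The difference between the two clock choices in the list above can be
sketched in userspace C as follows. This is only an illustration under
assumptions: the helper names are invented here, the code calls the
public clock_gettime() rather than the internal __clock_gettime64(),
and CLOCK_REALTIME_COARSE is Linux-specific.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <time.h>

/* Seconds from the tick-granular clock, as in the first three variants. */
static time_t time_coarse(void)
{
    struct timespec ts;
    int rc = clock_gettime(CLOCK_REALTIME_COARSE, &ts);
    assert(rc == 0);
    (void)rc;
    return ts.tv_sec;
}

/* Seconds from the high-resolution clock, as in the musl variant. */
static time_t time_fine(void)
{
    struct timespec ts;
    int rc = clock_gettime(CLOCK_REALTIME, &ts);
    assert(rc == 0);
    (void)rc;
    return ts.tv_sec;
}

/*
 * Returns 1 if a coarse reading taken after a fine reading still reports
 * an earlier second, i.e. the two clocks appear misordered; else 0.
 */
static int coarse_lags_fine(void)
{
    time_t fine = time_fine();
    time_t coarse = time_coarse();
    return coarse < fine;
}
```

Calling coarse_lags_fine() in a tight loop should occasionally return 1
in the short window after a second boundary and before the next timer
tick, which is the ordering problem discussed in this thread.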

So even if everyone agrees that the musl implementation is the
correct one, I think both Linux and glibc are more likely to stick with
the traditional behavior to avoid breaking user space code such as the
libc-test case that Zev brought up initially. At least Adhemerval's
time() implementation in glibc[1] appears to have done this intentionally,
while the Linux implementation has simply never changed this in an
incompatible way since Linux-0.01 added time() and 0.99.13k added
the high-resolution gettimeofday().

       Arnd

[1] https://sourceware.org/git/?p=glibc.git;a=commitdiff;h=0d56378349
