Message-ID: <CAK8P3a0jk736rPueff--Uor=tHmicHZgoikrAsjp0DHxmkaiWg@mail.gmail.com>
Date: Tue, 14 Jun 2022 23:11:32 +0200
From: Arnd Bergmann <arnd@...nel.org>
To: musl@...ts.openwall.com
Subject: Re: Question about musl's time() implementation in time.c

On Tue, Jun 14, 2022 at 10:49 PM Rich Felker <dalias@...c.org> wrote:
> On Tue, Jun 14, 2022 at 10:37:25PM +0200, Arnd Bergmann wrote:
> > On Tue, Jun 14, 2022 at 7:00 PM Rich Felker <dalias@...c.org> wrote:
> > > On Tue, Jun 14, 2022 at 06:50:40PM +0200, Arnd Bergmann wrote:
> > > > The coarse time can be up to one timer tick behind, so reading
> > > > CLOCK_REALTIME first can give you the exact second with a small
> > > > nanosecond value, while the utime will still set the previous value.
> > > >
> > > > Can you change the test case to check if the later time is less than
> > > > clock_getres(CLOCK_REALTIME_COARSE, ...) behind?
> > >
> > > This seems like a bug that the kernel uses the wrong clock for setting
> > > file timestamps. It can result in seeing events out-of-order (exactly
> > > as described in this thread). This should really be fixed or at least
> > > made switchable so users who care can fix it.
> >
> > I can't find any reference to what the correct clock is here,
> > are you sure that this is specified at all? The decision to use the coarse
> > time in the kernel is definitely intentional, as reading the hardware
> > clocksource can be expensive (depending on the hardware), and
> > changing the behavior would likely break applications that rely on
> > it being the coarse clock.
>
> POSIX specifies operations that set the file timestamps in terms of
> the system (CLOCK_REALTIME) clock, not a weird implementation-defined
> alternate clock.
>
> Maybe you're right that getting the correct clock is costly on some
> archs, but it's almost surely not on any arch that admits vdso
> clock_gettime. And "race that causes applications to see wrong
> ordering of filesystem operations with respect to other activity for
> the sake of performance" does not seem like a good idea.

The thing is that a lot of file systems would still behave the same way
because they round times down to a filesystem-specific resolution,
often one microsecond or one second, while the kernel time accounting
is in nanoseconds. There have been discussions about an interface
to find out what the actual resolution on a given mount point is (similar
to clock_getres), but that never made it in. The guarantees that you
get from file systems at the moment are:

- the timestamp is always rounded down, not up, so a newly
  created file never gets a timestamp that is newer than either
  CLOCK_REALTIME or CLOCK_REALTIME_COARSE as
  reported by a subsequent clock_gettime()/gettimeofday()/time().

- the in-memory timestamp is the same as the one you read back
  after umount/mount, and gets adjusted for both resolution
  and range of the on-disk representation.

- any file system that supports timestamps (some always
  report tv_sec=0) sets the timestamps to at most three
  seconds before the current time as read by an earlier
  time() syscall.
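
As a rough illustration of how a test case could check against these
guarantees rather than against CLOCK_REALTIME alone, here is a minimal
sketch in C. The helper names (ts_ns, stamp_plausible, check_new_file),
the /tmp scratch path and the fs_res_ns parameter are made up for this
example. Since there is no interface to query the file system's timestamp
resolution, the caller has to supply it as an assumption.

```c
#define _GNU_SOURCE
#include <stdint.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <time.h>
#include <unistd.h>

/* Convert a timespec to signed nanoseconds since the epoch. */
int64_t ts_ns(struct timespec ts)
{
    return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/*
 * Return 1 if a file timestamp is plausible: not newer than a clock
 * reading taken afterwards, and at most slack_ns older than a reading
 * taken beforehand. Return 0 otherwise.
 */
int stamp_plausible(struct timespec before, struct timespec stamp,
                    struct timespec after, int64_t slack_ns)
{
    return ts_ns(stamp) <= ts_ns(after) &&
           ts_ns(before) - ts_ns(stamp) <= slack_ns;
}

/*
 * Create a scratch file and check its mtime against CLOCK_REALTIME.
 * The allowed slack is the coarse clock resolution plus fs_res_ns, the
 * caller's assumption about the file system's timestamp resolution.
 * Returns 1 (plausible), 0 (not plausible) or -1 on error.
 */
int check_new_file(int64_t fs_res_ns)
{
    char path[] = "/tmp/stamp-XXXXXX";   /* hypothetical scratch location */
    struct timespec before, after, res;
    struct stat st;

    if (clock_gettime(CLOCK_REALTIME, &before))
        return -1;
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    int err = fstat(fd, &st);
    close(fd);
    unlink(path);
    if (err || clock_gettime(CLOCK_REALTIME, &after) ||
        clock_getres(CLOCK_REALTIME_COARSE, &res))
        return -1;
    return stamp_plausible(before, st.st_mtim, after,
                           ts_ns(res) + fs_res_ns);
}
```

Calling check_new_file(0) assumes nanosecond file system resolution;
passing 1000000000 would allow for full-second resolution.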

Making it use CLOCK_REALTIME instead of
CLOCK_REALTIME_COARSE would improve the third
guarantee so it could be within two seconds (or one second
on file systems with full-second resolution like ext3), but would
break the first rule by making it report timestamps that can
be either before or after the time reported by the time() syscall.
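
To make the last point concrete, here is a small arithmetic sketch in C.
It assumes that time() returns the seconds of the coarse clock and that
the coarse clock lags the precise clock by lag_ns (at most one timer
tick); the function name stamp_newer_than_time is made up for this
example.

```c
#include <stdint.h>

#define NSEC_PER_SEC 1000000000LL

/*
 * Given a precise CLOCK_REALTIME reading (fine_ns, nanoseconds since the
 * epoch) and the current lag of the coarse clock, return 1 if a file
 * stamped with the precise time would carry a tv_sec greater than what a
 * coarse-clock time() call made at the same moment returns, 0 otherwise.
 */
int stamp_newer_than_time(int64_t fine_ns, int64_t lag_ns)
{
    int64_t coarse_ns = fine_ns - lag_ns;     /* coarse clock is behind */
    int64_t stamp_sec = fine_ns / NSEC_PER_SEC;
    int64_t time_sec = coarse_ns / NSEC_PER_SEC;
    return stamp_sec > time_sec;
}
```

With a precise time of 10.0001 s and a 4 ms lag, the coarse clock reads
9.9961 s, so the file would be stamped with second 10 while time() still
reports 9. With a precise time of 10.5 s both agree on second 10.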

        Arnd
