Message-ID: <CAK8P3a0jk736rPueff--Uor=tHmicHZgoikrAsjp0DHxmkaiWg@mail.gmail.com>
Date: Tue, 14 Jun 2022 23:11:32 +0200
From: Arnd Bergmann <arnd@...nel.org>
To: musl@...ts.openwall.com
Subject: Re: Question about musl's time() implementation in time.c

On Tue, Jun 14, 2022 at 10:49 PM Rich Felker <dalias@...c.org> wrote:
> On Tue, Jun 14, 2022 at 10:37:25PM +0200, Arnd Bergmann wrote:
> > On Tue, Jun 14, 2022 at 7:00 PM Rich Felker <dalias@...c.org> wrote:
> > > On Tue, Jun 14, 2022 at 06:50:40PM +0200, Arnd Bergmann wrote:
> > > > The coarse time can be up to one timer tick behind, so reading
> > > > CLOCK_REALTIME first can give you the exact second with a small
> > > > nanosecond value, while the utime will still set the previous
> > > > value.
> > > >
> > > > Can you change the test case to check if the later time is less
> > > > than clock_getres(CLOCK_REALTIME_COARSE, ...) behind?
> > >
> > > This seems like a bug that the kernel uses the wrong clock for setting
> > > file timestamps. It can result in seeing events out-of-order (exactly
> > > as described in this thread). This should really be fixed or at least
> > > made switchable so users who care can fix it.
> >
> > I can't find any reference to what the correct clock is here,
> > are you sure that this is specified at all? The decision to use the coarse
> > time in the kernel is definitely intentional, as reading the hardware
> > clocksource can be expensive (depending on the hardware), and
> > changing the behavior would likely break applications that rely on
> > it being the coarse clock.
>
> POSIX specifies operations that set the file timestamps in terms of
> the system (CLOCK_REALTIME) clock, not a weird implementation-defined
> alternate clock.
>
> Maybe you're right that getting the correct clock is costly on some
> archs, but it's almost surely not on any arch that admits vdso
> clock_gettime. And "race that causes applications to see wrong
> ordering of filesystem operations with respect to other activity for
> the sake of performance" does not seem like a good idea.

The thing is that a lot of file systems would still behave the same
way because they round times down to a filesystem specific resolution,
often one microsecond or one second, while the kernel time accounting
is in nanoseconds. There have been discussions about an interface
to find out what the actual resolution on a given mount point is (similar
to clock_getres), but that never made it in. The guarantees that you get
from file systems at the moment are:

- the timestamp is always rounded down, not up, so a newly created file
  never gets a timestamp that is newer than either CLOCK_REALTIME or
  CLOCK_REALTIME_COARSE as reported by a subsequent
  clock_gettime()/gettimeofday()/time().

- the in-memory timestamp is the same that you read back after umount/mount,
  and gets adjusted for both resolution and range of the on-disk
  representation.

- any file system that supports timestamps (some always report tv_sec=0)
  set the timestamps to at most three seconds before the current time
  as read by an earlier time() syscall.

Making it use CLOCK_REALTIME instead of CLOCK_REALTIME_COARSE
would improve the third guarantee so it could be within two seconds (or
one second on file systems with full-second resolution like ext3), but
would break the first rule by making it report timestamps that can be
either before or after the time reported by the time() syscall.

      Arnd
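
For illustration only, the tolerance check suggested in the quoted text ("less than clock_getres(CLOCK_REALTIME_COARSE, ...) behind") could be sketched roughly as follows. This is not code from the thread or from any test suite: the helper names ts_to_ns, stamp_not_too_old and coarse_resolution_ns are hypothetical, and the sketch assumes Linux, where CLOCK_REALTIME_COARSE is available. It also ignores the file system's own rounding described above, which can put a timestamp further behind than one coarse-clock tick.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

/* Convert a timespec to nanoseconds since the epoch. */
static int64_t ts_to_ns(struct timespec ts)
{
    return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/*
 * Hypothetical helper: returns true if `stamp` (for example a file
 * mtime) is at most `tolerance_ns` nanoseconds behind `before` (for
 * example a CLOCK_REALTIME reading taken before the file was touched).
 * A stamp that is equal to or later than `before` always passes.
 */
static bool stamp_not_too_old(struct timespec before,
                              struct timespec stamp,
                              int64_t tolerance_ns)
{
    assert(tolerance_ns >= 0);
    return ts_to_ns(before) - ts_to_ns(stamp) <= tolerance_ns;
}

/*
 * Hypothetical helper: resolution of the coarse realtime clock in
 * nanoseconds, or -1 if clock_getres() fails.
 */
static int64_t coarse_resolution_ns(void)
{
    struct timespec res;

    if (clock_getres(CLOCK_REALTIME_COARSE, &res) != 0)
        return -1;
    return ts_to_ns(res);
}
```

A test along these lines would read CLOCK_REALTIME with clock_gettime(), update the file, read st_mtim back with stat(), and then call stamp_not_too_old(before, st.st_mtim, coarse_resolution_ns()). On file systems with coarser timestamp resolution the tolerance would presumably have to be widened further.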