Message-ID: <CAFJhRnrWPC6pk67Xo9A9EeHFhoJCkE2PfwS4wFUep2JS3D9ujQ@mail.gmail.com>
Date: Fri, 10 Jun 2022 11:52:50 +0300
From: Zev Levy Stevenson <zevlevys@...il.com>
To: Rich Felker <dalias@...c.org>
Cc: Arnd Bergmann <arnd@...nel.org>, musl@...ts.openwall.com
Subject: Re: Question about musl's time() implementation in time.c

Thank you for the responses, those reasons make sense to me. We are using a
very customized toolchain but the kernel itself is standard.
We looked into it a bit further and we were able to reproduce the issue
with a clean musl-gcc toolchain for x86_64 (version 1.2.2) on a Linux
kernel that we took from a standard Ubuntu distribution.
Specifically, tests in the libc-test suite (
https://wiki.musl-libc.org/libc-test.html) using the time() function fail
sometimes, e.g. src/functional/utime.c, which fails in roughly 3-4 of
every 1,000 runs. This can be reduced to the following code failing:

t = time(0);
if (futimens(fd, ((struct timespec[2]){{.tv_nsec=UTIME_NOW},
                                       {.tv_nsec=UTIME_OMIT}})) != 0) return 1;
if (fstat(fd, &st) != 0) return 1;
if (st.st_atim.tv_sec < t) printf("time inconsistency\n");

When replacing the call to time(0) with a raw call to the Linux time
syscall, the issue seems to disappear. On the other hand, making the
clock_gettime syscall directly results in the same issue.
Perhaps this is an issue with the Linux implementation of these syscalls /
vdso functions, in which case further research may be required, or maybe
such consistency when using different methods for measuring the system time
doesn't have to be guaranteed, in which case the tests should probably be
modified to allow for small inaccuracies such as the one described above.

On Tue, Jun 7, 2022, 19:30 Rich Felker <dalias@...c.org> wrote:

> On Tue, Jun 07, 2022 at 04:29:28PM +0200, Arnd Bergmann wrote:
> > On Tue, Jun 7, 2022 at 2:25 PM Zev Levy Stevenson <zevlevys@...il.com> wrote:
> > >
> > > Hi all,
> > >
> > > While running the libc-test test suite on a customized clang+musl
> > > build, I had trouble with some of the tests because of issues with
> > > time accuracy.
> > > I can go into detail if needed, but the problem seemed to boil down
> > > to the time() function in musl (in src/time/time.c) using a
> > > clock_gettime syscall (without vdso) instead of using the Linux
> > > time syscall that we expected it to use. Some other libc
> > > implementations use this syscall, and indeed after switching the
> > > syscall used in time() the tests passed, seemingly because the
> > > accuracy of the clocks used matched up.
> > > My main question is why musl's implementation doesn't use the time
> > > syscall; I'd be happy to hear if there was a special reason for
> > > this.
> >
> > The time() syscall on 32-bit architectures returns a 32-bit integer,
> > which overflows in y2038; only clock_gettime() has the required range.
>
> This is indeed a good reason it can't be changed. Historically, though,
> it was just a matter of avoiding code duplication. Due to the desire
> to support vdso, clock_gettime requires a good deal of logic to
> find and use the vdso function and perform fallbacks if it's not
> available, or if the newer syscalls are not available. If time() did
> not use clock_gettime as its backend, but instead used separate kernel
> interfaces, this logic would need to be duplicated in time() too. It
> would also impose weird incentives for new archs to provide a time()
> syscall or vdso function, or would impose a requirement that we *also*
> duplicate the vdso logic to consume the vdso clock_gettime in time()
> if there's no vdso time().
>
> If you've created an alternate kernel/syscall implementation where
> clock_gettime behaves badly and a legacy time syscall (or vdso
> function?) behaves well, that really doesn't seem like a good
> implementation choice. Especially if they produce mismatching output.
> I guess you could make a stretch argument that the implementation
> behaves as if someone is constantly changing the clock, but short of
> that, I think it's even nonconforming (there's a single realtime clock
> and they're both supposed to return times in terms of it). Are there
> reasons you're trying to do things that way?
>
> Rich
>
