Message-ID: <20190720214602.GA1506@brightrain.aerifal.cx>
Date: Sat, 20 Jul 2019 17:46:02 -0400
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: Re: time_t progress/findings

On Sat, Jul 20, 2019 at 12:48:40AM -0400, Rich Felker wrote:
> On Thu, Jul 18, 2019 at 04:52:37PM -0400, Rich Felker wrote:
> > On Thu, Jul 18, 2019 at 12:37:45PM -0400, Rich Felker wrote:
> > > Second bit of progress here: stat. First change can be done before any
> > > actual time64 work is done or even decided upon: changing all the
> > > stat-family functions to use fstatat (possibly with AT_EMPTY_PATH) as
> > > their backend, and changing fstatat to do proper fallbacks if
> > > SYS_fstatat is missing. Now there's a single point of potential stat
> > > conversion rather than 4 functions.
> > >
> > > Next, add an internal, arch-provided kstat type and make fstatat
> > > translate from this to the public stat type. This eliminates the need
> > > for all the mips*/syscall_arch.h hacks.
> >
> > This step admits a few questions about how to do it best, inspired in
> > part by a related question:
> >
> > What should the new time64 stat structures look like?
> >
> > There are at least three possible goals:
> >
> > 1. Make them as clean and uniform as possible, same for all archs.
> >
> > 2. Avoid increasing the size at all cost so as to maximize
> >    memory-safety of mismatched interfaces between libc consumers
> >    defined in terms of struct stat.
> >
> > 3. Make the start of the new struct match the old struct to minimize
> >    behavioral errors under mismatched interfaces between libc
> >    consumers defined in terms of struct stat.
> >
> > Choice 2 is pretty much out because I think it's impossible on at
> > least one arch, and would impose really ugly constraints (making
> > timespec 12-byte, relying on non-64bit-alignment) on others.
> > In many ways choice 3 is actually more appealing, because when
> > third-party libraries *do* use stat in public interfaces, it's
> > usually understood that the same party both allocates and fills it
> > in, and shares the contents with the other party.
> >
> > There are actually 2 subvariants of choice 3: either keep exposing the
> > 32-bit time in the old locations so that mismatched consumers just
> > work, or fill it in with something like INT_MIN (year~=1902) so that
> > breakage is caught quickly.
> >
> > Now, back to kstat and the above-quoted text. If we go with option 3,
> > we don't actually need a kstat struct. The existing stat syscalls just
> > write into the beginning of the buffer, and then we copy the result to
> > the time64 timespecs at the end that make up the new public interface.
> > This results in the smallest code, and the least amount of new
> > per-arch definitions. But it doesn't clean up the existing mips
> > stat-translation hell (currently buried in mips*/syscall_arch.h), and
> > it imposes assumptions about the relationship between kernel types and
> > public libc types.
> >
> > On the other hand, if we make archs define a struct kstat and always
> > translate everything, the code is a bit larger, but we:
> >
> > - don't impose any particular choice 1/2/3 above.
> > - make it easy to clean up the mips brokenness.
> > - facilitate future musl archs/ABIs (e.g. a ".2 ABI") where userspace
> >   stat has nothing to do with the legacy kernel stat structs.
> >
> > So I'm leaning strongly towards just always doing the translation,
> > even though I'm also leaning towards choice 3 above that won't require
> > it. If nothing else, it allows me to do the prep work that will set
> > the stage for time64 transition now, without having to finalize the
> > decisions about how time64 will look.
>
> Another data point in favor of choice 3: libc actually has some
> functions of its own that pass stat structures to callbacks: ftw and
> nftw.
> With choice 3, these don't need any change; a legacy binary
> calling them will get back stat structures it can read (with some
> extra 64-bit timespecs afterwards that it's not aware of). With any
> other choice, these functions would need painful replacements, and
> just wrapping them is not easy because they lack a context argument to
> pass through.
>
> Since similar usage is likely common in third-party library code, I
> think this is a really strong argument in favor of choice 3. FWIW the
> existing glibc proposal looks like option 1, and they weren't aware of
> this problem until I reported it just now.

Related find: for struct rusage, we can satisfy both (2) and (3)
simultaneously. This means no new structures or symbols are needed for
getrusage, wait3, and wait4. The existing 32-bit musl structs left 16
slots for extensibility at the end, so we can just put the new 64-bit
time fields there, and still fill in the 32-bit ones too for legacy
callers.

This is basically what I was already planning for utmp, except that
it's less interesting for utmp because the functions are stubs.

Rich
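
For illustration only, the choice 3 layout and the always-translate
kstat step discussed in the quoted text could look roughly like the
sketch below. Every type, field, and function name here (ts32, ts64,
the reduced struct kstat, struct stat_c3, kstat_to_stat, poison_old) is
a made-up stand-in rather than musl's actual definition, and the field
set is cut down to a handful of members. The poison_old flag selects
between the two subvariants: keep the 32-bit times readable in their
old locations, or fill them with INT32_MIN so stale consumers fail
visibly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* All names below are hypothetical stand-ins, not musl's definitions. */

/* 32-bit timespec as a legacy kernel stat struct would hold it. */
struct ts32 { int32_t tv_sec; int32_t tv_nsec; };

/* Timespec with 64-bit seconds for the new public interface. */
struct ts64 { int64_t tv_sec; int64_t tv_nsec; };

/* Reduced stand-in for an arch-provided kernel stat layout. */
struct kstat {
    uint64_t st_dev;
    uint64_t st_ino;
    uint32_t st_mode;
    int64_t  st_size;
    struct ts32 st_atim, st_mtim, st_ctim;
};

/* Choice 3: the legacy layout comes first, time64 fields are appended. */
struct stat_c3 {
    uint64_t st_dev;
    uint64_t st_ino;
    uint32_t st_mode;
    int64_t  st_size;
    struct ts32 old_atim, old_mtim, old_ctim; /* where legacy callers look */
    struct ts64 st_atim, st_mtim, st_ctim;    /* new time64 fields */
};

/* Legacy consumers only keep working if the old time fields stay put. */
_Static_assert(offsetof(struct stat_c3, old_atim) ==
               offsetof(struct kstat, st_atim),
               "legacy prefix must match the old layout");

/* Sign-extend a 32-bit timespec into the 64-bit form. */
static struct ts64 widen(struct ts32 t)
{
    return (struct ts64){ t.tv_sec, t.tv_nsec };
}

/* Translate the kernel struct into the public one, field by field.
 * poison_old != 0 writes INT32_MIN into the legacy seconds fields
 * instead of the real 32-bit values. */
static void kstat_to_stat(const struct kstat *k, struct stat_c3 *s,
                          int poison_old)
{
    assert(k && s);
    s->st_dev  = k->st_dev;
    s->st_ino  = k->st_ino;
    s->st_mode = k->st_mode;
    s->st_size = k->st_size;

    s->st_atim = widen(k->st_atim);
    s->st_mtim = widen(k->st_mtim);
    s->st_ctim = widen(k->st_ctim);

    if (poison_old) {
        struct ts32 p = { INT32_MIN, 0 };
        s->old_atim = s->old_mtim = s->old_ctim = p;
    } else {
        s->old_atim = k->st_atim;
        s->old_mtim = k->st_mtim;
        s->old_ctim = k->st_ctim;
    }
}
```

A legacy caller that only knows the old layout reads old_mtim at its
familiar offset, which is why callbacks such as those used by ftw and
nftw would keep working without replacement under this layout.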