Message-ID: <20190720044840.GZ1506@brightrain.aerifal.cx>
Date: Sat, 20 Jul 2019 00:48:40 -0400
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: Re: time_t progress/findings

On Thu, Jul 18, 2019 at 04:52:37PM -0400, Rich Felker wrote:
> On Thu, Jul 18, 2019 at 12:37:45PM -0400, Rich Felker wrote:
> > Second bit of progress here: stat. First change can be done before any
> > actual time64 work is done or even decided upon: changing all the
> > stat-family functions to use fstatat (possibly with AT_EMPTY_PATH) as
> > their backend, and changing fstatat to do proper fallbacks if
> > SYS_fstatat is missing. Now there's a single point of potential stat
> > conversion rather than 4 functions.
> >
> > Next, add an internal, arch-provided kstat type and make fstatat
> > translate from this to the public stat type. This eliminates the need
> > for all the mips*/syscall_arch.h hacks.
>
> This step admits a few questions about how to do it best, inspired in
> part by a related question:
>
> What should the new time64 stat structures look like?
>
> There are at least three possible goals:
>
> 1. Make them as clean and uniform as possible, same for all archs.
>
> 2. Avoid increasing the size at all cost so as to maximize
>    memory-safety of mismatched interfaces between libc consumers
>    defined in terms of struct stat.
>
> 3. Make the start of the new struct match the old struct to minimize
>    behavioral errors under mismatched interfaces between libc
>    consumers defined in terms of struct stat.
>
> Choice 2 is pretty much out because I think it's impossible on at
> least one arch, and would impose really ugly constraints (making
> timespec 24-byte, relying on non-64bit-alignment) on others. In many
> ways choice 3 is actually more appealing, because when third-party
> libraries *do* use stat in public interfaces, it's usually understood
> that the same party both allocates and fills it in, and shares the
> contents with the other party.
>
> There are actually 2 subvariants of choice 3: either keep exposing the
> 32-bit time in the old locations so that mismatched consumers just
> work, or fill it in with something like INT_MIN (year~=1902) so that
> breakage is caught quickly.
>
> Now, back to kstat and the above-quoted text. If we go with option 3,
> we don't actually need a kstat struct. The existing stat syscalls just
> write into the beginning of the buffer, and then we copy the result to
> the time64 timespecs at the end that make up the new public interface.
> This results in the smallest code, and the least amount of new
> per-arch definitions. But it doesn't clean up the existing mips
> stat-translation hell (currently buried in mips*/syscall_arch.h), and
> it imposes assumptions about the relationship between kernel types and
> public libc types.
>
> On the other hand, if we make archs define a struct kstat and always
> translate everything, the code is a bit larger, but we:
>
> - don't impose any particular choice 1/2/3 above.
> - make it easy to clean up the mips brokenness.
> - facilitate future musl archs/ABIs (e.g. a ".2 ABI") where userspace
>   stat has nothing to do with the legacy kernel stat structs.
>
> So I'm leaning strongly towards just always doing the translation,
> even though I'm also leaning towards choice 3 above that won't require
> it. If nothing else, it allows me to do the prep work that will set
> the stage for time64 transition now, without having to finalize the
> decisions about how time64 will look.

Another data point in favor of choice 3: libc actually has some
functions of its own that pass stat structures to callbacks: ftw and
nftw. With choice 3, these don't need any change; a legacy binary
calling them will get back stat structures it can read (with some
extra 64-bit timespecs afterwards that it's not aware of).
With any other choice, these functions would need painful
replacements, and just wrapping them is not easy because they lack a
context argument to pass through. Since similar usage is likely common
in third-party library code, I think this is a really strong argument
in favor of choice 3.

FWIW the existing glibc proposal looks like option 1, and they weren't
aware of this problem until I reported it just now.

Rich
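
As an illustration only (not part of the original message), the
choice 3 layout and the always-translate kstat step might be sketched
roughly as below. Every type, field, and function name here
(legacy_stat, new_stat, kstat, kstat_to_stat, legacy_view_mtime) is
hypothetical and heavily simplified; none of it is musl's actual
definition. The sketch also assumes the kernel-side struct already
carries 64-bit times and that sizes and times fit in 32 bits when
copied into the legacy fields.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout that an old (time32) binary believes struct stat has. */
struct legacy_timespec { int32_t tv_sec; int32_t tv_nsec; };
struct legacy_stat {
    uint32_t st_mode;
    int32_t  st_size;
    struct legacy_timespec st_mtim;
};

/* Hypothetical 64-bit timespec for the new public interface. */
struct timespec64 { int64_t tv_sec; int64_t tv_nsec; };

/* Choice 3: the start matches the old struct; time64 fields come after it. */
struct new_stat {
    uint32_t st_mode;
    int32_t  st_size;
    struct legacy_timespec legacy_mtim; /* old location of st_mtim */
    struct timespec64 st_mtim;          /* new 64-bit time, appended */
};

/* The old prefix must sit at the same offset in both layouts. */
static_assert(offsetof(struct new_stat, legacy_mtim) ==
              offsetof(struct legacy_stat, st_mtim),
              "legacy prefix must match");

/* Hypothetical arch-provided kernel struct with an unrelated field order. */
struct kstat {
    int64_t  kst_size;
    uint32_t kst_mode;
    int64_t  kst_mtime_sec;
    int64_t  kst_mtime_nsec;
};

/* Translate kernel data to the public struct. With poison_legacy set, the
 * old time slot gets INT32_MIN (around year 1902) so that mismatched
 * consumers fail visibly; otherwise it gets the 32-bit value, which this
 * sketch assumes is in range. */
void kstat_to_stat(const struct kstat *k, struct new_stat *st, int poison_legacy)
{
    memset(st, 0, sizeof *st);
    st->st_mode = k->kst_mode;
    st->st_size = (int32_t)k->kst_size;
    st->st_mtim.tv_sec = k->kst_mtime_sec;
    st->st_mtim.tv_nsec = k->kst_mtime_nsec;
    if (poison_legacy) {
        st->legacy_mtim.tv_sec = INT32_MIN;
        st->legacy_mtim.tv_nsec = 0;
    } else {
        st->legacy_mtim.tv_sec = (int32_t)k->kst_mtime_sec;
        st->legacy_mtim.tv_nsec = (int32_t)k->kst_mtime_nsec;
    }
}

/* Stand-in for a legacy callback (as with ftw/nftw): it only knows the old
 * layout and reads just the first sizeof(struct legacy_stat) bytes. */
int32_t legacy_view_mtime(const void *st)
{
    struct legacy_stat old;
    memcpy(&old, st, sizeof old);
    return old.st_mtim.tv_sec;
}
```

Under these assumptions, a caller that only understands legacy_stat
can be handed a pointer to a new_stat and still read sane values from
the shared prefix, while new code reads st_mtim. The poison_legacy
flag corresponds to the two subvariants of choice 3 described in the
quoted text.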