Message-ID: <20240428161356.GB10433@brightrain.aerifal.cx>
Date: Sun, 28 Apr 2024 12:13:56 -0400
From: Rich Felker <dalias@...c.org>
To: lolzery wowzery <wowzeryest@...il.com>
Cc: musl@...ts.openwall.com, Duncan Bellamy <dunk@...kimushi.com>,
    info@...ordhuis.nl, tony.ambardar@...il.co
Subject: Re: [PATCH 1/2] V3 resubmitting old statx patch with changes

On Sat, Apr 27, 2024 at 10:29:35PM -0400, lolzery wowzery wrote:
> Hi,
>
> Update: you've given me a lot to think about in my suggestions and
> proposals to musl and, because you took time to respond to me and
> explain things clearly, I feel much more compelled to put good effort
> into making sure my changes are solid and indisputably beneficial (not
> as in more=better kind of way but less=more but keeping in mind your
> responses about musl's design.)
>
> > statx is already in musl 1.2.5 (commit b817541f1cfd). If there are
> > other problems than the ones I fixed when merging it and the
> > stx_attributes field, please report them here rather than spending
> > time trying to make a comprehensive changeset for a lot of things that
> > might or might not actually be wrong.
>
> I must have dyslexia or something because it turned out my efforts were a
> wild goose chase for xstat, not statx! I'm so embarrassed, ha ha.
>
> > > symbol
> > > weakness problems,
> >
> > This sounds unlikely. It's more likely that you misunderstand how/why
> > they're used. But I'm happy to look at your findings.
>
> There are some symbol weakness inconsistencies with glibc I found in musl

There is nothing about weakness that is a public interface.
It does nothing at all in dynamic linking, and in static linking, it
does not declare an intent that the application can override/redefine
the function, only that libc not conflict with the application's
namespace *if the libc-provided function is not being used at all*
because the application is using a different namespace profile, like
plain C instead of POSIX or POSIX instead of POSIX+extensions. If you
review this I think you'll find they're all correct.

The other place they're used is for controlling link dependencies in
static linking and avoiding pulling in code that's not used/needed
because other functionality was not linked.

> but trying to track them all down by hand would be insane. I will get together
> a tool to difference musl's and glibc symbols and list all changes to you one
> of these days.
>
> For reference, symbol weakness only affects what happens when linking two
> libraries with same symbol names, which is used to override libc methods
> for various necessary purposes.
> Glibc is the golden standard software is
> built to link to,

No, just no. This is not a premise that is acceptable here. musl is
not a glibc clone/drop-in replacement. musl and glibc are *different
systems*. musl implements certain standards (C, POSIX, IEEE754) and
selected nonstandard extensions based on certain criteria. We do not
do anything "because glibc does it and glibc is the 'golden
standard'".

> so, if anything, this will only help some software
> work in musl.

Facilitating hacks that involve UB and poking at implementation
internals is generally not a goal of musl.

> > Can you clarify what you mean? There are some places where correctly
> > atomic fallback is impossible and the fallback is best-effort only.
> > This is generally only for missing O_CLOEXEC type functionality. If
> > there are others, please report.
>
> My original thinking was that my proposed solution of trying both the full
> syscall and the fallback and seeing which work to handle seccomp would
> introduce a race condition where the file is created between the two calls.

I don't see how that matters. If the fallback is correct, either is
correct for at least one moment in time and there is no distinction
except an ordering that's not synchronized and thereby arbitrary.

> > It's never undefined. That's not how this works.
> >
> > If you're making out-of-tree bare-metal ports and don't want the
> > overhead of having to add new syscalls like this, you can do something
> > like define the SYS_* macros for them such that syscall_arch.h can
> > statically catch that they're nonexistent (e.g. by giving them values
> > in some high range) and directly return -ENOSYS; then the code would
> > collapse down as you want with the ENOSYS path being always-taken.
> >
> > Note that there is no way to emulate the nonzero flags for renameat2
> > without race conditions. Since this is not standard/mandatory
> > functionality, the right thing to do is just return an error for
> > "unsupported flags", not try to emulate them.
>
> Got it and thanks for explaining. Now that I've fixed my eyes and am
> looking at statx, I see four main things to be fixed:
> 1. Remove the `#ifndef SYS_fstatat` because it's nonsensible

No, riscv32 does not have (and no new 32-bit archs will have)
SYS_fstatat because they require SYS_statx for time64 support (they
don't have a 32-bit native kernel stat structure). The conditional is
not necessary but it just optimizes out dead code. On such archs, it's
known that if SYS_statx fails, fstatat() will also fail for the same
reason, because it makes the same syscall.

> 2. Add comments to explain things

This is possibly okay, if they're explaining reasons for doing things
and not just translating C into a natural-language description of what
it's doing.

> 3. Correctly validate the flags and EINVAL if unsupported by fallback

fstatat() does that already. There isn't really any way to do the
validation in userspace without blocking access to newly-added kernel
functionality.

> 4. Zero all extraneous fields like __pad1 for future proofing.

This is probably a good idea, but it should be done via
zero-initializing the whole structure before filling it, not referring
to those fields by name. The names are not a public or even
libc-internal-private interface, but placeholders, and shouldn't be
used.

> 5. The stx_rdev_major and stx_rdev_minor fields were not correctly filled in

Another thing I missed on the initial review. Thanks for catching it.

> Please do not make these 5 changes yourself yet as I might find more and
> I have some great comments I want to add to explain why things are.

If you'd like to submit the fixes, please do them as individual
changes with commit messages that explain what was wrong and what
specifically is being fixed, not a big combined "fix statx fallback"
patch.

> I also discovered and would like to do a very minor cleanup on
> src/stat/fstatat.c, which has a duplicate copy of the struct statx for
> no reason.

That's because the patch 2 in this series as submitted was wrong and
never fixed, so I refrained from applying it, with the intent to
revisit after release. So that would be fine to do now.

> Additionally, I am working on a proper fallback and implementation for fstatx
> and lstatx, which are the open-fd and symlink equivalents of statx. It will

statx already supports those usages with proper flags
(AT_SYMLINK_NOFOLLOW, AT_EMPTY_PATH). Unless there's precedent for
functions by those names that wrap it with the necessary flags, I
don't see a motivation for adding them rather than just writing
application code to use the flags with statx().

> > These functions return information about a file, in the buffer
> > pointed to by statbuf.
> > No permissions are required on the file
> > itself, but—in the case of stat(), fstatat(), and lstat()—execute
> > (search) permission is required on all of the directories in
> > pathname that lead to the file.
>
> I swear I've spent the last 5 hours digging into the nitty gritty
> depths of these little-documented methods to ensure musl will have
> proper fallbacks.
>
> QUESTION: The glibc wrapper for statx explicitly sets the errno to ENOSYS
> upon successful execution of its fallback to fstat64 (yes you read that right.)
> I swore I misread something or this was a bug until quadruple checking
> that this is works-as-intended over the next 2 hours. Will check around
> glibc more tomorrow but what should musl's policy be about this (and
> perhaps other works-as-intended POSIX violations around glibc?) I'm
> personally strongly leaning towards do-what-glibc-does for compatibility

The value of errno on success is not meaningful; it's valid for it to
take on any nonzero value as a consequence of a function call that
succeeds. Our policy in general if offering interfaces modelled off
glibc (normally only happens when these are Linux-specific interfaces)
is that only properties which an application can reasonably expect to
rely on need to be matched. Things like the value of errno after
success would not fall under that. Even though POSIX does not govern
nonstandard interfaces like this, the principle that errno is not
meaningful in this case still applies.

Rich
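
Editorial illustration of points 3 to 5 above. This is a minimal sketch and not musl's actual fallback code: the helper names stat_to_statx and statx_fallback are hypothetical, and it assumes a Linux libc whose <sys/stat.h> exposes struct statx under _GNU_SOURCE (glibc 2.28 or later, musl 1.2.5 or later). It zero-initializes the whole structure instead of naming padding fields, fills stx_rdev_major and stx_rdev_minor, and passes the flags through unchanged so that fstatat() does the validation.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/sysmacros.h>

/* Hypothetical helper (not musl's code): copy a struct stat into a
 * struct statx. */
static void stat_to_statx(const struct stat *st, struct statx *stx)
{
	assert(st && stx);

	/* Zero the whole structure first, so padding and reserved fields
	 * are cleared without ever referring to them by name. */
	*stx = (struct statx){0};

	stx->stx_mask = STATX_BASIC_STATS;
	stx->stx_blksize = st->st_blksize;
	stx->stx_nlink = st->st_nlink;
	stx->stx_uid = st->st_uid;
	stx->stx_gid = st->st_gid;
	stx->stx_mode = st->st_mode;
	stx->stx_ino = st->st_ino;
	stx->stx_size = st->st_size;
	stx->stx_blocks = st->st_blocks;

	stx->stx_atime.tv_sec = st->st_atim.tv_sec;
	stx->stx_atime.tv_nsec = st->st_atim.tv_nsec;
	stx->stx_mtime.tv_sec = st->st_mtim.tv_sec;
	stx->stx_mtime.tv_nsec = st->st_mtim.tv_nsec;
	stx->stx_ctime.tv_sec = st->st_ctim.tv_sec;
	stx->stx_ctime.tv_nsec = st->st_ctim.tv_nsec;

	/* Device numbers are split into major/minor halves in struct statx. */
	stx->stx_dev_major = major(st->st_dev);
	stx->stx_dev_minor = minor(st->st_dev);
	stx->stx_rdev_major = major(st->st_rdev);
	stx->stx_rdev_minor = minor(st->st_rdev);
}

/* Hypothetical fallback: flags go to fstatat() unchanged, so the
 * kernel rejects unsupported ones rather than userspace guessing. */
static int statx_fallback(int dirfd, const char *path, int flags,
                          struct statx *stx)
{
	struct stat st;

	if (fstatat(dirfd, path, &st, flags) != 0)
		return -1; /* errno was set by fstatat() */
	stat_to_statx(&st, stx);
	return 0;
}
```

A caller would test only the return value, for example `if (statx_fallback(AT_FDCWD, path, AT_SYMLINK_NOFOLLOW, &stx) != 0)`, and would read errno only after a failure, since its value after a successful call is not meaningful.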