Message-ID: <a9fe2b30-9253-9120-3627-aed4ebf95973@efficios.com>
Date: Sat, 17 Sep 2022 13:51:44 +0200
From: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
To: Florian Weimer <fw@...eb.enyo.de>, Chris Kennelly <ckennelly@...gle.com>
Cc: libc-coord@...ts.openwall.com, "carlos@...hat.com" <carlos@...hat.com>,
 libc-alpha <libc-alpha@...rceware.org>, szabolcs.nagy@....com
Subject: Re: Re: RSEQ symbols: __rseq_size, __rseq_flags vs __rseq_feature_size

On 2022-09-16 23:32, Florian Weimer wrote:
> * Chris Kennelly:
>
>>> If the kernel does not currently overwrite the padding, glibc can do
>>> its own per-thread initialization there to support its malloc
>>> implementation (because the padding is undefined today from an
>>> application perspective). That is, we would initialize these
>>> invisible vCPU IDs the same way we assign arenas today. That would
>>> cover this specific malloc use case only, of course.
>
>> If a user program updates to a new kernel before glibc does, would it be
>> able to easily take advantage of it?
>
> No, as far as I understand it, there is presently no signaling from
> kernel to applications that bypasses the rseq area registration. So
> the only thing you could do is to unregister and re-register with a
> compatible value. And that is of course quite undefined and assumes
> that you can do this early enough during the life-time of each thread.
>
> But if we have the extension handshake, I'll expect us to backport it
> quite widely, after some time to verify that it works with CRIU etc.

I don't think this is what Chris is asking here. I think the requirement
here is to make sure that the extensibility scheme we come up with
allows extending struct rseq simply by upgrading the kernel, without any
need to upgrade glibc (that's indeed a requirement of mine). So a new
application and a new kernel can use a newly available extended field,
even with an old glibc.

Let me give an example of what I think would be a *bad* way to do
things, just to show how we can shoot ourselves in the foot if we don't
consider the evolution of this ABI carefully.

Let's assume we expose a "rseq_feature_size" integer through
getauxval(). This allows the kernel to tell glibc about the memory size
required to hold all the rseq features. This is information that we
_need_ to expose from the kernel to glibc.

So if glibc decides to expose each new feature through __rseq_flags bits
(e.g. one bit per feature), then we run into a situation where, for
every new feature exposed by the kernel, glibc needs to know the mapping
from feature size to feature bit before it can expose that feature to
the rest of user-space. This goes against the requirement that we should
be able to extend rseq features by simply upgrading the kernel, without
needing to upgrade glibc as well every time.

So considering that the kernel needs to let glibc know how much memory
to allocate for struct rseq, a getauxval() "rseq_feature_size" is
needed.

One approach we could consider to allow extending rseq features without
upgrading glibc would be to expose an additional "rseq_feature_flags"
getauxval(), which could then be used by glibc to populate its
__rseq_flags symbol without prior knowledge of the feature-set. This
could accommodate 32 features before we need to expose an additional
__rseq_flags2 symbol.

Exposing a feature flag from the kernel through getauxval() would have
the advantage of allowing the kernel to "disable" some features in the
future, e.g. if we want to deprecate a field. This comes with its own
complexity though: user-space could no longer rely on all prior feature
fields being present whenever a given feature is present, which makes
the testing matrix more complex. I personally don't see a need to
deprecate rseq fields, but it might just be a lack of imagination on my
part.

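To make the contrast between the two flag-based schemes concrete, here
is a minimal sketch, not anything glibc or the kernel implements. The
table contents, the bit positions, and the helper names
(flags_from_size, flags_from_kernel, has_feature) are all hypothetical,
and the "rseq_feature_flags" auxv entry is only the idea discussed
above, so the functions take the kernel-provided values as plain
arguments instead of calling getauxval().

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Scheme A (the "bad" one): libc derives feature bits from the feature
 * size. The sizes and bits below are made up for illustration. A field
 * added by a newer kernel has no row here, so it stays invisible until
 * libc itself is upgraded.
 */
struct size_to_bit {
    unsigned long min_feature_size; /* size at which the field exists */
    uint32_t bit;                   /* bit libc would set in its flags */
};

static const struct size_to_bit known_features[] = {
    { 24, UINT32_C(1) << 0 }, /* hypothetical feature 0 */
    { 28, UINT32_C(1) << 1 }, /* hypothetical feature 1 */
};

static uint32_t flags_from_size(unsigned long feature_size)
{
    uint32_t flags = 0;
    size_t n = sizeof(known_features) / sizeof(known_features[0]);

    for (size_t i = 0; i < n; i++)
        if (feature_size >= known_features[i].min_feature_size)
            flags |= known_features[i].bit;
    return flags;
}

/*
 * Scheme B: the kernel reports the flags itself (hypothetical
 * "rseq_feature_flags" auxv value) and libc copies them through
 * unchanged, so it needs no per-feature knowledge. One 32-bit word
 * holds at most 32 features.
 */
static uint32_t flags_from_kernel(unsigned long kernel_feature_flags)
{
    return (uint32_t)kernel_feature_flags;
}

/* Test one feature bit; bits outside the 32-bit word are never set. */
static int has_feature(uint32_t flags, unsigned int bit)
{
    return bit < 32 && ((flags >> bit) & 1u);
}
```

With scheme A, a kernel reporting a feature size of 64 still yields only
the two bits the table knows about. With scheme B, a kernel value such
as 0x5 is passed through as-is, which also shows the deprecation case:
feature 2 is present while feature 1 is not.
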
If we want to keep the kernel ABI as simple as we can, then we just
expose the rseq feature size (and required alignment), and don't expose
any rseq feature flags. This in turn means that glibc would have to
somehow expose the rseq feature size in its ABI.

If glibc decides against exposing this rseq feature size symbol, then it
would be up to the application to combine information about __rseq_size
and getauxval(rseq feature size) to figure out which fields are actually
populated. It would "work", but chances are that some users will get it
wrong. It seems simpler for a user to do:

    if (__rseq_feature_size >= offsetofend(struct rseq, vm_vcpu_id))

to validate whether a given field is indeed populated.

The rseq feature size approach would scale to a very large number of
features. It would *not* allow deprecation of fields after they are
published, but I see this as a gain in simplicity for users of the ABI,
even though we lose a knob as kernel developers.

I think it's important that we consider both the kernel and libc ABIs if
we want to make sure that we can extend the feature-set without having a
mandatory glibc upgrade in the way every time we add a rseq feature.

Thoughts?

Thanks,

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com
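
A minimal sketch of the feature-size check described in the message
above. offsetofend() is not part of standard C, so it is defined here
along the lines of the Linux kernel macro of the same name. The
structure layout and the rseq_field_available() helper are assumptions
made for illustration; struct rseq_sketch is not the real struct rseq,
and only the vm_vcpu_id field name comes from the message.

```c
#include <stddef.h>
#include <stdint.h>

/* Offset of the first byte past MEMBER within TYPE. */
#define offsetofend(TYPE, MEMBER) \
    (offsetof(TYPE, MEMBER) + sizeof(((TYPE *)0)->MEMBER))

/*
 * Hypothetical layout for illustration only; this is NOT the real
 * struct rseq definition.
 */
struct rseq_sketch {
    uint32_t cpu_id_start;
    uint32_t cpu_id;
    uint64_t rseq_cs;
    uint32_t flags;
    uint32_t vm_vcpu_id; /* extended field under discussion */
};

/*
 * A field is populated only if the feature size reported for the
 * running kernel covers the field entirely.
 */
#define rseq_field_available(feature_size, MEMBER) \
    ((size_t)(feature_size) >= offsetofend(struct rseq_sketch, MEMBER))
```

Under this scheme an application compiled against a newer header needs
a single comparison per field, and a field being present implies that
every field before it is present too.
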