Message-ID: <20190722155259.GA7445@brightrain.aerifal.cx>
Date: Mon, 22 Jul 2019 11:52:59 -0400
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: Re: Removing glibc from the musl .2 ABI

On Wed, Jul 17, 2019 at 02:16:51PM -0400, Rich Felker wrote:
> On Wed, Jul 17, 2019 at 01:10:19PM -0500, A. Wilcox wrote:
> > >> Just trying to make sure the community has a clear view of what this
> > >> looks like before we jump in.
> > >
> > > Yes. This isn't a request to jump in, just looking at feasability and
> > > whether there'd be interest from your side. Being that ABI-compat
> > > doesn't actually work very well without gcompat right now, though, I
> > > think it might make sense. I'll continue to look at whether there are
> > > other options, possibly just transitional, that might be good too.
> >
> > I meant: I want a clear view of the boundaries between musl and gcompat,
> > before we (Adélie / the gcompat team) jump in and start designing how we
> > want to handle all the new symbols we may end up with :)
>
> If we go this route, I would think that gcompat could provide all
> symbols which are not either public APIs (extensions you can
> legitimately use in source) or musl-header-induced ABIs (for example
> things like __ctype_get_mb_cur_max, which is used to define the
> MB_CUR_MAX macro). This would include LFS64 as well as the "__xstat"
> stuff, the other __ctype_* stuff, etc.

I think I'd like to go forward with this. Further work on time64 has
made it apparent to me that the current glibc ABI-compat we have
inside musl is fragile and is imposing unwanted constraints on musl,
which has long been one of the criteria for exclusion. In particular,
consider this situation:

Several structures that are part of public interfaces in musl were
created with extra space reserved for future extension. In some cases
the reserved space was added by musl; in other cases glibc had the
same.
However, if we mandate glibc ABI-compat, *all* of this reserved
space is permanently unusable:

- If the reserved space is specific to musl, then reads from it may
  fault, and stores to it may clobber unrelated memory, if the
  structure was allocated by glibc-linked code.

- If the reserved space is present in both musl and glibc, we can't
  make use of it without risking that glibc makes some different use
  of it in the future, making calls from glibc-linked code dangerous.

This came up in the context of structs rusage and timex, but also
applies to stat, sched_param, sysinfo, statvfs, and perhaps others,
which might have reason for wanting extensibility in the future.

Right now, without the glibc ABI-compat constraint, getrusage, wait3,
and wait4 can avoid new time64 remappings entirely (by using the
reserved space we already have in rusage, which glibc doesn't have at
all). [clock_]adjtime[x] hit the second case -- glibc also has
reserved space in timex, but if they end up wanting to use it for
something else and we've put the 64-bit time there, we may be in
trouble.

I don't think the rusage and timex issues here are compelling by
themselves. It's not a big deal to make compat shims here, and I might
still end up doing it. But I think it's indicative that maintaining
glibc ABI-compat in musl is going to become increasingly problematic.

So, what I'd (tentatively; for discussion) like to do:

When ldso loads an application or shared library and detects that it's
glibc-linked (DT_NEEDED for libc.so.6), it both loads a gcompat library
instead *and* flags the dso as needing ABI-compat. The gcompat library
would be permanently RTLD_LOCAL, unable to be used for resolving global
symbols, since it would have to define symbols conflicting with libc
symbols names and with future directions of the musl ABI.
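
(Illustration only, not actual ldso code: a minimal sketch of the
detection step, assuming a 64-bit ELF object whose DT_NULL-terminated
dynamic section and string table have already been located. The
helper name needs_glibc_compat is hypothetical; a real loader would
use the native ELF class and its own internal structures.)

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not musl's actual ldso code: report whether a
 * dynamic section lists libc.so.6 among its DT_NEEDED entries.
 * `dyn` points at a DT_NULL-terminated array; `strtab` is the string
 * table that DT_NEEDED values index into. */
int needs_glibc_compat(const Elf64_Dyn *dyn, const char *strtab)
{
    assert(dyn != NULL && strtab != NULL);
    for (; dyn->d_tag != DT_NULL; dyn++) {
        /* For DT_NEEDED, d_un.d_val is an offset into the string table. */
        if (dyn->d_tag == DT_NEEDED &&
            strcmp(strtab + dyn->d_un.d_val, "libc.so.6") == 0)
            return 1;
    }
    return 0;
}
```

A loader following the proposal would presumably record the result
as a per-dso flag and consult that flag later, when relocating.
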
Symbol lookups when relocating such a flagged dso would take place by
first processing gcompat (logically, adding it to the head of the dso
search list), then the normal symbol search order. The gcompat library
could also provide a replacement dlsym function, so that dlsym calls
from the glibc-linked DSO also follow this order, and a replacement
dlopen, so that dlopen of libc from the glibc-linked DSO would get the
gcompat module.

I'm not sure what mechanism gcompat would then use to make its own
references to the underlying real libc functions. This is something
we'd need to think about.

Before we decide to do it, please be aware that this would be a bit of
a burden on gcompat to do more than it's doing now. But it would also
make lots of cases work that fundamentally *can't* work now -- compat
with 32-bit code using the legacy 32-bit off_t functions, compat with
64-bit code using regexec, etc. -- anywhere the musl ABI currently
conflicts with the glibc ABI.

Of course much of this is optional. The new things that would be
mandatory would mainly be moving over existing glibc compat shims
(like the __ctype and __xstat stuff) and implementing converting
wrappers where musl's use of reserved space creates
unsafety/incompatibility with the existing glibc code.

Rich
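
(Illustration only: a minimal sketch of what a "converting wrapper"
for the reserved-space problem could look like. The struct layouts
and the names caller_rec, native_rec, native_fill and compat_fill
are hypothetical stand-ins, not the real rusage or timex definitions.
The point is that the wrapper never touches memory beyond what the
glibc-linked caller allocated.)

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical layouts, for illustration only. */
struct caller_rec {          /* what glibc-linked code allocates */
    long fields[4];
};

struct native_rec {          /* same prefix plus a reserved tail */
    long fields[4];
    long reserved[4];        /* the native libc stores extended data here */
};

/* Stand-in for a native libc function that writes into the reserved tail. */
void native_fill(struct native_rec *r)
{
    for (int i = 0; i < 4; i++)
        r->fields[i] = i + 1;
    r->reserved[0] = 64;     /* e.g. a widened time value */
}

/* Converting wrapper: call the native function on a full-size local
 * object, then copy back only the prefix the caller actually allocated. */
void compat_fill(struct caller_rec *out)
{
    struct native_rec tmp;

    assert(out != NULL);
    memset(&tmp, 0, sizeof tmp);
    native_fill(&tmp);
    memcpy(out, &tmp, sizeof *out);
}
```

Passing a caller_rec directly to native_fill would store past the
end of the caller's object, which is the first hazard listed above;
the wrapper trades one extra copy for avoiding that.
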