Message-ID: <20200415155811.GD11469@brightrain.aerifal.cx>
Date: Wed, 15 Apr 2020 11:58:11 -0400
From: Rich Felker <dalias@...c.org>
To: Florian Weimer <fw@...eb.enyo.de>
Cc: Norbert Lange <nolange79@...il.com>, musl@...ts.openwall.com
Subject: Re: [BUG] sysconf implementing _SC_NPROCESSORS_(CONF|ONLN) incorrectly

On Wed, Apr 15, 2020 at 11:50:36AM +0200, Florian Weimer wrote:
> * Norbert Lange:
>
> > How should one deal with this?
> > I understand that the semantics are vague, but given that musl now
> > implements this
> > function, it will make detection and fallback hard (especially as musl
> > doesn't wants to be identified by the likes of macros).
> >
> > As it is now, just using the affinity mask definitely cant be useful,
> > an application wanting that behavior should be patched to
> > use that function directly.
> > If musl would not define the _SC_NPROCESSORS_* macros (but still keep
> > the implementation),
> > this could be used for compile-time detection atleast. Enabling the
> > current implementation would be
> > just a matter of explicitly defining those macros.
>
> _SC_NPROCESSORS_* as implemented in glibc is bad because those values
> are not adjusted by cgroups, so it can grossly overestimate available
> resources.
>
> The cgroups interfaces themselves are not stable and very complicated.
> I don't think it's a good idea to target them, especially not from
> code that is expected to be linked statically into applications.
>
> Given that, I'm not sure that glibc's way is a significant
> improvement. musl should perhaps be changed to cope more gracefully
> with a sched_getaffinity failure, though (by not reporting a UP
> environment by accident).

For what it's worth, even without the sched_getaffinity failure, it's
still problematic for programs linked to musl to be using the values
obtained to omit memory barriers since they may be restricted to a
single core themselves but communicating over shared memory with
another process that's not restricted or restricted to a different
core.

There really should be some documented meaning for the return values,
whereby we decide either that such sketchy application usage is
supported (e.g. document that values less than 2 are never returned,
so that applications doing the hack always use barriers and they have
no remaining documented way to determine it's really a UP environment)
or declare the application usage incorrect/buggy (i.e. that the values
may be specific to the cgroup or other resource-constraints (possibly
virtualized) and can't be relied on if you're communicating with
processes that might live outside those resource constraints).

Rich
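
A minimal sketch of the first option discussed above (never reporting
fewer than 2 processors), assuming a Linux target where
sched_getaffinity and CPU_COUNT are available under _GNU_SOURCE. The
helper names affinity_cpu_count, nprocs_never_up and
conservative_nprocs are hypothetical, made up for illustration; this
is not how musl or glibc implement sysconf.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <unistd.h>

/* Hypothetical helper: number of CPUs in the calling thread's
 * affinity mask, or -1 if sched_getaffinity fails. The caller can
 * then tell "failed" apart from "exactly one CPU". */
int affinity_cpu_count(void)
{
    cpu_set_t set;
    CPU_ZERO(&set);
    if (sched_getaffinity(0, sizeof set, &set) != 0)
        return -1;
    return CPU_COUNT(&set);
}

/* Hypothetical policy: clamp a raw processor count so that a value
 * below 2 is never reported. Error returns (such as -1) are clamped
 * as well, so a failure is never mistaken for a UP environment. */
long nprocs_never_up(long raw)
{
    return raw < 2 ? 2 : raw;
}

/* Hypothetical wrapper an application could call instead of using
 * the raw sysconf value to decide whether barriers can be omitted. */
long conservative_nprocs(void)
{
    return nprocs_never_up(sysconf(_SC_NPROCESSORS_ONLN));
}
```

With such a clamp, an application that compares the result against 1
to skip memory barriers would always keep them, which matches the
"values less than 2 are never returned" behaviour described above. It
does nothing for the second option, where the values are simply
declared unreliable across resource-constraint boundaries.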