Message-ID: <20190316142819.GE6994@joraj-alpa>
Date: Sat, 16 Mar 2019 10:28:19 -0400
From: Jonathan Rajotte-Julien <jonathan.rajotte-julien@...icios.com>
To: musl@...ts.openwall.com,
	Michael Jeanson <mjeanson@...icios.com>,
	Richard Purdie <richard.purdie@...uxfoundation.org>,
	Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
Subject: Re: sysconf(_SC_NPROCESSORS_CONF) returns the wrong value

> > A simple command line to show this:
> >
> >   taskset -c 0 nproc --all
> >
> > This is equivalent to asking sysconf(_SC_NPROCESSORS_CONF).
>
> the right way to check the sysconf from a shell is getconf

It was only to provide an easy reproducer. But you are right that nproc does
not expose the complete picture.

Thanks for taking the time to reproduce the base problem.

I mixed up the _NPROCESSORS_ONLN result for glibc: it should have been 4 in the
previous email, since there are 4 online CPUs even if the scheduler affinity is
only set for cpu0. (nproc was not proving my point for the _NPROCESSORS_ONLN
value.)

As you know, you can take a CPU offline easily, but we still need to account
for it in userspace tracing since it can be put back online (see Mathieu
Desnoyers' answer in this thread).

  echo 0 > /sys/devices/system/cpu/cpu3/online

Now on a glibc system:

  $ taskset -c 0 getconf -a |grep NPROC
  _NPROCESSORS_CONF                  4
  _NPROCESSORS_ONLN                  3

This is why we use _NPROCESSORS_CONF and expect it to represent the complete
picture. We do not care much for _NPROCESSORS_ONLN or affinity. This is why the
use of "nproc --all" was sufficient for me (I was wrong).
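
For readers who want to see the three numbers side by side without going through nproc or getconf, here is a minimal sketch in Python. It is only an illustration, not something from the thread: the helper name `cpu_counts` is hypothetical. Python's `os.sysconf` calls the C library's sysconf(), so the values depend on the libc Python is linked against.

```python
import os


def cpu_counts():
    """Return the three CPU counts discussed in this thread.

    "configured" and "online" come from the C library's sysconf();
    "affinity" is the size of the calling process's scheduler affinity mask.
    """
    return {
        "configured": os.sysconf("SC_NPROCESSORS_CONF"),
        "online": os.sysconf("SC_NPROCESSORS_ONLN"),
        "affinity": len(os.sched_getaffinity(0)),
    }


if __name__ == "__main__":
    for name, value in cpu_counts().items():
        print(f"{name}: {value}")
```

Running the script under `taskset -c 0` should show the difference: with glibc, going by the getconf results reported in this thread, only the affinity count is expected to drop to 1, while with the musl behavior reported here all three would be 1.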
>
> on glibc system
> $ taskset -c 0 getconf -a |grep NPROC
> _NPROCESSORS_CONF                  8
> _NPROCESSORS_ONLN                  8
>
> on musl
> $ taskset -c 0 getconf -a |grep NPROC
> _NPROCESSORS_CONF                  1
> _NPROCESSORS_ONLN                  1
>
> so both values differ (plain nproc returns the affinity number,
> *_ONLN is all the cpus that the kernel schedules to, *_CONF
> includes offline cpus that may be hotplugged)
>
> these are documented linux extensions so i think musl should follow
> the linux sysconf man page. (but the semantics is not entirely clear
> e.g. there is /sys/devices/system/cpu/possible which can have larger
> number than echo /sys/devices/system/cpu/cpu[0-9]* |wc -w which is
> what glibc seems to be doing for *_CONF)
>
> i think we need to know why does a process care if musl returns
> the wrong number? or what are the valid uses of such a number?
> (there are heterogeneous systems like arm big-little, numa systems
> with many sockets, containers, virtualization,.. how deep may a
> user process need to go down in this rabbit hole?)

I'll refer you to Mathieu Desnoyers' answer regarding that (same thread). It
should be approved shortly by a moderator.

Cheers

-- 
Jonathan Rajotte-Julien
EfficiOS
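
To make the comparison raised in the quoted text concrete (the `possible` mask versus the number of cpuN directories), here is a rough sketch in Python. The function names `count_cpu_list` and `count_cpu_dirs` are hypothetical. It assumes the sysfs file uses the kernel's CPU list format (comma-separated indices and ranges such as "0-3,8-11"). It is not a description of how glibc or musl actually compute the value.

```python
from pathlib import Path


def count_cpu_list(cpu_list: str) -> int:
    """Count the CPUs named in a kernel CPU list string such as "0-3,8-11"."""
    total = 0
    for chunk in cpu_list.strip().split(","):
        if not chunk:
            continue  # empty list, e.g. a file containing only a newline
        first, _, last = chunk.partition("-")
        # A bare index ("5") has no upper bound, so it counts as one CPU.
        total += int(last or first) - int(first) + 1
    return total


def count_cpu_dirs(sysfs_cpu: Path) -> int:
    """Count cpuN directories, similar to `echo .../cpu[0-9]* | wc -w`."""
    return sum(1 for p in sysfs_cpu.glob("cpu[0-9]*") if p.is_dir())


if __name__ == "__main__":
    base = Path("/sys/devices/system/cpu")
    possible = base / "possible"
    if possible.exists():
        print("possible:", count_cpu_list(possible.read_text()))
        print("cpuN dirs:", count_cpu_dirs(base))
```

On a machine where the two printed numbers differ, the gap is the ambiguity the quoted text points at: which of them _NPROCESSORS_CONF ought to report.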