Message-ID: <CAA-4+jdC+_82vjFhFUvC0HwdySV_VB7=xcmGNe+6EmRetOOUig@mail.gmail.com>
Date: Mon, 14 Mar 2016 13:24:11 +0900
From: Masanori Ogino <masanori.ogino@...il.com>
To: musl@...ts.openwall.com
Subject: Re: musl without atomic instructions?

2016-03-14 12:43 GMT+09:00 Rich Felker <dalias@...c.org>:
> On Mon, Mar 14, 2016 at 11:55:15AM +0900, Masanori Ogino wrote:
>> Well, it seems that I don't really understand vDSO.
>
> The way vdso works is that the kernel contains an image of a small ELF
> shared library file, and maps it into the virtual address space of
> each user process, and exposes its address as part of the "aux vector"
> that the dynamic linker or main program entry point receives and can
> process.
>
> While anything could be included in the vdso, normally what the kernel
> puts there are functions that allow userspace to bypass actually
> making a system call for some things that _can_ be done without a
> system call (no need for kernel privs) but where the _way_ to do them
> is only known by the kernel (e.g. hardware model specific, or
> dependent on memory structures the kernel writes and exposes to
> userspace but does not guarantee stability for). Some examples are
> time/gettimeofday/clock_gettime, getcpu, etc.
>
> If userspace chooses to use the vdso, it does symbol lookups in it
> using the same mechanisms used for dynamic library symbol lookup, then
> calls the resulting function instead of making a syscall.

OK, it is getting clear to me now. Thank you.

>> My current understanding is, vDSO make it possible that:
>>
>> 1. programs targeting without-A processors use syscalls on without-A
>> processors, and
>> 2. the programs use atomic instructions on with-A processors. (no
>> interruption, no context switching!)
>> (3. programs targeting with-A processors runs normally, without
>> calling such vDSO function)
>>
>> Is it correct? If so, it would be really nice.
>
> Even better.
>
> Indeed, a baseline vdso-based compare-and-swap for riscv would look
> like your above items 1 and 2, and item 3 if you build binaries that
> depend on a processor with the "A" option.
>
> But in the future, for non-SMP setups, case 1 could be replaced with a
> scheduler-based restart approach like pre-v6 ARM and SH3/SH4 use,
> yielding a huge performance boost (maybe around 100x speedup in
> locking/atomics). The way this works is that, when resuming a task
> that was preempted, the scheduler just has to check if the program
> counter is in the cas function in the vdso. If so, it resets the
> program counter to the start of that function before resuming
> userspace. At one point there was a good article on how the ARM
> implementation of this works, but I can't find it right now.

Fantastic! I will append this to the work list. It is really worthwhile
to work on.

--
Masanori Ogino
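
For reference, the first step of the vdso lookup described in the quoted
text (finding the address the kernel exposes through the aux vector) can
be sketched roughly as below. This is only a minimal sketch, assuming
Linux and a libc that provides getauxval() in <sys/auxv.h>. The helper
names vdso_base and looks_like_elf are made up for illustration, and the
symbol lookup that would follow is not shown.

```c
#include <elf.h>       /* ELFMAG, SELFMAG */
#include <stddef.h>    /* NULL */
#include <string.h>    /* memcmp */
#include <sys/auxv.h>  /* getauxval, AT_SYSINFO_EHDR */

/* Hypothetical helper: return the address at which the kernel mapped
 * the vdso image, taken from the AT_SYSINFO_EHDR aux vector entry.
 * getauxval() returns 0 when the entry is absent, so the result is
 * NULL if the kernel did not provide a vdso. */
const void *vdso_base(void)
{
    return (const void *)getauxval(AT_SYSINFO_EHDR);
}

/* Hypothetical helper: check that a mapping starts with the ELF magic
 * bytes, as expected of a small ELF shared library image. */
int looks_like_elf(const void *p)
{
    return p != NULL && memcmp(p, ELFMAG, SELFMAG) == 0;
}
```

A caller would pass the result of vdso_base() to looks_like_elf() and,
if it returns nonzero, go on to walk the ELF dynamic section to resolve
symbols, which is the part this sketch leaves out.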
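
The scheduler-based restart approach described in the quoted text can be
modelled in a few lines. This is an illustrative userspace model only,
not kernel code and not how any particular architecture implements it.
The names cas_region, resume_pc and cas_sketch are hypothetical, and the
addresses are plain integers standing in for a saved program counter.

```c
#include <stdint.h>

/* Hypothetical description of where the restartable cas code lives. */
struct cas_region {
    uintptr_t start;  /* address of the first instruction */
    uintptr_t end;    /* one past the last restartable instruction */
};

/* Model of the check made when resuming a preempted task: if the saved
 * program counter lies inside the cas region, rewind it to the start
 * of the region; otherwise resume where the task left off. */
uintptr_t resume_pc(const struct cas_region *r, uintptr_t saved_pc)
{
    if (saved_pc >= r->start && saved_pc < r->end)
        return r->start;
    return saved_pc;
}

/* Model of the cas body: a plain load, compare and store. It is not
 * atomic by itself. On a non-SMP system it behaves atomically only
 * because a preempted run is restarted from the beginning, so the load
 * and the store are never separated by another task's update. */
int cas_sketch(volatile int *p, int expected, int desired)
{
    int old = *p;
    if (old == expected)
        *p = desired;
    return old;  /* caller compares this with expected to detect success */
}
```

Note that the restart is only harmless while the store has not yet
happened, which is why the model treats `end` as the end of the
restartable instructions rather than of the whole function.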