Message-Id: <1587004907.ioxh0bxsln.astroid@bobo.none>
Date: Thu, 16 Apr 2020 12:53:31 +1000
From: Nicholas Piggin <npiggin@...il.com>
To: Rich Felker <dalias@...c.org>
Cc: libc-alpha@...rceware.org, libc-dev@...ts.llvm.org,
	linuxppc-dev@...ts.ozlabs.org, musl@...ts.openwall.com, Segher Boessenkool
	<segher@...nel.crashing.org>
Subject: Re: Powerpc Linux 'scv' system call ABI proposal take 2

Excerpts from Rich Felker's message of April 16, 2020 12:35 pm:
> On Thu, Apr 16, 2020 at 12:24:16PM +1000, Nicholas Piggin wrote:
>> >> > Likewise, it's not useful to have different error return mechanisms
>> >> > because the caller just has to branch to support both (or the
>> >> > kernel-provided stub just has to emulate one for it; that could work
>> >> > if you really want to change the bad existing convention).
>> >> > 
>> >> > Thoughts?
>> >> 
>> >> The existing convention has to change somewhat because of the clobbers,
>> >> so I thought we could change the error return at the same time. I'm
>> >> open to not changing it and using CR0[SO], but others liked the idea.
>> >> Pro: it matches sc and vsyscall. Con: it's different from other common
>> >> archs. Performance-wise it would really be a wash -- cost of conditional
>> >> branch is not the cmp but the mispredict.
>> > 
>> > If you do the branch on hwcap at each syscall, then you significantly
>> > increase code size of every syscall point, likely turning a bunch of
>> > trivial functions that didn't need stack frames into ones that do. You
>> > also potentially make them need a TOC pointer. Making them all just do
>> > an indirect call unconditionally (with pointer in TLS like i386?) is a
>> > lot more efficient in code size and at least as good for performance.
>> 
>> I disagree. Doing the long vdso indirect call *necessarily* requires
>> touching a new icache line, and even a new TLB entry. Indirect branches
> 
> The increase in number of icache lines from the branch at every
> syscall point is far greater than the use of a single extra icache
> line shared by all syscalls.

That's true, I was thinking of a single function that does the test and 
calls syscalls, which might be the fair comparison.
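To make that comparison concrete, here is a rough sketch of the "single
function that does the test" variant. Nothing in it comes from the
proposal: the hyp_* names are hypothetical, the feature bit value is an
assumption for illustration, and the two raw paths are placeholders that
go through syscall(2) where a real libc would use inline asm for 'sc' or
'scv 0'.

```c
#define _GNU_SOURCE
#include <sys/auxv.h>     /* getauxval, AT_HWCAP2 */
#include <sys/syscall.h>  /* SYS_getpid */
#include <unistd.h>       /* syscall, getpid */

/* Assumption: the bit in AT_HWCAP2 that would advertise scv support.
 * On non-powerpc targets this bit means something else entirely. */
#define HYP_HWCAP2_SCV 0x00100000UL

/* Placeholders for the two entry mechanisms. */
static long hyp_raw_sc(long nr)  { return syscall(nr); }
static long hyp_raw_scv(long nr) { return syscall(nr); }

/* Cached copy of AT_HWCAP2, filled in once at startup. */
static unsigned long hyp_hwcap2;

void hyp_syscall_init(void)
{
    hyp_hwcap2 = getauxval(AT_HWCAP2);
}

/* One shared out-of-line function: the test and both paths live here,
 * so call sites only pay for a direct call. */
long hyp_syscall0(long nr)
{
    if (hyp_hwcap2 & HYP_HWCAP2_SCV)
        return hyp_raw_scv(nr);
    return hyp_raw_sc(nr);
}
```

The conditional branch is in one place and should predict well, at the
cost of one extra call level per syscall.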

> Not to mention the dcache line to access
> __hwcap or whatever, and the icache lines to set up TOC-relative
> access to it. (Of course you could put a copy of its value in TLS at a
> fixed offset, which would somewhat mitigate both.)
> 
>> And finally, the HWCAP test can eventually go away in future. A vdso
>> call can not.
> 
> We support nearly arbitrarily old kernels (with limited functionality)
> and hardware (with full functionality) and don't intend for that to
> change, ever. But indeed glibc might want to eventually drop the
> check.

Ah, cool. Any build-time flexibility there?

We may or may not be getting a new ABI that will use instructions not 
supported by old processors.

https://sourceware.org/legacy-ml/binutils/2019-05/msg00331.html

The current ABI will of course continue to work and remain the default
for some time, but building for the new one would give an opportunity
to drop support for old processors, at least for glibc.

> 
>> If you really want to select with an indirect branch rather than
>> direct conditional, you can do that all within the library.
> 
> OK. It's a little bit more work if that's not the interface the kernel
> will give us, but it's no big deal.

Okay.
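
For reference, doing the selection inside the library with an indirect
branch could look roughly like the sketch below. Again this is only an
illustration under assumptions: the hyp_ind_* names and the feature bit
value are hypothetical, and both paths are placeholders built on
syscall(2) instead of real 'sc' / 'scv 0' sequences.

```c
#define _GNU_SOURCE
#include <sys/auxv.h>     /* getauxval, AT_HWCAP2 */
#include <sys/syscall.h>  /* SYS_getpid */
#include <unistd.h>       /* syscall, getpid */

/* Assumption: the bit in AT_HWCAP2 that would advertise scv support. */
#define HYP_IND_HWCAP2_SCV 0x00100000UL

typedef long (*hyp_ind_fn)(long nr);

/* Placeholders for the two entry mechanisms. */
static long hyp_ind_sc(long nr)  { return syscall(nr); }
static long hyp_ind_scv(long nr) { return syscall(nr); }

/* Defaults to the legacy path so calls made before selection still work. */
static hyp_ind_fn hyp_ind_entry = hyp_ind_sc;

/* Run once at startup: pick the entry point based on the hwcap bit. */
void hyp_ind_select(void)
{
    if (getauxval(AT_HWCAP2) & HYP_IND_HWCAP2_SCV)
        hyp_ind_entry = hyp_ind_scv;
}

/* Every syscall goes through the pointer; no per-call hwcap test. */
long hyp_ind_syscall0(long nr)
{
    return hyp_ind_entry(nr);
}
```

The hwcap check happens once, and the kernel does not need to provide
the stub for this to work.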

Thanks,
Nick
