Message-ID: <CAGWvnymFQ69Eh4Ji0DP5qv7f6LhTyxC9m4OSfhPuD4B1yTsP-Q@mail.gmail.com>
Date: Mon, 6 Dec 2021 20:15:48 -0500
From: David Edelsohn <dje.gcc@...il.com>
To: musl@...ts.openwall.com
Cc: Florian Weimer <fweimer@...hat.com>, Stijn Tintel <stijn@...ux-ipv6.be>
Subject: Re: [PATCH] ppc64: check for AltiVec in setjmp/longjmp

On Mon, Dec 6, 2021 at 7:59 PM Rich Felker <dalias@...c.org> wrote:
>
> On Tue, Dec 07, 2021 at 01:37:12AM +0100, Florian Weimer wrote:
> > * Stijn Tintel:
> >
> > > diff --git a/src/setjmp/powerpc64/setjmp.s b/src/setjmp/powerpc64/setjmp.s
> > > index 37683fda..32853693 100644
> > > --- a/src/setjmp/powerpc64/setjmp.s
> > > +++ b/src/setjmp/powerpc64/setjmp.s
> > > @@ -69,7 +69,17 @@ __setjmp_toc:
> > >     stfd 30, 38*8(3)
> > >     stfd 31, 39*8(3)
> > >
> > > -   # 5) store vector registers v20-v31
> > > +   # 5) store vector registers v20-v31 if hardware supports AltiVec
> > > +   mflr 0
> > > +   bl 1f
> > > +   .hidden __hwcap
> > > +   .long __hwcap-.
> > > +1: mflr 4
> >
> > This de-balances the return stack and probably has quite severe
> > performance impact.  The ISA manual says to use
> >
> >   bcl 20,31,$+4
> >
> > and you'll have to store the __hwcap offset somewhere else.
>
> To begin with, let's change the .s files to .S files and put the whole
> branch logic inside #ifndef __ALTIVEC__ so that it does not impact
> normal builds with an ISA level where Altivec can be assumed to be
> present.
>
> I'm not sufficiently familiar with the PowerPC ISA to know how bcl
> works, but if there's a less expensive solution along those lines
> that's compatible with all ISA levels, by all means let's use it. The
> same could be done for powerpc-sf (32-bit) and its SPE branches, too.

bl = branch and link
bcl = branch conditional and link

"Link" means that the address of the next instruction is placed in the
link register.  Normally a branch and link is paired with a matching
"return" instruction, but in this case it is used only to compute a
position-independent code address.  As Florian correctly points out, the
"bl" will unbalance the link stack that the processor uses to predict
return addresses, and the recommended sequence is the one he suggests:

bcl 20,31,addr

which means branch always and, because the condition register bits are
then irrelevant, this particular operand combination is a special value
that instructs the processor not to push the address onto the link stack,
so that the "calls" and "returns" remain matched.
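
To make the two numeric operands concrete, here is a minimal C sketch
(not part of musl or of the patch) that packs them into the B-form
instruction word.  The helper names encode_bc and bo_is_branch_always
are hypothetical, and the field layout is my reading of the Power ISA
branch-conditional format, so treat it as an illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: encode a PowerPC B-form conditional branch
 * (primary opcode 16).
 *   bo, bi: the two numeric operands as written in "bcl BO,BI,target"
 *   offset: byte displacement from the branch to its target
 *   lk:     1 for the "and link" form (bcl), 0 for plain bc */
static uint32_t encode_bc(unsigned bo, unsigned bi, int32_t offset, unsigned lk)
{
    assert(offset % 4 == 0);              /* instructions are word aligned */
    return (16u << 26)                    /* primary opcode */
         | ((bo & 0x1fu) << 21)           /* BO: branch options */
         | ((bi & 0x1fu) << 16)           /* BI: condition register bit */
         | ((uint32_t)offset & 0xfffcu)   /* BD: 14-bit word displacement */
         | (lk & 1u);                     /* LK; the AA bit stays 0 (relative) */
}

/* Hypothetical helper: BO of the form 1z1zz (binary) means "branch always",
 * so only the two fixed bits need to be tested. */
static int bo_is_branch_always(unsigned bo)
{
    return (bo & 0x14u) == 0x14u;
}
```

For example, encode_bc(20, 31, 4, 1) yields 0x429f0005, the instruction
word for "bcl 20,31,$+4", and bo_is_branch_always(20) is true, which is
why the bit selected by the second operand is never examined.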

Thanks, David
