Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <9293be85241d49c182e614ffd7186bca@AcuMS.aculab.com>
Date: Thu, 6 Feb 2020 16:27:46 +0000
From: David Laight <David.Laight@...LAB.COM>
To: 'Jann Horn' <jannh@...gle.com>, Kees Cook <keescook@...omium.org>
CC: Kristen Carlson Accardi <kristen@...ux.intel.com>, Thomas Gleixner
	<tglx@...utronix.de>, Ingo Molnar <mingo@...hat.com>, Borislav Petkov
	<bp@...en8.de>, "H . Peter Anvin" <hpa@...or.com>, Arjan van de Ven
	<arjan@...ux.intel.com>, Rick Edgecombe <rick.p.edgecombe@...el.com>, "the
 arch/x86 maintainers" <x86@...nel.org>, kernel list
	<linux-kernel@...r.kernel.org>, Kernel Hardening
	<kernel-hardening@...ts.openwall.com>
Subject: RE: [RFC PATCH 06/11] x86: make sure _etext includes function
 sections

From: Jann Horn
> Sent: 06 February 2020 13:16
...
> > I cannot find evidence for
> > what function start alignment should be.
> 
> There is no architecturally required alignment for functions, but
> Intel's Optimization Manual
> (<https://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-
> optimization-manual.pdf>)
> recommends in section 3.4.1.5, "Code Alignment":
> 
> | Assembly/Compiler Coding Rule 12. (M impact, H generality)
> | All branch targets should be 16-byte aligned.
> 
> AFAIK this is recommended because, as documented in section 2.3.2.1,
> "Legacy Decode Pipeline" (describing the frontend of Sandy Bridge, and
> used as the base for newer microarchitectures):
> 
> | An instruction fetch is a 16-byte aligned lookup through the ITLB
> and into the instruction cache.
> | The instruction cache can deliver every cycle 16 bytes to the
> instruction pre-decoder.
> 
> AFAIK this means that if a branch ends close to the end of a 16-byte
> block, the frontend is less efficient because it may have to run two
> instruction fetches before the first instruction can even be decoded.

See also The microarchitecture of Intel, AMD and VIA CPUs from www.agner.org/optimize 

My suspicion is that reducing the cache size (so more code fits in)
will almost always be a win over aligning branch targets and entry points.
If the alignment of a function matters then there are probably other
changes to that bit of code that will give a larger benefit.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

Powered by blists - more mailing lists

Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.