Message-ID: <20191215185125.GB1666@brightrain.aerifal.cx>
Date: Sun, 15 Dec 2019 13:51:25 -0500
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: Re: max_align_t mess on i386

On Sun, Dec 15, 2019 at 07:23:14PM +0100, Joakim Sindholt wrote:
> On Sun, Dec 15, 2019 at 01:06:29PM -0500, Jeffrey Walton wrote:
> > On Sat, Dec 14, 2019 at 10:19 AM Rich Felker <dalias@...c.org> wrote:
> > >
> > > In researching how much memory could be saved, and how practical it
> > > would be, for the new malloc to align only to 8-byte boundaries
> > > instead of 16-byte on archs where alignof(max_align_t) is 8 (pretty
> > > much all 32-bit archs), I discovered that GCC quietly changed its
> > > idea of i386 max_align_t to 16-byte alignment in GCC 7, to better
> > > accommodate the new _Float128 access via SSE. Presumably (I haven't
> > > checked) the change is reflected with changes in the psABI document to
> > > make it "official".
> >
> > Be careful with policy changes like this. The malloc (3) man page says:
> >
> >     The malloc() and calloc() functions return a pointer to the
> >     allocated memory that is suitably aligned for any kind of variable.
>
> Your man pages are not the standard, but the standard does have this to
> say:
> > The pointer returned if the allocation succeeds shall be suitably
> > aligned so that it may be assigned to a pointer to any type of object
> > and then used to access such an object in the space allocated (until the
> > space is explicitly freed or reallocated).
> To me this sounds like my next suggestion is technically disallowed.
>
> > I expect to be able to use a pointer returned by malloc (and friends)
> > in MMX, SSE and AVX functions.
>
> I might agree, but would it not be feasible to have the alignment of the
> returned pointer be dependent on the size of the allocation? That way,
> if you allocate <16 bytes you can get 8 byte alignment. You might even
> be able to go all the way down to 4 byte alignment for <8 byte
> allocations.

This is a nice idea, and the bump allocator (simple_malloc) in musl for
static-linked programs that don't use free does pretty much exactly
that. With a nontrivial allocator it gets more complicated though, and
I don't think there's any way to take advantage of this with the new
malloc.

For example, in the new allocator with 4-byte inband slot headers,
16-byte slots don't need 16-byte alignment because the largest object
they can hold is 12 bytes, and the largest alignment such an object
can need is 8-byte. However, since they're spaced 16 bytes apart,
there's no advantage to being able to misalign them mod 16; as long as
the first one in a run is aligned, all of them are. The same would
apply if we had 8-byte slots, but those are mostly uninteresting with
4 bytes taken for headers.

Taking advantage of it with dlmalloc-type designs that don't involve
evenly-spaced slots is perhaps more practical, but can lead to messy
split/merge since the small underaligned chunks aren't starting on
valid boundaries to merge with adjacent free chunks. I think they'll
tend to eventually get tied up as unusable space at the bottom of
adjacent chunks, unnecessarily limiting the size of the allocations
just below them.

> It might violate the standard technically speaking, but I don't know of
> any examples of types smaller than 16 bytes that require 16 byte
> alignment.

It doesn't, since no object can have size smaller than its alignment.
(As long as pointer types aren't lossy; if some pointer types lost low
bits, then it would be non-conforming.)

Rich
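
For illustration only, here is a minimal sketch of the size-dependent
alignment idea in a bump allocator. It is not musl's simple_malloc;
the names align_for_size, bump_alloc, pool and pool_used are all
hypothetical, and the 16-byte cap and 4096-byte pool are assumptions
chosen for the example. It relies on the point made above: an object
can never be smaller than its alignment, so an allocation of n bytes
needs at most the largest power of two not exceeding n.

```c
#include <assert.h>
#include <stddef.h>

/* sizeof(T) is always a multiple of _Alignof(T); checked here for one type. */
static_assert(sizeof(long double) % _Alignof(long double) == 0,
              "object size is a multiple of its alignment");

/* Hypothetical helper: largest power-of-two alignment, capped at
   max_align, that any object fitting in n bytes could require. */
static size_t align_for_size(size_t n, size_t max_align)
{
    size_t a = 1;
    while (a < max_align && a * 2 <= n)
        a *= 2;
    return a;
}

/* Hypothetical fixed pool; the pool itself is 16-byte aligned. */
static _Alignas(16) unsigned char pool[4096];
static size_t pool_used;

/* Bump allocation with no free: align each request only as much as
   its size can demand, then advance the high-water mark. */
static void *bump_alloc(size_t n)
{
    size_t a = align_for_size(n, 16);
    size_t start = (pool_used + a - 1) & ~(a - 1); /* round up to a */
    if (n > sizeof pool - start)
        return NULL;                               /* pool exhausted */
    pool_used = start + n;
    return pool + start;
}
```

Under these assumptions, a 12-byte request is placed on an 8-byte
boundary and only requests of 16 bytes or more are placed on a 16-byte
boundary, which matches the 12-byte case described above.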