Message-ID: <20220920021511.GH9709@brightrain.aerifal.cx>
Date: Mon, 19 Sep 2022 22:15:12 -0400
From: Rich Felker <dalias@...c.org>
To: baiyang <baiyang@...il.com>
Cc: musl <musl@...ts.openwall.com>
Subject: Re: Re: The heap memory performance (malloc/free/realloc) is significantly degraded in musl 1.2 (compared to 1.1)

On Tue, Sep 20, 2022 at 09:18:04AM +0800, baiyang wrote:
> > There is no hidden "size actually allocated internally". The size you
> > get is the size you requested. Everything else is allocator data
> > structures *outside of the object* that the caller has no entitlement
> > to peek or poke at, and malloc_usable_size's return value reflects
> > that.
>
> If I understand correctly, according to the definition of size_classes in the mallocng code:
> 1. When I call `void* p = malloc(6600)`, mallocng actually allocates
> more than 8100 bytes of usable space, right?

No, it uses space from a size-class-8176 group (~=slab) to produce an
allocation of size 6600. The *allocation* is the part that belongs to
the caller. Everything else is part of the allocator data structures.

> 2. According to your previous explanation, calling
> malloc_usable_size(p) at this time returns 6600, right?

Yes.

> My question is, if malloc_usable_size(p) can directly return 8191
> (or similar actual allocated size, as other libc do) instead of
> 6600, is it possible to make mallocng achieve higher performance
> both in time and space?

No, and the reason you said you want it to does not make sense. You
seem to think that if the group stride was 8100, calling realloc might
memcpy up to 8100 bytes. This is not the case. If realloc has to
allocate a new object, the amount copied will be 6600 or exactly
whatever the allocated object size was (or the new size, if smaller).
This is the only meaningful number.

You also seem to be under the impression that the work to determine
that the size was 6600 and not 8100 is where most (or at least a
significant portion of) the time is spent. This is also not the case.
The majority of the metadata processing time is chasing pointers back
to the out-of-band metadata, validating it, validating that it
round-trips back, and validating various other things. Some of these
could in principle be omitted at the cost of loss-of-hardening.
Figuring out that the allocation is 6600 bytes, once you already know
the size class and out-of-band metadata, is quite trivial and hardly
takes any of the time. (It also has a few validation checks that could
be omitted at the cost of loss of hardening, but these are
proportionally much smaller.)

Rich
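
To make the copy-size point concrete, here is a minimal sketch in C. It is a toy model under stated assumptions, not mallocng's actual code or data structures: `struct toy_obj`, `toy_copy_len`, and `toy_usable_size` are hypothetical names introduced only for illustration, and the 6600/8176 figures are the ones from the exchange above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy model (NOT mallocng's real layout): an object
 * remembers the size the caller requested, while the slot it lives
 * in has a larger, fixed stride determined by its size class. */
struct toy_obj {
    size_t requested;  /* size passed to the allocator, e.g. 6600 */
    size_t stride;     /* slot size of the size class, e.g. 8176  */
};

/* Bytes a realloc-style move would have to copy: the old requested
 * size or the new size, whichever is smaller. The stride is only
 * checked as an invariant; it never affects the result. */
size_t toy_copy_len(const struct toy_obj *o, size_t new_size)
{
    assert(o->requested <= o->stride);
    return o->requested < new_size ? o->requested : new_size;
}

/* What a usable-size query reports in this model: the requested
 * size, not the slot stride. */
size_t toy_usable_size(const struct toy_obj *o)
{
    return o->requested;
}
```

In this model, a 6600-byte object in an 8176-byte slot copies 6600 bytes when grown to 10000 and 100 bytes when shrunk to 100; the slack between 6600 and 8176 is never part of the copy.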