musl - Re: mallocng progress and growth chart

Message-ID: <3bc9cd46-4deb-819d-3677-7c2f5ffe237a@wwcom.ch>
Date: Thu, 4 Jun 2020 09:04:47 +0200
From: Pirmin Walthert <pirmin.walthert@...om.ch>
To: musl@...ts.openwall.com
Subject: Re: mallocng progress and growth chart


On 25.05.20 at 20:13, Pirmin Walthert wrote:
> On 25.05.20 at 19:54, Rich Felker wrote:
>
>> On Mon, May 25, 2020 at 05:45:33PM +0200, Pirmin Walthert wrote:
>>> On 18.05.20 at 20:53, Rich Felker wrote:
>>>
>>>> On Sat, May 16, 2020 at 11:30:25PM -0400, Rich Felker wrote:
>>>>> Another alternative for avoiding eager commit at low usage, which
>>>>> works for all but nommu: when adding groups with nontrivial slot
>>>>> count at low usage, don't activate all the slots right away. Reserve
>>>>> vm space for 7 slots for a 7x4672, but only unprotect the first 2
>>>>> pages, and treat it as a group of just 1 slot until there are no
>>>>> slots free and one is needed. Then, unprotect another page (or more
>>>>> if needed to fit another slot, as would be needed at larger sizes)
>>>>> and adjust the slot count to match. (Conceptually;
>>>>> implementation-wise, the slot count would be fixed, and there would
>>>>> just be a limit on the number of slots made available when
>>>>> transformed from "freed" to "available" for activation.)
>>>>>
>>>>> Note that this is what happens anyway with physical memory as clean
>>>>> anonymous pages are first touched, but (1) doing it without explicit
>>>>> unprotect over-counts the not-yet-used slots for commit charge
>>>>> purposes and breaks tightly-memory-constrained environments (global
>>>>> commit limit or cgroup) and (2) when all slots are initially
>>>>> available as they are now, repeated free/malloc cycles for the same
>>>>> size will round-robin all the slots, touching them all.
>>>>>
>>>>> Here, property (2) is of course desirable for hardening at moderate
>>>>> to high usage, but at low usage UAF tends to be less of a concern
>>>>> (because you don't have complex data structures with complex
>>>>> lifetimes if you hardly have any malloc).
>>>>>
>>>>> Note also that (2) could be solved without addressing (1) just by
>>>>> skipping the protection aspect of this idea and only using the
>>>>> available-slot-limiting part.
>>>> One abstract way of thinking about the above is that it's just a
>>>> per-size-class bump allocator, pre-reserving enough virtual address
>>>> space to end sufficiently close to a page boundary that there's no
>>>> significant memory waste. This is actually fairly elegant, and might
>>>> obsolete some of the other measures taken to avoid overly eager
>>>> allocation. So this might be a worthwhile direction to pursue.
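
The reserve-then-unprotect idea described above can be sketched in C
roughly as follows. This is a minimal illustration under stated
assumptions, not mallocng code: the names lazy_group, lazy_group_init and
lazy_group_next are hypothetical, the real allocator's group header and
out-of-band metadata are ignored, and overflow checks are omitted.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical structure for illustration; not mallocng's layout. */
struct lazy_group {
    unsigned char *base; /* start of the reserved region */
    size_t slot_size;    /* bytes per slot */
    size_t slot_count;   /* slots reserved; fixed for the group's lifetime */
    size_t active;       /* slots handed out so far (the bump pointer) */
    size_t committed;    /* bytes currently readable and writable */
    size_t reserved;     /* total bytes of address space reserved */
    size_t pagesize;     /* cached system page size */
};

static size_t round_up(size_t n, size_t pagesize)
{
    return (n + pagesize - 1) / pagesize * pagesize;
}

/* Reserve address space for every slot, but make none of it accessible,
 * so nothing counts against commit charge yet. */
int lazy_group_init(struct lazy_group *g, size_t slot_size, size_t slot_count)
{
    assert(slot_size > 0 && slot_count > 0);
    size_t ps = (size_t)sysconf(_SC_PAGESIZE);
    size_t len = round_up(slot_size * slot_count, ps);
    void *p = mmap(NULL, len, PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;
    g->base = p;
    g->slot_size = slot_size;
    g->slot_count = slot_count;
    g->active = 0;
    g->committed = 0;
    g->reserved = len;
    g->pagesize = ps;
    return 0;
}

/* Activate the next slot, unprotecting only as many extra pages as are
 * needed to cover it. Returns NULL when all slots are active or when
 * mprotect fails. */
void *lazy_group_next(struct lazy_group *g)
{
    if (g->active == g->slot_count)
        return NULL;
    size_t need = round_up((g->active + 1) * g->slot_size, g->pagesize);
    if (need > g->committed) {
        if (mprotect(g->base + g->committed, need - g->committed,
                     PROT_READ | PROT_WRITE) != 0)
            return NULL;
        g->committed = need;
    }
    return g->base + g->active++ * g->slot_size;
}
```

With a 4096-byte page size, a 7x4672 group reserves 8 pages; the first
call unprotects 2 pages and each later call unprotects one more, which
matches the numbers in the quoted description.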
>>> Dear Rich,
>>>
>>> Currently we use mallocng in production for most applications in our
>>> "embedded-like" virtualised system setups. It even helped us find
>>> some bugs (for example in asterisk), as mallocng is less forgiving
>>> than the old malloc implementation. So if you're interested in
>>> real-world feedback: everything seems to be running quite smoothly so
>>> far. Thanks for this great work.
>>>
>>> Currently we use the git version of April 24th, so the version
>>> before you merged the huge optimization changes. Since you mentioned
>>> in your "brainstorming mails", if I understood them correctly, that
>>> you might rethink a few of these changes, I'd like to ask: do you
>>> think it would be better to use the current git-master version rather
>>> than the version of April 24th (we are not THAT memory constrained,
>>> so stability is the most important thing), or would it be better to
>>> stick with the old version and wait for the next changes to be
>>> merged?
>> Thanks for the feedback!
>>
>> Which are the "huge optimization changes" you're wondering about?
>> Indeed there's a large series of commits after the version you're
>> using but I think you're possibly misattributing them.
>>
>> A number of the commits are bug fixes -- mostly not for hard bugs, but
>> for unwanted and unintended behaviors:
>>
>> a709dde fix unexpected allocation of 7x144 group in non-power-of-two slot
>> dda5a88 fix exact size tracking in memalign
>> 915a914 adjust several size classes to fix nested groups of non-power-of-2 size
>> 7acd61e allow in-place realloc when ideal size class is off-by-one
>> caca917 add support for aligned_alloc alignments 1M and over
>>
>> There were also quite a few around an idea that didn't go well and was
>> mostly reverted, but with major improvements to the original behavior:
>>
>> 5bff93c overhaul bounce counter to work with map sizes instead of size classes
>> 71262cd tune bounce counter to avoid triggering early
>> 9601aaa prevent overflow of unmap counter part of bounce counter
>> aca1f32 don't let the mmap cache limit grow unboundedly or overflow
>> 6fbee31 second partial overhaul of bounce counter system
>> 150de6e revert from map cache to old okay_to_free scheme, but improved
>> 1e972da initial conversion of bounce counting to use sequence numbers, decay
>> e3eecb1 factor bounce/sequence counter logic into meta.h
>> 6693738 account seq for individually-mmapped allocations above hard threshold
>> 4443f64 fix complete regression (malloc always fails) on variable-pagesize archs
>>
>> If you don't care about low usage, that whole change series is fairly
>> unimportant, but should be harmless. It just changes decisions about
>> choices where either choice produces a valid state for the allocator
>> but there are tradeoffs between memory usage and performance. The new
>> behavior should be better, though.
>>
>> A few commits were reordering the dependency between memalign and the
>> standard memalign-variant functions, which is a minor namespace
>> detail:
>>
>> da4c88e rename aligned_alloc.c
>> 04407f7 reverse dependency order of memalign and aligned_alloc
>> 74e6657 rename aligned_alloc source file back to its proper name
>> c990cb1 rename memalign source file back to its proper name
>>
>> A couple were hardening:
>>
>> 5bf4e92 clear group header pointer to meta when freeing groups
>> bd04c75 in get_meta, check offset against maplen (minor hardening)
>> 77cea57 add support for allocating meta areas via legacy brk
>>
>> And pretty much all the rest of the changes are tuning behavior for
>> "optimization" of some sort or another, which may be what you were
>> referring to:
>>
>> 26143c4 limit slot count growth to 25% instead of 50% of current usage in class
>> a9187f0 remove unnecessary optimization tuning flags from Makefile CFLAGS
>> 045cc04 move coarse size classing logic to malloc fast path
>> 8348a82 eliminate med_twos_tab
>> e619034 allow slot count 1 for size classes 3 mod 4 as natural/first-class
>> c9d54f4 activate coarse size classing for small classes down to 4 (but not 6)
>> 44092d8 improve individual-mmap decision
>> d355eaf remove slot count reduction to 1 for size classes 1 mod 3
>> c555ebe fix off-by-one in logic to use single-slot groups
>> 9d5ec34 switch from MADV_DONTNEED to MADV_FREE for large free slots
>> 584c7aa avoid over-use of reduced-count groups due to coarse size classing
>> f9bfb0a increase threshold for 3->2 slot reduction to 16 pages
>> 20da09e disable coarse size classing for large classes (over 8k)
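
Commit 9d5ec34 in the list above changes the advice given to the kernel
for large free slots. As a rough illustration of the difference at the
system-call level: on Linux, MADV_DONTNEED drops anonymous private pages
immediately (a later read sees zero-filled pages), while MADV_FREE
(Linux 4.5 and later) lets the kernel reclaim them lazily under memory
pressure, so a later read may see either the old contents or zeros. The
sketch below is not the mallocng code; release_free_range is a
hypothetical helper, and the fallback to MADV_DONTNEED is an assumption
of this sketch.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper: tell the kernel that a page-aligned, currently
 * unused range no longer needs its contents. Returns 0 on success and
 * -1 on failure, like madvise itself. */
int release_free_range(void *addr, size_t len)
{
    assert(addr != NULL && len > 0);
#ifdef MADV_FREE
    /* Preferred: lazy reclaim, cheap if the range is reused soon. */
    if (madvise(addr, len, MADV_FREE) == 0)
        return 0;
    /* EINVAL typically means the running kernel lacks MADV_FREE. */
    if (errno != EINVAL)
        return -1;
#endif
    /* Assumed fallback: drop the pages immediately. */
    return madvise(addr, len, MADV_DONTNEED);
}
```

After either call the range stays mapped and can be written again
without any further system call; only its previous contents are no
longer guaranteed.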
>>
>> I don't think any of these changes are potentially obsoleted by
>> further ideas in the above thread. I am working on delaying activation
>> of slots until they're actually needed, so that we don't dirty pages
>> we could avoid touching, but I proposed this as an alternative to
>> other more complex tricks that I didn't really like, which have not
>> been implemented and probably won't be now.
>>
>> So, in summary, I don't see any good reason not to go with latest.
>>
>> Rich
>
> Many thanks for your detailed answer. I'll give it a try then!
>
> Pirmin
>
FYI: everything has been working without any issues so far on about 150
systems (using the most current version from 27th of May). We even got
around an OOM issue during a DoS that could be reproduced with the old
malloc implementation (maybe because of the fragmentation issue?). Server
software running on these systems: lighttpd, php-fpm 7.3, asterisk 16,
slapd, isc-dhcpd, dropbear

Pirmin