musl - Re: mallocng progress and growth chart

Message-ID: <20200525175446.GR1079@brightrain.aerifal.cx>
Date: Mon, 25 May 2020 13:54:46 -0400
From: Rich Felker <dalias@...c.org>
To: Pirmin Walthert <pirmin.walthert@...om.ch>
Cc: musl@...ts.openwall.com
Subject: Re: mallocng progress and growth chart

On Mon, May 25, 2020 at 05:45:33PM +0200, Pirmin Walthert wrote:
> Am 18.05.20 um 20:53 schrieb Rich Felker:
> 
> >On Sat, May 16, 2020 at 11:30:25PM -0400, Rich Felker wrote:
> >>Another alternative for avoiding eager commit at low usage, which
> >>works for all but nommu: when adding groups with nontrivial slot count
> >>at low usage, don't activate all the slots right away. Reserve vm
> >>space for 7 slots for a 7x4672, but only unprotect the first 2 pages,
> >>and treat it as a group of just 1 slot until there are no slots free
> >>and one is needed. Then, unprotect another page (or more if needed to
> >>fit another slot, as would be needed at larger sizes) and adjust the
> >>slot count to match. (Conceptually; implementation-wise, the slot
> >>count would be fixed, and there would just be a limit on the number of
> >>slots made available when transformed from "freed" to "available" for
> >>activation.)
> >>
> >>Note that this is what happens anyway with physical memory as clean
> >>anonymous pages are first touched, but (1) doing it without explicit
> >>unprotect over-counts the not-yet-used slots for commit charge
> >>purposes and breaks tightly-memory-constrained environments (global
> >>commit limit or cgroup) and (2) when all slots are initially available
> >>as they are now, repeated free/malloc cycles for the same size will
> >>round-robin all the slots, touching them all.
> >>
> >>Here, property (2) is of course desirable for hardening at moderate to
> >>high usage, but at low usage UAF tends to be less of a concern
> >>(because you don't have complex data structures with complex lifetimes
> >>if you hardly have any malloc).
> >>
> >>Note also that (2) could be solved without addressing (1) just by
> >>skipping the protection aspect of this idea and only using the
> >>available-slot-limiting part.
> >One abstract way of thinking about the above is that it's just a
> >per-size-class bump allocator, pre-reserving enough virtual address
> >space to end sufficiently close to a page boundary that there's no
> >significant memory waste. This is actually fairly elegant, and might
> >obsolete some of the other measures taken to avoid overly eager
> >allocation. So this might be a worthwhile direction to pursue.
> 
> Dear Rich,
> 
> Currently we use mallocng in production for most applications in our
> "embedded like" virtualised system setups, it even helped to find
> some bugs (for example in asterisk) as mallocng was less forgiving
> than the old malloc implementation. So if you're interested in real
> world feedback: everything seems to be running quite smoothly so
> far, thanks for this great work.
> 
> Currently we use the git version of April 24th, so the version
> before you merged the huge optimization changes. As you mentioned in
> your "brainstorming mails", if I got them right, that you might
> rethink a few of these changes, I'd like to ask: do you think it
> would be better to use the current git-master version rather than
> the version of April 24th (we are not THAT memory constrained, so
> stability is the most important thing) or do you think it would be
> better to stick on the old version and wait for the next changes to
> be merged?

Thanks for the feedback!

Which are the "huge optimization changes" you're wondering about?
Indeed there's a large series of commits after the version you're
using but I think you're possibly misattributing them.

A number of the commits are bug fixes -- mostly not for hard bugs, but
for unwanted and unintended behaviors:

a709dde fix unexpected allocation of 7x144 group in non-power-of-two slot
dda5a88 fix exact size tracking in memalign
915a914 adjust several size classes to fix nested groups of non-power-of-2 size
7acd61e allow in-place realloc when ideal size class is off-by-one
caca917 add support for aligned_alloc alignments 1M and over

There were also quite a few around an idea that didn't go well and was
mostly reverted, but with major improvements to the original behavior:

5bff93c overhaul bounce counter to work with map sizes instead of size classes
71262cd tune bounce counter to avoid triggering early
9601aaa prevent overflow of unmap counter part of bounce counter
aca1f32 don't let the mmap cache limit grow unboundedly or overflow
6fbee31 second partial overhaul of bounce counter system
150de6e revert from map cache to old okay_to_free scheme, but improved
1e972da initial conversion of bounce counting to use sequence numbers, decay
e3eecb1 factor bounce/sequence counter logic into meta.h
6693738 account seq for individually-mmapped allocations above hard threshold
4443f64 fix complete regression (malloc always fails) on variable-pagesize archs

If you don't care about low usage, that whole change series is fairly
unimportant, but should be harmless. It just changes decisions about
choices where either choice produces a valid state for the allocator
but there are tradeoffs between memory usage and performance. The new
behavior should be better, though.

A few commits were reordering the dependency between memalign and the
standard memalign-variant functions, which is a minor namespace
detail:

da4c88e rename aligned_alloc.c
04407f7 reverse dependency order of memalign and aligned_alloc
74e6657 rename aligned_alloc source file back to its proper name
c990cb1 rename memalign source file back to its proper name

A couple were hardening:

5bf4e92 clear group header pointer to meta when freeing groups
bd04c75 in get_meta, check offset against maplen (minor hardening)
77cea57 add support for allocating meta areas via legacy brk

And pretty much all the rest of the changes are tuning behavior for
"optimization" of some sort or another, which may be what you were
referring to:

26143c4 limit slot count growth to 25% instead of 50% of current usage in class
a9187f0 remove unnecessary optimization tuning flags from Makefile CFLAGS
045cc04 move coarse size classing logic to malloc fast path
8348a82 eliminate med_twos_tab
e619034 allow slot count 1 for size classes 3 mod 4 as natural/first-class
c9d54f4 activate coarse size classing for small classes down to 4 (but not 6)
44092d8 improve individual-mmap decision
d355eaf remove slot count reduction to 1 for size classes 1 mod 3
c555ebe fix off-by-one in logic to use single-slot groups
9d5ec34 switch from MADV_DONTNEED to MADV_FREE for large free slots
584c7aa avoid over-use of reduced-count groups due to coarse size classing
f9bfb0a increase threshold for 3->2 slot reduction to 16 pages
20da09e disable coarse size classing for large classes (over 8k)

I don't think any of these changes are potentially obsoleted by
further ideas in the above thread. I am working on delaying activation
of slots until they're actually needed, so that we don't dirty pages
we could avoid touching, but I proposed this as an alternative to
other more complex tricks that I didn't really like, which have not
been implemented and probably won't be now.
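
The delayed-activation idea can be illustrated with a minimal sketch. This
is not mallocng code: the struct and function names (lazy_group,
lazy_group_init, lazy_group_activate) are hypothetical, group headers and
free-slot tracking are ignored, and it assumes a system with mmap and
mprotect (so not nommu). It reserves address space for every slot with
PROT_NONE and makes pages writable only when one more slot is activated.
On Linux, inaccessible private anonymous mappings are generally not
counted toward commit charge, which is the point of the explicit
unprotect step.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical illustration only; none of these names come from mallocng. */
struct lazy_group {
    unsigned char *base; /* start of the reserved region */
    size_t slot_size;    /* bytes per slot */
    size_t slot_count;   /* number of slots reserved (fixed) */
    size_t active;       /* slots activated so far */
    size_t committed;    /* bytes made read/write so far */
    size_t reserved;     /* total bytes of address space reserved */
};

static size_t round_up(size_t n, size_t align)
{
    return (n + align - 1) / align * align;
}

/* Reserve address space for all slots, leaving every page inaccessible.
 * Returns 0 on success, -1 on failure. */
int lazy_group_init(struct lazy_group *g, size_t slot_size, size_t slot_count)
{
    assert(slot_size > 0 && slot_count > 0);
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t len = round_up(slot_size * slot_count, page);
    void *p = mmap(NULL, len, PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;
    g->base = p;
    g->slot_size = slot_size;
    g->slot_count = slot_count;
    g->active = 0;
    g->committed = 0;
    g->reserved = len;
    return 0;
}

/* Activate one more slot, unprotecting only the pages needed to cover it.
 * Returns the new slot, or NULL if all slots are active or mprotect fails. */
void *lazy_group_activate(struct lazy_group *g)
{
    if (g->active == g->slot_count)
        return NULL;
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t need = round_up((g->active + 1) * g->slot_size, page);
    if (need > g->committed) {
        if (mprotect(g->base + g->committed, need - g->committed,
                     PROT_READ | PROT_WRITE) != 0)
            return NULL;
        g->committed = need;
    }
    void *slot = g->base + g->active * g->slot_size;
    g->active++;
    return slot;
}
```

With 4096-byte pages and a 7x4672 group, the first activation in this
sketch unprotects 2 pages and the second unprotects 1 more, matching the
description quoted above. A real allocator would also have to hand out
freed slots before activating new ones.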

So, in summary, I don't see any good reason not to go with latest.

Rich