Message-ID: <20230124083747.GI4163@brightrain.aerifal.cx>
Date: Tue, 24 Jan 2023 03:37:48 -0500
From: Rich Felker <dalias@...c.org>
To: Dominique MARTINET <dominique.martinet@...ark-techno.com>
Cc: musl@...ts.openwall.com
Subject: Re: infinite loop in mallocng's try_avail

On Mon, Jan 23, 2023 at 10:33:39AM +0900, Dominique MARTINET wrote:
> Hi,
> 
> A coworker of mine has been observing hangs during shutdown on our ARM
> alpine/openrc board: once in a blue moon, the openrc shutdown process
> gets stuck in userspace after having stopped almost everything
> (running reboot in a loop, this is reproducible roughly once every
> three days)
> 
> It turns out it's stuck in malloc() as follows:
> (gdb) bt
> #0  try_avail (pm=0xb6f692e8 <__malloc_context+48>) at src/malloc/mallocng/malloc.c:156
> #1  alloc_slot (sc=sc@...ry=0, req=req@...ry=8) at src/malloc/mallocng/malloc.c:288
> #2  0xb6efdc94 in __libc_malloc_impl (n=8) at src/malloc/mallocng/malloc.c:369
> #3  0xb6e9d532 in xmalloc (size=size@...ry=8) at ../src/shared/helpers.h:64
> #4  0xb6e9d598 in rc_stringlist_new () at ../src/librc/librc-stringlist.c:24
> #5  0xb6e9b91e in get_provided (depinfo=depinfo@...ry=0xb6f585d0, runlevel=runlevel@...ry=0xbe940f99 "shutdown", options=options@...ry=3)
>     at ../src/librc/librc-depend.c:269
> #6  0xb6e9bce8 in visit_service (options=3, runlevel=0xbe940f99 "shutdown", depinfo=0xb6f5ee30, visited=0xb6f57b00, sorted=0xb6f57af0, types=0xb6f575a0, 
>     deptree=0xb6ed8ec0) at ../src/librc/librc-depend.c:419
> #7  visit_service (deptree=0xb6ed8ec0, types=0xb6f575a0, sorted=0xb6f57af0, visited=0xb6f57b00, depinfo=0xb6f5ee30, runlevel=0xbe940f99 "shutdown", options=3)
>     at ../src/librc/librc-depend.c:357
> #8  0xb6e9bcb2 in visit_service (options=3, runlevel=0xbe940f99 "shutdown", depinfo=0xb6f62f50, visited=0xb6f57b00, sorted=0xb6f57af0, types=0xb6f575a0, 
>     deptree=0xb6ed8ec0) at ../src/librc/librc-depend.c:405
> #9  visit_service (deptree=0xb6ed8ec0, types=0xb6f575a0, sorted=0xb6f57af0, visited=0xb6f57b00, depinfo=0xb6f62f50, runlevel=0xbe940f99 "shutdown", options=3)
>     at ../src/librc/librc-depend.c:357
> #10 0xb6e9bcb2 in visit_service (options=3, runlevel=0xbe940f99 "shutdown", depinfo=0xb6ed85a0, visited=0xb6f57b00, sorted=0xb6f57af0, types=0xb6f575a0, 
>     deptree=0xb6ed8ec0) at ../src/librc/librc-depend.c:405
> #11 visit_service (deptree=0xb6ed8ec0, types=0xb6f575a0, sorted=0xb6f57af0, visited=0xb6f57b00, depinfo=0xb6ed85a0, runlevel=0xbe940f99 "shutdown", options=3)
>     at ../src/librc/librc-depend.c:357
> #12 0xb6e9bffa in rc_deptree_depends (deptree=deptree@...ry=0xb6ed8ec0, types=types@...ry=0xb6f575a0, services=services@...ry=0xb6f57ab0, 
>     runlevel=0xbe940f99 "shutdown", options=options@...ry=3) at ../src/librc/librc-depend.c:482
> #13 0x004226ea in do_stop_services (going_down=true, parallel=true, newlevel=0xbe940f99 "shutdown", deptree=0xb6ed8ec0, stop_services=<optimized out>, 
>     start_services=0xb6f57d70, types_nw=0xb6f575a0) at ../src/openrc/rc.c:592
> #14 main (argc=<optimized out>, argv=<optimized out>) at ../src/openrc/rc.c:1022
> 
> (this is musl 1.2.4 with a couple of patches, none related to malloc:
> https://gitlab.alpinelinux.org/alpine/aports/-/tree/3.17-stable/main/musl
> )
> 
> For convenience, I've copied the offending loop here:
>         int cnt = m->mem->active_idx + 2;
>         int size = size_classes[m->sizeclass]*UNIT;
>         int span = UNIT + size*cnt;
>         // activate up to next 4k boundary
>         while ((span^(span+size-1)) < 4096) {
>                 cnt++;
>                 span += size;
>         }

This code should not be reachable for size class 0 or any size class
allocated inside a larger-size-class slot. That case has active_idx =
cnt-1 (set at line 272).

If this code is being reached, either the allocator state has been
corrupted by some UB in the application, or there's a logic bug in
mallocng. The sequence of events that seems to have to happen to get
there is:

1. Previously active group has no more available slots (line 120).

2. Freed mask of the newly activating group (line 131 or 138) is either
   zero (line 145), or the active_idx (read from in-band memory that is
   susceptible to application buffer overflows etc.) is wrong and
   produces zero when its bits are ANDed with the freed mask (line
   145). (Illustrated with a toy sketch below.)
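
Purely as an illustration of step 2 (this is not the musl source; the
exact mask expression below is my own assumption): the check boils down
to intersecting the group's freed mask with the bits for slots
0..active_idx, and a clobbered in-band active_idx can make that
intersection come out zero even though freed slots exist:

#include <stdint.h>
#include <stdio.h>

int main(void)
{
	uint32_t freed_mask = 1u << 5;  /* pretend slot 5 was freed */
	unsigned char sane_idx = 29;    /* plausible in-band active_idx */
	unsigned char clobbered = 2;    /* hypothetical overflow damage */

	/* bits for slots 0..idx */
	uint32_t sane = freed_mask & ((2u << sane_idx) - 1);
	uint32_t bad  = freed_mask & ((2u << clobbered) - 1);

	/* prints sane: 0x20  clobbered: 0 */
	printf("sane: %#x  clobbered: %#x\n", (unsigned)sane, (unsigned)bad);
	return 0;
}

Either way, per the sequence above, that is how execution would end up
in the quoted cnt/span loop.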

> Here, we have UNIT=16 (constant), we're considering a malloc(8) so
> sizeclass is 0 -> size_classes[0] = 1 -> size = 16, so we're looping
> on this condition:
>   while ((span ^ (span + 15)) < 4096)
> with span being a multiple of 16.
> 
> I'm not good at maths, but I'm fairly confident this never gets false:
> 0x0ff0 ^ 0x0fff = f
> 0x1000 ^ 0x100f = f
> The ending condition could be reached without the -1 (i.e. span+16);
> was that what was intended?
> (It works for other sizes because of the UNIT offset in span, so
> e.g. for size=32 we have 0x0ff0 ^ 0x100f which is 8191, alright; but
> that'd also work without the -1 as far as I can tell)
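
The arithmetic here can be checked in isolation. A standalone sketch
(not musl code) that just replays the span computation from the quoted
loop, with and without the -1, assuming UNIT=16:

#include <stdio.h>

int main(void)
{
	for (int size = 16; size <= 64; size += 16) {
		/* span = UNIT + size*cnt as in the quoted loop; the exact
		 * starting cnt doesn't matter for the termination question */
		int span1 = 16 + size*2, span2 = span1;
		int s1 = 0, s2 = 0;

		/* original condition, capped so the demo always returns */
		while ((span1^(span1+size-1)) < 4096 && s1 < 1000) {
			span1 += size; s1++;
		}
		/* same loop without the -1 */
		while ((span2^(span2+size)) < 4096 && s2 < 1000) {
			span2 += size; s2++;
		}
		printf("size=%2d: with -1: %s; without -1: exits after %d steps\n",
		       size, s1 < 1000 ? "exits" : "hits cap (never exits)", s2);
	}
	return 0;
}

With the -1, size=16 is the only case that hits the cap: span stays
16-aligned, so span^(span+15) is always 15. Without the -1, every size
exits once span+size crosses into the next 4k block.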
> 
> 
> Anyway, I'm not ruling out a problem in openrc yet -- rc_parallel is
> known to be buggy -- but looking at the mallocng ctx I can't see any
> obvious problem so I'm asking here first:
> Is that supposed to work? Now that I've narrowed it down this far, it
> might be possible to build a reproducer; unfortunately I do not know
> what state ctx was in when it entered try_avail (the function entered
> the if (!mask) branch, so the first entry was dequeued), but here is
> its current state (please ask if you need anything else):
> ------
> (gdb) p __malloc_context            
> $94 = {
>   secret = 15756413639004407235,
>   init_done = 1,
>   mmap_counter = 135,
>   free_meta_head = 0x0,
>   avail_meta = 0x18a3f70,
>   avail_meta_count = 6,
>   avail_meta_area_count = 0,
>   meta_alloc_shift = 0,
>   meta_area_head = 0x18a3000,
>   meta_area_tail = 0x18a3000,
>   avail_meta_areas = 0x18a4000 <error: Cannot access memory at address 0x18a4000>,
>   active = {0x18a3e98, 0x18a3eb0, 0x18a3208, 0x18a3280, 0x0, 0x0, 0x0, 0x18a31c0, 0x0, 0x0, 0x0, 0x18a3148, 0x0, 0x0, 0x0, 0x18a3dd8, 0x0, 0x0, 0x0, 0x18a3d90, 0x0, 
>     0x18a31f0, 0x0, 0x18a3b68, 0x0, 0x18a3f28, 0x0, 0x0, 0x0, 0x18a3238, 0x0 <repeats 18 times>},
>   usage_by_class = {2580, 600, 10, 7, 0 <repeats 11 times>, 96, 0, 0, 0, 20, 0, 3, 0, 8, 0, 3, 0, 0, 0, 3, 0 <repeats 18 times>},
>   unmap_seq = '\000' <repeats 31 times>,
>   bounces = '\000' <repeats 18 times>, "w", '\000' <repeats 12 times>,
>   seq = 1 '\001',
>   brk = 25837568
> }
> (gdb) p *__malloc_context->active[0]
> $95 = {
>   prev = 0x18a3f40,
>   next = 0x18a3e80,
>   mem = 0xb6f57b30,
>   avail_mask = 1073741822,
>   freed_mask = 0,
>   last_idx = 29,
>   freeable = 1,
>   sizeclass = 0,
>   maplen = 0
> }
> (gdb) p *__malloc_context->active[0]->mem
> $97 = {
>   meta = 0x18a3e98,
>   active_idx = 29 '\035',
>   pad = "\000\000\000\000\000\000\000\000\377\000",
>   storage = 0xb6f57b40 ""
> }

This is really weird, because at the point of the infinite loop, the
new group should not yet be activated (line 163), so
__malloc_context->active[0] should still point to the old active
group. But its avail_mask has all but one bit set (0x3ffffffe) and its
active_idx is not corrupted, so try_avail should just have obtained an
available slot from it without ever entering the block at line 120. So
I'm confused how it got to the loop.
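
For reference, decoding the dumped masks takes nothing but plain
arithmetic (assuming only that last_idx=29 means the group has slots
0..29):

#include <stdio.h>

int main(void)
{
	unsigned avail_mask = 1073741822;   /* from the $95 dump */
	unsigned full = (2u << 29) - 1;     /* all 30 slots set */

	printf("avail = %#x, full = %#x\n", avail_mask, full);
	printf("slot 0 available: %s\n", (avail_mask & 1) ? "yes" : "no");
	return 0;
}

That's 0x3ffffffe vs 0x3fffffff: every slot but slot 0 is marked
available, so mask is nonzero and the !mask path at line 120 should
never have been entered.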

One odd thing I noticed is that the backtrace pm=0xb6f692e8 does not
match the __malloc_context->active[0] address. Were these from
different runs?

> ------
> 
> 
> I'm leaving that process stuck in limbo so I can get more info for a
> few days.
> 
> Meanwhile we'll also be trying to see if the following patch helps --
> given we do not have a good reproducer, there won't be any feedback
> from our side for a while.
> ----
> From f46bbe3117674becb3c5e4ffb7c8f4e4776084db Mon Sep 17 00:00:00 2001
> From: Dominique Martinet <dominique.martinet@...ark-techno.com>
> Date: Mon, 23 Jan 2023 10:12:55 +0900
> Subject: [PATCH] mallocng: try_avail: fix infinite loop
> 
> For sizeclass=0, span is a multiple of size = 16, so
> "(span^(span+size-1)) < 4096" never becomes false.
> Removing the -1 makes that ending condition reachable for all
> size classes.
> ---
>  src/malloc/mallocng/malloc.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/src/malloc/mallocng/malloc.c b/src/malloc/mallocng/malloc.c
> index d695ab8ec982..9ef0b176c31e 100644
> --- a/src/malloc/mallocng/malloc.c
> +++ b/src/malloc/mallocng/malloc.c
> @@ -151,7 +151,7 @@ static uint32_t try_avail(struct meta **pm)
>  				int size = size_classes[m->sizeclass]*UNIT;
>  				int span = UNIT + size*cnt;
>  				// activate up to next 4k boundary
> -				while ((span^(span+size-1)) < 4096) {
> +				while ((span^(span+size)) < 4096) {
>  					cnt++;
>  					span += size;
>  				}
> ----
> 
> Thank you,
> -- 
> Dominique Martinet
