|
Message-ID: <20180412040408.GP3094@brightrain.aerifal.cx> Date: Thu, 12 Apr 2018 00:04:08 -0400 From: Rich Felker <dalias@...c.org> To: musl@...ts.openwall.com Subject: Re: [RFC PATCH] reduce severity of ldso reclaim_gaps hack On Thu, Apr 12, 2018 at 12:19:24AM +0300, Alexander Monakov wrote: > > > @@ -410,10 +412,9 @@ void *realloc(void *p, size_t n) > > > size_t newlen = n + extra; > > > /* Crash on realloc of freed chunk */ > > > if (extra & 1) a_crash(); > > > - if (newlen < PAGE_SIZE && (new = malloc(n))) { > > > - memcpy(new, p, n-OVERHEAD); > > > - free(p); > > > - return new; > > > + if (newlen < PAGE_SIZE && (new = malloc(n-OVERHEAD))) { > > > + n0 = n; > > > + goto copy_free_ret; > > > } > > > newlen = (newlen + PAGE_SIZE-1) & -PAGE_SIZE; > > > if (oldlen == newlen) return p; > > > @@ -456,34 +457,20 @@ copy_realloc: > > > /* As a last resort, allocate a new chunk and copy to it. */ > > > new = malloc(n-OVERHEAD); > > > if (!new) return 0; > > > +copy_free_ret: > > > memcpy(new, p, n0-OVERHEAD); > > > free(CHUNK_TO_MEM(self)); > > > return new; > > > } > > > > This looks like an independent change that fixes a separate > > slight-overallocation bug. Is it related? > > No, it would be nicer to commit it separately. OK. > > > +#if defined(__GNUC__) > > > +__attribute__((cold)) > > > +#endif > > > > This can't be used as-is. It would need a configure check (gcc version > > dependent) and __cold__ if we want to, but unless there's a strong > > reason to include it I'd just omit it. > > malloc.c is compiled with -O3, causing 'free' to be inlined in all local > callers. This combats code growth and should also slightly improve code > layout in the new 'free' (not that it would help much, of course). I'd like to actually do away with the -O3, so maybe now is a good time to look at that, at least for malloc. I think the main reason it was used is already covered by the always_inline, which was to avoid GOT register reloads at each call frame level on i386 and similar archs. I suspect it might turn out better to just use -O2 instead of -Os globally (possibly with -falign-*=1) than doing the per-dir/glob stuff we're doing now, especially on modern gcc versions, but -O3 might still be preferable for string functions if we want to get vectorization there. Of course I don't want to hold your patch up on this issue, but I think we can just omit the cold with a plan to investigate -O level and inlining issues soon. > > > +static void unmap_chunk(struct chunk *self) > > > +{ > > > + size_t extra = self->psize; > > > + char *base = (char *)self - extra; > > > + size_t len = CHUNK_SIZE(self) + extra; > > > + /* Crash on double free */ > > > + if (extra & 1) a_crash(); > > > + __munmap(base, len); > > > +} > > > + > > > +void free(void *p) > > > +{ > > > + if (!p) return; > > > + > > > + struct chunk *self = MEM_TO_CHUNK(p); > > > + > > > + if (IS_MMAPPED(self)) > > > + unmap_chunk(self); > > > + else > > > + bin_chunk(self); > > > +} > > > + > > > +void __malloc_donate(char *start, char *end) > > > +{ > > > > It's confusing to see that this is equivalent to what's being removed > > from dynlink.c, but I think it may be correct. > > > > > + if (end - start < 2*OVERHEAD + SIZE_ALIGN) return; > > > > This does not seem like a sufficient bound to ensure the block is > > usable, but the next check after alignment may cover it. > > Yes, this is to ensure that we don't create invalid pointers when aligning, OK. > > > + start += OVERHEAD + SIZE_ALIGN - 1; > > > + start -= (uintptr_t)start & (SIZE_ALIGN - 1); > > > > This looks correct. > > > > > + end -= (uintptr_t)end & (SIZE_ALIGN - 1); > > > > This does not subtract the OVERHEAD, but I think it's just a > > notational difference; my "end" pointed to the end of the chunk to be > > freed, and your "end" points to the beginning of the next > > (nonexistent) chunk. The MEM_TO_CHUNK below should compensate. > > > > > + if (end - start < OVERHEAD + SIZE_ALIGN) return; > > > > At this point, start and end both point just past a chunk header, > > meaning they have to differ by a multiple of SIZE_ALIGN. I don't see > > why OVERHEAD is needed here too. The check should be against > > SIZE_ALIGN I think (although by alignment they're equivalent). I think I was wrong about them being equivalent, but... > I don't recall if I had a specific reason to spell it like that. ...end-start==SIZE_ALIGN is certainly a valid free chunk size (yielding SIZE_ALIGN-OVERHEAD == OVERHEAD bytes of storage), albeit not terribly helpful. > > > + if (end - start >= MMAP_THRESHOLD) return; > > > > This does not seem necessary. Free chunks in the last bin can be > > larger than MMAP_THRESHOLD; they're just broken up to satisfy > > allocations. Of course it's unlikely to happen anyway. > > Do such oversized chunks appear in normal operation? This seems non-obvious, > so a comment pointing that out would probably be helpful. The only way I could see it happening is on an arch ABI that allows very large pages (and has the ELF load segments aligned accordingly, as x86_64 does). In this case if the kernel/hardware only supported large (e.g. 2MB) pages, you'd pretty much always end up with >1.5MB of reclaimed space per DSO. IMO this is an awful kernel/hardware constraint to have, very wasteful, but it's exactly the situation where you'd most care about the gaps getting reclaimed for something useful. Rich
Powered by blists - more mailing lists
Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.