musl - Re: Query regarding malloc if statement

Message-ID: <20170620041429.zjmzwpeyycwwpcvr@voyager>
Date: Tue, 20 Jun 2017 06:14:29 +0200
From: Markus Wichmann <nullplan@....net>
To: musl@...ts.openwall.com
Subject: Re: Query regarding malloc if statement

On Mon, Jun 19, 2017 at 09:02:00PM +0000, Jamie Mccrae wrote:
> My understanding is that doing a read followed by a possible write is slower than always doing a write for the reason that upon doing a read the process will halt
> until the memory is brought into the CPU's cache which isn't a problem when just doing a write. I've just thrown together a simple application to test this (testing on a modern PC running alpine linux 64-bit in a virtualbox VM with 512MB RAM and 1 CPU core) with a normal musl library and a modified one whereby I've removed the 'if' check:
> 

Woah, you're mixing up a few things here. A cache miss and a page fault
are two very different things.

Besides, doesn't a cache miss on write mean that a cache-line for the
write area has to be allocated first?

> #include <time.h>
> #include <stdlib.h>
> #include <stdio.h>
> #include <stdint.h>
> 
> void TimedFunc()
> {
>     uint32_t loops = 64;
>     uint32_t *ptr;
>     while (loops > 0)
>     {
>         ptr = calloc(64, 2);
>         free(ptr);
>         --loops;
>     }
> }
> 
> void main()
> {
>     clock_t stime, etime;
>     stime = clock();
> 
>     uint32_t runs = 0;
>     while (runs < 16384)
>     {
>         TimedFunc();
>         ++runs;
>     }
> 
>     etime = clock();
>     printf("%d loops in %d ms\r\n", runs, ((etime - stime) * 1000 / CLOCKS_PER_SEC));
> }
> 

Hmm... looks about right (except for "void main", but let's not be
pedantic here). But, as I said, the whole thing only works if brk() is
disabled. If you don't want to recompile your kernel, you can use a
seccomp filter to disallow that system call. This forces musl to fall
back to allocating heap with mmap().
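
A minimal sketch of such a filter follows, assuming a Linux kernel with
seccomp-bpf support. The function names deny_brk and brk_is_denied are
made up for this example, and the filter skips the architecture check a
production filter would want.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stddef.h>
#include <unistd.h>
#include <linux/filter.h>
#include <linux/seccomp.h>
#include <sys/prctl.h>
#include <sys/syscall.h>

/* Hypothetical helper: install a seccomp filter that makes every brk()
 * call fail with ENOMEM and lets all other system calls through.
 * Returns 0 on success, -1 on failure (errno is set by prctl). */
int deny_brk(void)
{
    struct sock_filter filter[] = {
        /* Load the system call number into the accumulator. */
        BPF_STMT(BPF_LD | BPF_W | BPF_ABS,
                 offsetof(struct seccomp_data, nr)),
        /* brk: fall through to the ERRNO return. Anything else: skip it. */
        BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, SYS_brk, 0, 1),
        BPF_STMT(BPF_RET | BPF_K,
                 SECCOMP_RET_ERRNO | (ENOMEM & SECCOMP_RET_DATA)),
        BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW),
    };
    struct sock_fprog prog = {
        .len = sizeof filter / sizeof filter[0],
        .filter = filter,
    };

    /* Needed so that an unprivileged process may install a filter. */
    if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0))
        return -1;
    return prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &prog);
}

/* Hypothetical helper: returns 1 if a raw brk() now fails with ENOMEM. */
int brk_is_denied(void)
{
    errno = 0;
    return syscall(SYS_brk, 0) == -1 && errno == ENOMEM;
}
```

Calling deny_brk() at the top of main(), before the first allocation,
should be enough for the test program. The filter cannot be removed
again for the lifetime of the process.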

Also, you are allocating 128 bytes, which is too small to trigger the
effect. Try 100kB (if my maths did not fail me, for a 32-bit platform
the mmap threshold is at 112kB, and for a 64-bit platform it is twice
that, so 100kB is well below that).
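
The arithmetic behind those numbers can be sketched as below. This
assumes the threshold is defined as 0x1c00 * SIZE_ALIGN with SIZE_ALIGN
being 4 * sizeof(size_t), which is how I read the malloc source of this
era; check your own tree before relying on it. The function name is
made up for the example.

```c
#include <stddef.h>

/* Hypothetical helper: compute the assumed mmap threshold for a platform
 * whose size_t is size_t_bytes wide.
 * Assumption: SIZE_ALIGN = 4 * sizeof(size_t) and
 *             MMAP_THRESHOLD = 0x1c00 * SIZE_ALIGN. */
size_t mmap_threshold_for(size_t size_t_bytes)
{
    size_t size_align = 4 * size_t_bytes;
    return 0x1c00 * size_align;
}
```

Under that assumption, mmap_threshold_for(4) is 114688 bytes (112kB)
and mmap_threshold_for(8) is 229376 bytes (224kB), so a 100kB request
stays below the threshold on both.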
> 
> Results are 74-148ms for the normal library and 70-72ms when the if statement is removed (about twice as fast). I've also got an original raspberry pi with a single CPU and have alpine linux on that, so I've performed the same test using 32 loops, calloc(32, 2) and 8192 loops instead and see a similar result, although it's much closer: 411-412ms for the normal library and 405-408ms when the if statement is removed.

Interesting. So it appears to not be beneficial, time-wise, for small
allocations.
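
For reference, the two zeroing strategies being compared can be
sketched as follows. This only illustrates the idea and is not the
actual calloc code; both function names are made up for the example.

```c
#include <stddef.h>

/* Unconditional variant: every word is written, so every page the
 * buffer touches ends up dirty. */
void clear_always(size_t *p, size_t nwords)
{
    for (size_t i = 0; i < nwords; i++)
        p[i] = 0;
}

/* Conditional variant: a word is only written if it is not already
 * zero. Returns the number of words that actually had to be written. */
size_t clear_if_nonzero(size_t *p, size_t nwords)
{
    size_t writes = 0;
    for (size_t i = 0; i < nwords; i++) {
        if (p[i]) {
            p[i] = 0;
            writes++;
        }
    }
    return writes;
}
```

On memory that is already zero, the conditional variant performs no
writes at all, at the price of one read and one branch per word.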

> Surely a page fault will occur when attempting to read memory not writing it, it doesn't need to bring the page into the cache if no read is taking place therefore a page fault will not occur?

No, not really. See, if Linux is doing the right thing, then it will
always have a zero page handy. If an application requests memory via
mmap() with anonymous pages, what Linux should do is record in the
CPU-facing bits of the page tables that the pages exist, all point to
the zero page, and are read-only. In the OS-facing bits, it needs to
record that those pages are copy-on-write, of course. Then a read of
those pages will return bytes from the zero page (so always zero), and
a write will cause a page fault.

Linux will of course handle that page fault by allocating a fresh
physical page, copying the zero page there, rewriting the page tables,
and invalidating the page table cache (the TLB) before continuing the
program.

Of course, I don't know if Linux really does that. It might just answer
a request for memory with completely inaccessible pages that cause a
fault as soon as they are accessed in any way. The interface would be
fulfilled either way.
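
One way to see what a given kernel does is to count minor page faults
around a read pass and a write pass over fresh anonymous memory. The
following is a rough sketch, assuming getrusage() reports minor faults
in ru_minflt; the struct and function names are made up for the
example.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>
#include <sys/resource.h>
#include <unistd.h>

/* Hypothetical result record for the probe below. */
struct probe_result {
    int all_zero;       /* 1 if every byte read back as zero */
    long read_faults;   /* minor faults while reading each page once */
    long write_faults;  /* minor faults while writing each page once */
};

/* Minor page faults of this process so far. */
static long minor_faults(void)
{
    struct rusage ru;
    getrusage(RUSAGE_SELF, &ru);
    return ru.ru_minflt;
}

/* Map npages anonymous pages, read one byte per page, then write one
 * byte per page, counting minor faults for each pass.
 * Returns 0 on success, -1 if mmap() fails. */
int probe_anonymous_pages(size_t npages, struct probe_result *out)
{
    size_t pagesz = (size_t)sysconf(_SC_PAGESIZE);
    size_t len = npages * pagesz;
    /* volatile keeps the compiler from dropping the accesses */
    volatile unsigned char *p = mmap(0, len, PROT_READ | PROT_WRITE,
                                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    long before = minor_faults();
    out->all_zero = 1;
    for (size_t i = 0; i < len; i += pagesz)
        if (p[i] != 0)
            out->all_zero = 0;
    long after_read = minor_faults();

    for (size_t i = 0; i < len; i += pagesz)
        p[i] = 1;
    long after_write = minor_faults();

    out->read_faults = after_read - before;
    out->write_faults = after_write - after_read;
    munmap((void *)p, len);
    return 0;
}
```

If reads are served from a shared zero page, the write pass should
still fault about once per page. The exact counts depend on the kernel
and its configuration, so treat them as a hint rather than a proof.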

Oh, and the CPU cache doesn't have anything to do with this. The page
fault mechanism is so slow that a cache miss or two make no odds here.

Ciao,
Markus