Message-ID: <b495f8b2-c7fe-37d4-5611-0cc8b7664e0b@davidgf.es>
Date: Sun, 26 Nov 2017 02:40:24 +0100
From: David Guillen Fandos <david@...idgf.es>
To: musl@...ts.openwall.com
Subject: Re: Do not use 64 bit division if possible

On 26/11/17 02:23, Rich Felker wrote:
> On Sun, Nov 26, 2017 at 02:12:58AM +0100, David Guillen Fandos wrote:
>>
>> On 26/11/17 01:59, Rich Felker wrote:
>>> On Sun, Nov 26, 2017 at 01:49:09AM +0100, David Guillen Fandos wrote:
>>>> Hey,
>>>>
>>>> Wow that's an awesome optimization (the a&-a), didn't know gcc was
>>>> smart enough to figure that out by itself :D
>>>
>>> It doesn't seem to be doing any optimizing for me. What it *should* do
>>> is optimize the div to ctz+shift.
>>>
>>> BTW please don't top-reply; it makes threads hard to follow and hard
>>> to meaningfully reply to inline.
>>>
>>> Rich
>>>
>>>
>>>> I just realized that PAGE_SIZE seems indeed to be defined to a
>>>> constant for some architectures, did not notice since I was running
>>>> on MIPS which has a page size different for each uarch.
>>>>
>>>> I'd say the (a&-a) is a very simple optimization and we should use
>>>> it, since it adds almost no complexity and saves some cycles and
>>>> some .text bytes, which is sometimes a bit tight.
>>>>
>>>> Something like this? Doesn't hurt constants, improves some arches :)
>>>>
>>>> diff --git a/src/conf/sysconf.c b/src/conf/sysconf.c
>>>> index b8b761d0..aa9fc9d1 100644
>>>> --- a/src/conf/sysconf.c
>>>> +++ b/src/conf/sysconf.c
>>>> @@ -206,7 +206,7 @@ long sysconf(int name)
>>>> 		if (name==_SC_PHYS_PAGES) mem = si.totalram;
>>>> 		else mem = si.freeram + si.bufferram;
>>>> 		mem *= si.mem_unit;
>>>> -		mem /= PAGE_SIZE;
>>>> +		mem /= (unsigned)(PAGE_SIZE & -PAGE_SIZE);
>>>> 		return (mem > LONG_MAX) ? LONG_MAX : mem;
>>>> 		case JT_ZERO & 255:
>>>> 		return 0;
>>
>> Sorry for that, default settings you know :)
>>
>> Well the main reason is because on MIPS it requires adding __divdi3
>> which is around 1KB of code, which hey, it's not much, but why would
>> we need it right? It makes a difference in embedded tools with
>> statically linked musl.
>>
>> Thanks for your interest!
> 
> If this is a real problem you're hitting, I'm interested in helping,
> but it seems unlikely. If your program uses printf or other common
> functions it will already be pulling in __divdi3 I think.
> 
> Rich
> 

Not a real problem really, beyond binary size. I'm not using printf,
which is why I was chasing all the big-ish functions that seemed
unnecessary in my binary, and I was curious about why sysconf actually
needed a 64-bit division on MIPS.

Also the (a&-a) doesn't seem to help, gcc is not that smart :) It seems
there's no easy way to hint to it that a number is a power of two, I
guess that's why the kernel uses PAGE_SHIFT.
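
For illustration, the "ctz+shift" form Rich mentioned can be written by
hand instead of waiting for the compiler to derive it. This is only a
minimal sketch: bytes_to_pages is a hypothetical helper, not anything in
musl, and it assumes the GCC/Clang builtin __builtin_ctzl is available.
Whether the variable 64-bit shift is inlined or still calls a (much
smaller) libgcc helper depends on the target and compiler.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not part of musl): convert a 64-bit byte count
 * into a page count, for a page size only known at run time. */
static uint64_t bytes_to_pages(uint64_t bytes, unsigned long page_size)
{
	/* Precondition the compiler cannot infer by itself:
	 * page_size is a nonzero power of two. */
	assert(page_size != 0 && (page_size & (page_size - 1)) == 0);

	/* For a power of two, the number of trailing zero bits is
	 * log2(page_size), so the division becomes a right shift. */
	return bytes >> __builtin_ctzl(page_size);
}
```

With a 4096-byte page, bytes_to_pages(8192, 4096) gives 2, the same as
8192 / 4096, but without a 64-bit division.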

Given that the page size is not a constant on some arches, it might be
useful to have a page shift, since some operations would maybe be
faster? Like page aligning [ & ~(PAGE_SIZE-1) ]? Not sure if we care
that much, even though we use that in malloc.
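
To make the comparison concrete, here is a rough sketch of page aligning
in both forms. The function names and the page_shift parameter are
hypothetical (musl does not export a page shift). Note that the mask
form is only a subtract, a NOT and an AND even with a run-time page
size, so a shift would be expected to pay off for divisions rather than
for aligning.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers, for illustration only. */

/* Round addr down to a page boundary using the size as a mask. */
static uintptr_t page_align_down_mask(uintptr_t addr, uintptr_t page_size)
{
	/* Only valid when page_size is a nonzero power of two. */
	assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
	return addr & ~(page_size - 1);
}

/* Same result, starting from log2 of the page size instead. */
static uintptr_t page_align_down_shift(uintptr_t addr, unsigned page_shift)
{
	return (addr >> page_shift) << page_shift;
}

/* Round addr up to the next page boundary (mask form). */
static uintptr_t page_align_up_mask(uintptr_t addr, uintptr_t page_size)
{
	assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
	return (addr + page_size - 1) & ~(page_size - 1);
}
```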

Thanks for the interest!
David



