Message-ID: <CAJ86T=UpydiEX9hBK-UYnvWREeo2xjnyjDjs3mW+wcuh4dj-mw@mail.gmail.com>
Date: Mon, 18 May 2015 12:35:55 -0700
From: Andre McCurdy <armccurdy@...il.com>
To: musl@...ts.openwall.com
Subject: Re: Eliminating preference for avoiding thread pointer? Cost
 on MIPS?

On Sat, May 16, 2015 at 9:48 AM, Rich Felker <dalias@...c.org> wrote:
> On Sat, May 16, 2015 at 09:33:20AM -0700, Isaac Dunham wrote:
>> On Fri, May 15, 2015 at 11:55:44PM -0400, Rich Felker wrote:
>> > Traditionally, musl has gone to pretty great lengths to avoid
>> > depending on the thread pointer. The original reason was that it was
>> > not always initialized, and when it was, the init was lazy. This
>> > resulted in a lot of cruft, where we would have lots of constructs of
>> > the form:
>> >
>> >     bar = some_predicate ? __pthread_self()->foo : global_foo
>> >
>> > or similar. Being that these predicates depend(ed) on globals, they
>> > were/are rather expensive in position-independent code on most archs.
>> > Now that the thread pointer is always initialized at startup (since
>> > 1.1.0) and assumed to have succeeded (since 1.1.9; musl now performs
>> > HCF if it fails), this seems to be an unnecessary cost. Not only does
>> > it cost cycles; it also has a complexity cost in terms of code to
>> > maintain the state of the predicates (e.g. the atomics for locale
>> > state) and in terms of libc-internal assumptions. So I'd like to just
>> > use the thread pointer directly wherever it makes sense, and take
>> > advantage of the fact that we have it.
>> >
>> > Unfortunately, there's one arch where thread-pointer access may be
>> > prohibitively costly: old MIPS. On the MIPS o32 ABI, the thread
>> > pointer is accessed via the "rdhwr $3,$29" instruction, which was only
>> > introduced in MIPS32rev2. MIPS-I, MIPS-II, and possibly the original
>> > MIPS32 lack it, and while Linux has a "fast path" trap to emulate it,
>> > I'm not clear on how "fast" it is.
>> >
>> > First, I'd like to find out how slow this trap is. If it's something
>> > like 150 cycles, that's ugly but probably acceptable. If it's more
>> > like 1000 cycles, that's a big problem. If anyone can run the attached
>> > test program on real MIPS-I or MIPS-II hardware and give me the
>> > results, please do! Compile it once with -O3 -DDO_RDHWR and once with
>> > just -O3 and send the (one-line) output of both to the list. It
>> > doesn't matter what libc your MIPS system is using -- any should be
>> > fine, but you might need to link with -lrt on glibc or uclibc.
>>
>> dd-wrt micro on a WRT54Gv8.0:
>> \u@\h:\w\$ cat /proc/version
>> Linux version 2.4.37 (root@...wrt) (gcc version 3.4.6 (OpenWrt-2.0)) #13303 Thu Aug 12 04:47:54 CEST 2010

It looks like rdhwr emulation was first added in Linux 2.6.15, so
2.4.37 is probably too old to run this test.
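
As a rough illustration of that version comparison, here is a minimal
sketch in C. The helper name kernel_may_emulate_rdhwr is made up for
this example, and the 2.6.15 threshold is only what the kernel history
appears to show. A new enough version is necessary but not sufficient,
since a vendor kernel could still have the emulation removed.

```c
#include <stdio.h>

/* Hypothetical helper (not part of musl or the kernel): given a release
 * string such as "2.4.37", return 1 if it is at least 2.6.15, 0 if it is
 * older, and -1 if the string cannot be parsed. The 2.6.15 threshold is
 * an assumption taken from the discussion above. */
static int kernel_may_emulate_rdhwr(const char *release)
{
    int major = 0, minor = 0, patch = 0;
    int n = sscanf(release, "%d.%d.%d", &major, &minor, &patch);

    if (n < 2)
        return -1;          /* need at least major.minor */
    if (n == 2)
        patch = 0;          /* treat "X.Y" as "X.Y.0" */

    if (major != 2)
        return major > 2;
    if (minor != 6)
        return minor > 6;
    return patch >= 15;
}
```

The release string could come from uname(2) or from /proc/version, as
in the quoted output.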

>> \u@\h:\w\$ wget http://192.168.2.114:8080/def-bin
>> Connecting to 192.168.2.114:8080 (192.168.2.114:8080)
>> \u@\h:\w\$ echo *
>> def-bin
>> \u@\h:\w\$ chmod +x def-bin
>> \u@\h:\w\$ ./def-bin
>> 0 0.016751000
>> \u@\h:\w\$ wget http://192.168.2.114:8080/rd-bin
>> Connecting to 192.168.2.114:8080 (192.168.2.114:8080)
>> \u@\h:\w\$ chmod +x rd-bin
>> \u@\h:\w\$ ./rd-bin
>> Illegal instruction
>>
>> def-bin is without -DDO_RDHWR, rd-bin is with.
>> Both compiled static with musl 1.1.6 (because that's the latest musl-cross
>> toolchain) and stripped.
>>
>> free reports 448 kb of 5736 kb free. (In other words, there's a reason it's
>> that stripped down.)
>
> Bleh, it looks like they intentionally broke their kernel to save a
> few bytes... I don't think it's possible to support such
> configurations, at least not reasonably.
> Rich
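
For anyone who wants to try this kind of measurement on a newer kernel,
the following is a rough sketch of what such a timing loop might look
like. It is not the test program attached to the original message
(which is not reproduced here); the function names are made up for the
example. With -DDO_RDHWR on MIPS it issues rdhwr $3,$29 through its raw
encoding so that it assembles for any ISA level; otherwise the read is
a no-op, which gives the baseline loop cost.

```c
#define _POSIX_C_SOURCE 199309L
#include <stdint.h>
#include <time.h>

/* Read the raw thread-pointer register. On MIPS with DO_RDHWR defined
 * this executes rdhwr $3,$29 (encoded as 0x7c03e83b), which traps to
 * the kernel emulation on CPUs that lack the instruction. In every
 * other configuration it simply returns 0. */
static inline uintptr_t read_tp(void)
{
#if defined(__mips__) && defined(DO_RDHWR)
    register uintptr_t tp __asm__("$3");
    __asm__ __volatile__(".word 0x7c03e83b" : "=r"(tp));
    return tp;
#else
    return 0;
#endif
}

/* Hypothetical helper: perform `iters` reads and return the elapsed
 * wall-clock time in nanoseconds. */
static long long time_tp_reads(long iters)
{
    struct timespec t0, t1;
    volatile uintptr_t sink = 0;   /* keeps the loop from being optimized away */
    long i;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (i = 0; i < iters; i++)
        sink += read_tp();
    clock_gettime(CLOCK_MONOTONIC, &t1);
    (void)sink;

    return (t1.tv_sec - t0.tv_sec) * 1000000000LL
         + (t1.tv_nsec - t0.tv_nsec);
}
```

Dividing the difference between the two builds by the iteration count
gives an approximate per-access cost in nanoseconds. As noted in the
quoted message, linking may need -lrt on glibc or uclibc.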
