Message-ID: <063599eb3397cd4d5ab845b3c0b75010@smtp.hushmail.com>
Date: Tue, 25 Aug 2015 18:00:03 +0200
From: magnum <john.magnum@...hmail.com>
To: john-dev@...ts.openwall.com
Subject: Re: LWS and GWS auto-tuning

On 2015-08-25 13:42, Solar Designer wrote:
> The attached patch #if 0's opencl_find_best_workgroup() (perhaps we
> need to drop it completely, and remove from common-opencl.h too), and
> revises and makes use of opencl_find_best_lws().
>
> The new logic is, when neither GWS nor LWS env vars are specified:
> pre-tune GWS (with a lower than usual maximum), tune LWS, and finally
> tune GWS with the tuned LWS and considering the queried number of
> compute units. Obviously, this is far from perfect - we're trying to
> find a maximum of a function of two variables, but are adjusting only
> one at a time. Yet it appears to work much better than the current
> approach of tuning GWS only.

Aye sir. The boost from this is better than I thought. The difference is
sometimes 2x and more! Here's my laptop's top/bottom 10:

Ratio: 0.83601 real, 0.40000 virtual    sha1crypt-opencl, (NetBSD):Raw
Ratio: 0.85271 real, 0.47251 virtual    encfs-opencl, EncFS:Raw
Ratio: 0.86628 real, 0.07211 virtual    wpapsk-opencl, WPA/WPA2 PSK:Raw
Ratio: 0.87242 real, 0.78965 virtual    Raw-SHA512-opencl:Raw
Ratio: 0.87999 real, 0.75624 virtual    Raw-SHA256-opencl:Raw
Ratio: 0.89727 real, 0.23077 virtual    krb5pa-sha1-opencl, Kerberos 5 AS-REQ Pre-Auth etype 17/18:Raw
Ratio: 0.89920 real, 1.10118 virtual    mysql-sha1-opencl, MySQL 4.1+:Raw
Ratio: 0.94276 real, 0.06511 virtual    PBKDF2-HMAC-SHA1-opencl:Raw
Ratio: 0.94495 real, 0.98183 virtual    mscash-opencl, M$ Cache Hash:Only one salt
...
Ratio: 1.21359 real, 1.00000 virtual    md5crypt-opencl, crypt(3) $1$:Raw
Ratio: 1.27538 real, 2.72273 virtual    RAKP-opencl, IPMI 2.0 RAKP (RMCP+):Only one salt
Ratio: 1.28331 real, 0.50000 virtual    PBKDF2-HMAC-SHA512-opencl, GRUB2 / OS X 10.8+, rounds=10000:Raw
Ratio: 1.32295 real, 1.12515 virtual    zip-opencl, ZIP:Raw
Ratio: 1.33187 real, 8.22860 virtual    RAKP-opencl, IPMI 2.0 RAKP (RMCP+):Many salts
Ratio: 2.00808 real, 27.61671 virtual   krb5pa-md5-opencl, Kerberos 5 AS-REQ Pre-Auth etype 23:Many salts
Ratio: 2.35028 real, 168.85266 virtual  oldoffice-opencl, MS Office <= 2003:Many salts
Ratio: 2.51140 real, 31.75871 virtual   oldoffice-opencl, MS Office <= 2003:Only one salt
Ratio: 2.54982 real, 4.06557 virtual    lotus5-opencl, Lotus Notes/Domino 5:Raw
Ratio: 2.55546 real, 23.30411 virtual   krb5pa-md5-opencl, Kerberos 5 AS-REQ Pre-Auth etype 23:Only one salt

I'm pretty sure the worst ones are false/coincidental. I manually
re-tested some of the best ones and it's true: they really auto-tune to
2.5x faster speed than before.

I'm currently testing Tahiti/Titan on super.

On another note, AMD's 15.7 driver seems to have regained the speed of RAR
and other formats (even before the autotune changes). It was way slower
for several versions of the driver (including 14.9, which is our best
tested version ever).

magnum
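
For readers following the thread, the three-step order quoted above (pre-tune GWS with a lower cap, tune LWS, then tune GWS again with the tuned LWS) can be sketched roughly as below. This is only an illustration in Python, not the patch's C code: the function names, the power-of-two search, the 1% gain threshold, the default limits, and the choice to start the final GWS search at one work-group per compute unit are all assumptions, and model_bench() is a made-up speed model standing in for a real kernel benchmark.

```python
import math


def tune_lws(bench, gws, max_lws):
    """Return the power-of-two LWS (dividing gws) with the highest speed."""
    best_lws, best_speed = 1, bench(gws, 1)
    lws = 2
    while lws <= max_lws and gws % lws == 0:
        speed = bench(gws, lws)
        if speed > best_speed:
            best_lws, best_speed = lws, speed
        lws *= 2
    return best_lws


def tune_gws(bench, lws, start_gws, max_gws, min_gain=1.01):
    """Double GWS from start_gws until the speed gain drops below min_gain."""
    best_gws, best_speed = start_gws, bench(start_gws, lws)
    gws = start_gws * 2
    while gws <= max_gws:
        speed = bench(gws, lws)
        if speed < best_speed * min_gain:
            break
        best_gws, best_speed = gws, speed
        gws *= 2
    return best_gws


def autotune(bench, compute_units, initial_lws=64, max_lws=1024,
             pretune_max_gws=1 << 14, max_gws=1 << 20):
    """Adjust one variable at a time: GWS (capped), then LWS, then GWS."""
    # Step 1: pre-tune GWS with a lower than usual maximum.
    # initial_lws is an assumed placeholder for this first pass.
    gws = tune_gws(bench, initial_lws, initial_lws * compute_units,
                   pretune_max_gws)
    # Step 2: tune LWS at the pre-tuned GWS.
    lws = tune_lws(bench, gws, max_lws)
    # Step 3: tune GWS again with the tuned LWS. Starting at one work-group
    # per compute unit is an assumption about "considering" the CU count.
    gws = tune_gws(bench, lws, lws * compute_units, max_gws)
    return lws, gws


def model_bench(gws, lws):
    """Hypothetical speed model (not a real device): best at LWS 128,
    saturating as GWS grows."""
    lws_eff = 1.0 / (1.0 + abs(math.log2(lws) - 7))
    gws_eff = gws / (gws + 4096.0)
    return lws_eff * gws_eff


lws, gws = autotune(model_bench, compute_units=8)
print("tuned LWS =", lws, "GWS =", gws)
```

As the quoted text notes, this kind of one-variable-at-a-time search can miss the true maximum of a function of two variables; the sketch only shows the ordering of the passes.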