Message-ID: <CANJ2NMPZFGwEWbgqGKY92qBQrD3ft9=+BgkzTMtXoUFhzWy62A@mail.gmail.com>
Date: Sun, 24 Jun 2012 23:02:46 +0800
From: myrice <qqlddg@...il.com>
To: john-dev@...ts.openwall.com
Subject: Re: async key transfers to GPU (was: Weekly report 1)

Solar, Samuele -

On Sun, Jun 24, 2012 at 5:26 AM, Solar Designer <solar@...nwall.com> wrote:
> myrice, Samuele -
>
> While thinking of formats interface enhancements to make this more
> efficient, I realized that full efficiency may already be achieved by
> splitting the set of keys in only two chunks and starting transfer of
> the first chunk when set_key() is called for index ==
> max_keys_per_crypt / 2 - 1. Do it right from that set_key() call.
> Then crypt_all() will start by initiating transfer of the second chunk
> and hashing of the first chunk (which may be already in GPU by that
> time), and then proceed to hash the second chunk (its transfer to GPU
> may complete while the first chunk is being hashed). (You'll need to
> handle the special case when fewer than max_keys_per_crypt or even fewer
> than max_keys_per_crypt / 2 keys are tried per crypt_all() call. Not
> optimize for this case, but just make sure it works properly as well,
> without real async transfers then. This is not difficult.)
>

I think you mean that we do not use multiple streams; we only overlap the
memcpyH2D with CPU code. So in crypt_all(), I will do the following:

1. Copy the second half of the keys to the GPU.
2. Hash the first half of the keys.
3. Hash the second half of the keys.

When 2 finishes, 1 may already be done, and 3 will start. I don't know
whether 1 and 2 overlap within the same stream.

I am doing this. Will let you know the result soon.

Thanks
myrice
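
For illustration, the two-chunk scheme discussed above could be sketched in plain C roughly as follows. This is only a host-side model under stated assumptions, not the actual format code: start_transfer(), hash_range(), reset_state(), the event log, and the tiny MAX_KEYS_PER_CRYPT value are all hypothetical stand-ins, and the "transfer" is an ordinary synchronous copy. Only the names set_key(), crypt_all() and max_keys_per_crypt come from the thread.

```c
#include <assert.h>
#include <string.h>

#define MAX_KEYS_PER_CRYPT 8              /* hypothetical; kept tiny for the sketch */
#define HALF (MAX_KEYS_PER_CRYPT / 2)
#define KEY_LEN 16

static char host_keys[MAX_KEYS_PER_CRYPT][KEY_LEN];
static char dev_keys[MAX_KEYS_PER_CRYPT][KEY_LEN]; /* stands in for GPU memory */
static int first_half_sent;               /* set once chunk 1 has been queued */
static char event_log[256];               /* records the order of operations */

/* Hypothetical stand-in for an asynchronous host-to-device copy.
 * Here it is a plain synchronous copy that also logs a tag. */
static void start_transfer(int first, int count, const char *tag)
{
    for (int i = first; i < first + count; i++)
        memcpy(dev_keys[i], host_keys[i], KEY_LEN);
    strcat(event_log, tag);
}

/* Hypothetical stand-in for launching the hashing kernel on a key range. */
static void hash_range(int first, int count, const char *tag)
{
    (void)first;
    (void)count;
    strcat(event_log, tag);
}

static void reset_state(void)
{
    memset(host_keys, 0, sizeof(host_keys));
    memset(dev_keys, 0, sizeof(dev_keys));
    first_half_sent = 0;
    event_log[0] = '\0';
}

/* Store the key; once the first chunk is complete, queue its transfer
 * right from this call, as suggested in the quoted message. */
static void set_key(const char *key, int index)
{
    assert(index >= 0 && index < MAX_KEYS_PER_CRYPT);
    strncpy(host_keys[index], key, KEY_LEN - 1);
    host_keys[index][KEY_LEN - 1] = '\0';
    if (index == HALF - 1) {
        start_transfer(0, HALF, "T1 ");
        first_half_sent = 1;
    }
}

/* Queue chunk 2, hash chunk 1, then hash chunk 2. When fewer than HALF
 * keys were set, chunk 1 was never queued, so it is copied here instead
 * (the special case with no real asynchronous transfer). */
static void crypt_all(int count)
{
    int first = count < HALF ? count : HALF;

    if (!first_half_sent)
        start_transfer(0, first, "T1 ");
    if (count > HALF)
        start_transfer(HALF, count - HALF, "T2 ");
    hash_range(0, first, "H1 ");
    if (count > HALF)
        hash_range(HALF, count - HALF, "H2 ");
    first_half_sent = 0;                  /* ready for the next batch */
}
```

With a full batch the logged order is T1 (issued from set_key()), then T2, H1, H2 from crypt_all(); with a single key it is just T1, H1. The sketch does not model real overlap or synchronization. In CUDA, operations issued to a single stream run in issue order, so getting the second copy to overlap with the first hash would generally need a second stream plus an explicit wait before the second hash.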