Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20150917160930.GA16815@openwall.com>
Date: Thu, 17 Sep 2015 19:09:30 +0300
From: Solar Designer <solar@...nwall.com>
To: john-dev@...ts.openwall.com
Subject: bitmap & hash table prefetching (was: Judy array)

On Wed, Sep 16, 2015 at 04:28:42AM +0300, Solar Designer wrote:
> Implemented better prefetching, patch attached.

FWIW, the patch that I posted, which works best for me and which is
now committed, prefetches only bitmap elements and then hash table
elements for which the bitmap indicates a hit.  It does not prefetch
"struct db_password *pw" (from which we're using the "binary" and
"next_hash" fields), nor pw->binary (I mean the actual binary here,
whereas the "binary" field is just a pointer).  I've tried adding those
two additional prefetch levels yesterday (for a total of four), but got
a performance regression.  I did not try re-tuning CRK_PREFETCH, though
(it was still at 64).

Naturally, this reminds me that we could want to keep (a portion of?)
binary inline in "struct db_password" rather than (only) referenced from
there, or/and we could even use an inter-mixed bitmap/hash/binary data
structure - e.g., 128 bits of bitmap followed by a 64-bit hash table
entry (a pointer) and first 64 bits of binary, for a total of 256 bits
(still a power of 2, so easy to index).  Then the bitmap entry fetch
would likely also fetch the hash table entry and the partial binary
(since all of the 256 bits likely fall into the same cache line).
On 32-bit, we could even squeeze the next_hash field in there (two
32-bit pointers instead of one 64-bit).  Or something like this.
A drawback is that bitmap and hash table sizes become more closely
related, not letting us easily tune them separately.  The exact mix I
suggested above might not be optimal from that point of view.
Regardless, I have no time to try this approach out now.

The extra two levels of prefetching from my experiment described above
looked like this:

		if (!lucky) {
			index = target;
			continue;
		}
		for (slot = 0, ahead = index; ahead < lucky; slot++, ahead++)
		if (a[slot].u.p) {
			struct db_password *pw = *a[slot].u.p;
/*
 * Chances are this will also prefetch the "binary" field, which is located
 * right after next_hash.
 */
#ifdef __SSE2__
			_mm_prefetch((const char *)&pw->next_hash, _MM_HINT_NTA);
#else
			*(void * volatile *)&pw->next_hash;
#endif
		}
		for (slot = 0, ahead = index; ahead < lucky; slot++, ahead++)
		if (a[slot].u.p) {
			struct db_password *pw = *a[slot].u.p;
#ifdef __SSE2__
			_mm_prefetch((const char *)pw->binary, _MM_HINT_NTA);
#else
			*(void * volatile *)pw->binary;
#endif
		}

Alexander

Powered by blists - more mailing lists

Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.