Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <A39FD0AF-E6C1-4FCA-B800-EAFB26E21E27@gmail.com>
Date: Mon, 14 Sep 2015 05:58:42 -0700
From: Fred Wang <waffle.contest@...il.com>
To: john-dev@...ts.openwall.com
Subject: Re: Judy array


On Sep 14, 2015, at 5:46 AM, Solar Designer <solar@...nwall.com> wrote:

> Fred,
> 
> On Sun, Sep 13, 2015 at 08:28:08PM -0700, Fred Wang wrote:
>> For the most part (and in particular, when Bloom filters are very sparse), they are always a win for "our" type of lookup.  In cracking, the vast majority of hashes will fail, and that is what I optimized for.
> 
> Do you mean a Bloom filter is always a win over a simple bitmap for the
> same limited RAM usage?  If so, that's no surprise.  Or do you mean it's
> always a win regardless of RAM usage, when optimizing for speed only?
> If so, that's unexpected to me except when a significant portion fits in
> a cache or when you optimize the order in which you perform the lookups
> to achieve a higher cache hit rate (or when you manage to use multiple
> bits per lookup at application level, but that's probably not what you
> meant above yet).

My focus has always been large number of hashes, and thus my statements. By large, I mean > 1,000,000 unsolved hashes (I typically run with about 80,000,000 unsolved). 

As soon as I can, I issue _mm_prefetch(bloom_target,_MM_HINT_NTA), since it is likely that the bloom filter address will not be re-used in the near term.  By the time the bloom filter is called, the address is in cache, so I don't stall waiting for it. By "as soon as I can", I mean as I finish the first 32 bits of the MD5, for example.

If you are considering only a (very) small number of hashes, then yes, bitmaps could fit in cache and thus be faster.  Because I have sufficient work to do before I check the bloom filter results, however, there appears to be little penalty.

Either way, it is easy enough to test out for various sizes of unsolved hashes.

Powered by blists - more mailing lists

Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.