|
Message-Id: <A39FD0AF-E6C1-4FCA-B800-EAFB26E21E27@gmail.com> Date: Mon, 14 Sep 2015 05:58:42 -0700 From: Fred Wang <waffle.contest@...il.com> To: john-dev@...ts.openwall.com Subject: Re: Judy array On Sep 14, 2015, at 5:46 AM, Solar Designer <solar@...nwall.com> wrote: > Fred, > > On Sun, Sep 13, 2015 at 08:28:08PM -0700, Fred Wang wrote: >> For the most part (and in particular, when Bloom filters are very sparse), they are always a win for "our" type of lookup. In cracking, the vast majority of hashes will fail, and that is what I optimized for. > > Do you mean a Bloom filter is always a win over a simple bitmap for the > same limited RAM usage? If so, that's no surprise. Or do you mean it's > always a win regardless of RAM usage, when optimizing for speed only? > If so, that's unexpected to me except when a significant portion fits in > a cache or when you optimize the order in which you perform the lookups > to achieve a higher cache hit rate (or when you manage to use multiple > bits per lookup at application level, but that's probably not what you > meant above yet). My focus has always been large number of hashes, and thus my statements. By large, I mean > 1,000,000 unsolved hashes (I typically run with about 80,000,000 unsolved). As soon as I can, I issue _mm_prefetch(bloom_target,_MM_HINT_NTA), since it is likely that the bloom filter address will not be re-used in the near term. By the time the bloom filter is called, the address is in cache, so I don't stall waiting for it. By "as soon as I can", I mean as I finish the first 32 bits of the MD5, for example. If you are considering only a (very) small number of hashes, then yes, bitmaps could fit in cache and thus be faster. Because I have sufficient work to do before I check the bloom filter results, however, there appears to be little penalty. Either way, it is easy enough to test out for various sizes of unsolved hashes.
Powered by blists - more mailing lists
Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.