Message-ID: <64e5cef4d3e50a73335ecac38b65c317@smtp.hushmail.com>
Date: Mon, 21 Apr 2014 23:56:03 +0200
From: magnum <john.magnum@...hmail.com>
To: john-dev@...ts.openwall.com
Subject: Re: mmap(2) in wordlist.c

On 2014-04-21 21:22, Solar Designer wrote:
> On Mon, Apr 21, 2014 at 02:34:13AM +0200, magnum wrote:
>> Is there any reason for not using Posix memory mapping for wordlist
>> mode's memory buffer in Jumbo? It should make it simpler and a lot more
>> effective for concurrent sessions including but not limited to forked
>> ones. Or is there a problem I fail to see?
>
> I think this is worth a try, while also keeping the present approach
> (for Windows, and for systems where it's faster).

That clashes with a hidden agenda though: A total rewrite of wordlist.c
starting with the core 1.8 version, and re-implementing Jumbo features as
cleanly and KISS as possible. Unless real-world performance in some
situations ends up a lot worse than today's, I'd like to drop the old
buffering altogether, including the --mem-file-size option which ideally
should not be needed.

But it might be very hard to beat the current code. We have the once
MPI-specific stuff that buffers "my" words only, now also applying to
node/fork (btw this mitigates the 32x example I gave in the other post -
but only for forked sessions). We also have non-consecutive dupes
rejection applied before buffering, so once everything is loaded we don't
need to leapfrog or check for dupes - we just read the next word from an
array. Very effective, but also very ugly code because features were
bolted on over time.

Atom's fgets-sse would obviously not speed up the current code a bit
(when the buffer is used) except at initial load. OTOH if I go with my
latest thoughts about using a function similar to fgetl() except it reads
the mapped buffer, it could/should use SSE. But it can probably never be
as fast as the current buffer code except when we really benefit from not
filling memory with duplicate copies of huge wordlists.
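
To make the idea of "a function similar to fgetl() except it reads the
mapped buffer" concrete, here is a minimal sketch, assuming a POSIX
system. All names in it (struct mapped_wordlist, wl_map(), wl_getl(),
wl_unmap()) are hypothetical and not taken from wordlist.c. It uses plain
memchr() for the line scan rather than SSE, and its handling of overlong
lines (truncate, then skip the rest) is an assumption, not necessarily
what fgetl() does.

```c
#include <fcntl.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical names throughout; this is not JtR's actual code. */
struct mapped_wordlist {
    const char *base; /* start of the mapping */
    const char *pos;  /* next unread byte */
    const char *end;  /* one past the last mapped byte */
};

/* Map a wordlist file read-only. Returns 0 on success, -1 on failure.
 * The pages are backed by the page cache, so several processes mapping
 * the same file do not each need a private copy of it. */
int wl_map(struct mapped_wordlist *wl, const char *path)
{
    struct stat st;
    void *p;
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return -1;
    if (fstat(fd, &st) < 0 || st.st_size <= 0) {
        close(fd); /* mmap() rejects a zero length, so bail out early */
        return -1;
    }
    p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd); /* the mapping remains valid after close() */
    if (p == MAP_FAILED)
        return -1;
    wl->base = wl->pos = p;
    wl->end = wl->base + st.st_size;
    return 0;
}

/* fgetl()-like reader over the mapping: copies the next line into out
 * without its LF or CRLF terminator, NUL-terminated. Lines longer than
 * size - 1 bytes are truncated and the remainder is skipped.
 * Returns out, or NULL once the mapping is exhausted. */
char *wl_getl(struct mapped_wordlist *wl, char *out, size_t size)
{
    const char *nl;
    size_t len, copy;

    if (size == 0 || wl->pos >= wl->end)
        return NULL;
    nl = memchr(wl->pos, '\n', (size_t)(wl->end - wl->pos));
    len = (size_t)((nl ? nl : wl->end) - wl->pos);
    copy = len < size - 1 ? len : size - 1;
    memcpy(out, wl->pos, copy);
    if (copy > 0 && copy == len && out[copy - 1] == '\r')
        copy--; /* drop the CR of a CRLF line ending */
    out[copy] = '\0';
    wl->pos = nl ? nl + 1 : wl->end;
    return out;
}

/* Release the mapping and reset the cursor. */
void wl_unmap(struct mapped_wordlist *wl)
{
    munmap((void *)wl->base, (size_t)(wl->end - wl->base));
    wl->base = wl->pos = wl->end = NULL;
}
```

A caller would run wl_map() once, loop on wl_getl() until it returns
NULL, then call wl_unmap(). Note that this sketch does none of the
per-node word selection or dupe rejection that the current buffer code
performs at load time.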

> I briefly experimented with read() vs. mmap() while developing popa3d,
> in 1998 or 1999. At the time, read() turned out to be significantly
> faster, on Linux 2.0.x and ext2fs, when the read buffer size is properly
> tuned. I didn't try this experiment in JtR.

What order of size was that at the time?

magnum