john-dev - split() and str_alloc

Message-ID: <20150504120716.GA11692@openwall.com>
Date: Mon, 4 May 2015 15:07:16 +0300
From: Aleksey Cherepanov <lyosha@...nwall.com>
To: john-dev@...ts.openwall.com
Subject: split() and str_alloc_copy()

I've read split() in dummy.c:

static char *split(char *ciphertext, int index, struct fmt_main *self)
{
	// canonical fix for any hash with embedded null.
	char *cp;
	if (strncmp(ciphertext, FORMAT_TAG, FORMAT_TAG_LEN))
		return ciphertext;
	cp = &ciphertext[FORMAT_TAG_LEN];
	while (cp[0] && cp[1]) {
		if (cp[0] == '0' && cp[1] == '0') {
			char *cp2 = str_alloc_copy(ciphertext);
			cp2[cp-ciphertext] = 0;
			return cp2;
		}
		cp += 2;
	}
	return ciphertext;
}

str_alloc_copy() calls mem_alloc_tiny(), which tracks its allocations. I
assume that this memory is freed only at the very end. I saw that other
formats return the result of split() in a static buffer, so I assume
that john does not release the buffer after split(). If so, split() in
dummy.c holds the memory from str_alloc_copy() until the very end, but
only for hashes that contain 00.

Let's experiment:

$ perl -le 'print q{$dummy$} . join "", map { sprintf "%02x", rand(255) + 1} 1 .. 95 for 1 .. 100000' > t.pw
$ perl -le 'print q{$dummy$} . join "", map { $_ == 10 ? "00" : sprintf("%02x", rand(255) + 1)} 1 .. 95 for 1 .. 100000' > t2.pw

So t.pw has 100k hashes without 00, and t2.pw has 100k hashes with 00
at the 10th byte position.

$ ./JohnTheRipper/run/john t.pw
% ps -A -o pid,size,cmd | grep 'john t2\?\.pw'
16376 254396 ./JohnTheRipper/run/john t.pw

$ ./JohnTheRipper/run/john t2.pw
% ps -A -o pid,size,cmd | grep 'john t2\?\.pw'
16531 257476 ./JohnTheRipper/run/john t2.pw

254396 vs 257476 is a difference of only ~3 MB. That's too small for my
assumptions: I expected the difference to be about 2x. What am I
missing?

Thanks!

-- 
Regards,
Aleksey Cherepanov