|
Message-ID: <55189563.4000402@openwall.com> Date: Mon, 30 Mar 2015 03:14:27 +0300 From: Alexander Cherepanov <ch3root@...nwall.com> To: john-dev@...ts.openwall.com Subject: Re: Lengths of fields recorded in hashes On 2015-03-23 05:40, Solar Designer wrote: >> Does anybody consider them useful? They could potentially be useful when >> one needs to allocate memory for the data but even in this case they >> save only one strlen (or equivalent). >> OTOH each superfluous field now requires several lines of code in >> valid() and get_salt(). And parsing numbers is not that easy, e.g. atoi >> exhibits undefined behavior on overflows. > > Right. BTW, we should get rid of all atoi()'s, like in favor of a > custom wrapper around strtol() that we'd introduce into our misc.[ch]. > > Maybe we should adopt OpenBSD's strtonum(): > > http://www.tedunangst.com/flak/post/the-design-of-strtonum > > although its use of "long long" is unnecessarily not nice for 32-bit > builds. If we deviate from strtonum() in any way, we'll need to use a > different function name. Such a function doesn't seem to be a good fit for one-pass parsing at all. It doesn't return a position where it has stopped parsing and it "doesn’t handle [...] trailing non-digit characters". This leads to strdup+strtok-like approach which I thought we want to get rid of. > This is one of many tasks for the "robustness officer" we're seeking. > >> Extra fields also take extra >> space etc. Recently the point was raised that lengths are an obstacle to >> fuzzing. And it would be nice if most of the hashes are constructed from >> a small number of simple blocks (a field containing a length of another >> field is not one of them). > > While maybe being a problem for fuzzing, this also suggests that the > fields were maybe not totally useless: they serve as poor man's > checksums, even if only for metadata. For very long encoded strings, > this makes some sense to me. Actually, I thought about it but decided that it's minor enough to not be worth mentioning. >> I propose the following plan: >> 1) gradually rewrite valid()s and get_salt()s to ignore lengths recorded >> in hashes; > > No, valid() must continue to check whatever practical. > > I am OK with rewriting get_salt(). > > BTW, officially - such as in formats.h - this function is called > salt(). And ditto for binary(). The get_ prefix, as used by many > formats, isn't exactly right: these function aren't "getters" (from the > format's state), they are "extractors" (from the function argument). > Maybe we should rename them dropping the get_ prefix, or changing it to > extract_ (but that's longer, and ext_ would conflict with the naming > we're already using for functions implementing external mode support). > For now, let's not bother - not the worst of our worries, by far. Ok, I've added it into todo list: - Unify function names (get_salt -> salt etc.). This makes grepping/refactoring easier. >> 2) change the format of hashes in john and in 2john tools at some later >> point. > > Hmm, to be compatible with your changes proposed as step 1 above, would > the versions of *2john tools from step 2 produce empty fields in place > of the lengths? That is, two field separators in a row? That's still > not pretty. > > Anyway, we probably should do this, for the reasons magnum explained. You mean to accept empty length but check that it's right if it's not empty? >> Can we change format of such hashes at will (by requiring to always use >> john and 2john tools of matching versions)? > > Not necessarily. We could if we really needed to, though. > >> Attached is a patch to 7z format which implements the first part of the >> plan as an illustration of the approach. (It also fixes a potential >> overflow in data field and removes strange dealing with zeroes in iv.) > > I didn't review it closely, but in general I think that your changes to > get_salt() are acceptable since get_salt() can assume(*) that its input > is valid, but your removal of checks from valid() is not acceptable. Ok > (*) We should discuss this separately. In fact, I recall this was > already brought up in another thread. I think it's another task for > "robustness officer" to review loader.c very carefully, and to fuzz an > instrumented version of JtR with assertions added, and make sure that > only output from valid(), possibly passing through split(), can pass > into salt() and binary(). If this is not currently the case, then > whatever exceptions to this might exist should be patched in loader.c. > > Please write down the tasks I am mentioning here into some sort of to-do > list, such as a page you'd create on our wiki for the robustness project. I've posted the todo list but I'm yet to create a wiki page for it. -- Alexander Cherepanov
Powered by blists - more mailing lists
Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.