john-dev - Re: displaying full meta information about hashes with --show=types

Message-ID: <20150513173758.GA7502@openwall.com>
Date: Wed, 13 May 2015 20:37:58 +0300
From: Aleksey Cherepanov <lyosha@...nwall.com>
To: john-dev@...ts.openwall.com
Subject: Re: displaying full meta information about hashes with --show=types

On Sun, May 10, 2015 at 09:43:45PM +0300, Aleksey Cherepanov wrote:
> I implemented --show=types option that prints all meta information
> about hashes from file. It tries all formats against all hashes and
> prints result in machine parseable format. It applies even formats
> that are disabled. It tries generic crypt always. It respects
> --format= option. It does not bypass john's heuristics for generic
> crypt.

I am almost done with it for now.

From comments:

The format:
Once for each hash even if it can't be loaded (7 fields):
  login,
  ciphertext,
  uid,
  gid,
  gecos,
  home,
  shell.
For each valid format (may be nothing):
  label,
  is format disabled? (1/0),
  is format dynamic? (1/0),
  does format use the ciphertext field as is? (1/0),
  canonical hash or hashes (if there are several parts).

All fields above are separated by field_sep_char.
Formats are separated by empty field.

Additionally at the end:
  separator,
  line consistency mark (0/1/2/3):
    0 - the line is consistent and can be parsed easily,
    1 - field separator char occurs in fields, line can't
        be parsed easily,
    2 - the line was skipped as bad NIS stuff, only login
        and ciphertext are shown,
    3 - the line was skipped parsing descrypt with
        invalid salt, only ciphertext is shown (together
        with empty login); empty lines fall here,
  separator.

The additional field_sep_char at the end of the line does not
break the numbering of fields but allows a parser to get
field_sep_char from the line.

A parser has to check the last 3 chars.

If the format does not use the ciphertext field as is,
then a parser has to match the input line with the output line
by line number.

END
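
To illustrate the description above, here is a minimal parser sketch in
Python. It is not part of the patch. It assumes that the line ends with
separator, mark, separator, that the 7 login fields come first, and that
the remaining fields form per-format groups delimited by empty fields,
with no empty canonical hash inside a group. The names
parse_show_types_line, FormatInfo and ParsedLine are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

LOGIN_FIELDS = ("login", "ciphertext", "uid", "gid", "gecos", "home", "shell")


@dataclass
class FormatInfo:
    label: str
    disabled: bool
    dynamic: bool
    uses_ciphertext_as_is: bool
    canonical: List[str]  # one or more canonical hashes


@dataclass
class ParsedLine:
    mark: int                         # line consistency mark, 0..3
    sep: str                          # field_sep_char taken from the line
    fields: Optional[Dict[str, str]]  # None unless mark == 0
    formats: List[FormatInfo]
    raw: str                          # line body without the trailing 3 chars


def parse_show_types_line(line: str) -> ParsedLine:
    line = line.rstrip("\n")
    # Check the last 3 chars: separator, mark, separator.
    if len(line) < 3 or line[-1] != line[-3] or line[-2] not in "0123":
        raise ValueError("line does not end with separator, mark, separator")
    sep, mark = line[-1], int(line[-2])
    body = line[:-3]
    if mark != 0:
        # Marks 1..3: the line can't be split reliably, keep it raw.
        return ParsedLine(mark, sep, None, [], body)

    parts = body.split(sep)
    if len(parts) < len(LOGIN_FIELDS):
        raise ValueError("fewer than 7 leading fields")
    fields = dict(zip(LOGIN_FIELDS, parts[:7]))

    # Assumption: after the 7 login fields, an empty field closes a group.
    formats: List[FormatInfo] = []
    group: List[str] = []
    for part in parts[7:] + [""]:
        if part:
            group.append(part)
            continue
        if group:
            if len(group) < 5:
                raise ValueError("incomplete format group: %r" % group)
            label, disabled, dynamic, as_is, *canonical = group
            formats.append(FormatInfo(label, disabled == "1",
                                      dynamic == "1", as_is == "1", canonical))
            group = []
    return ParsedLine(mark, sep, fields, formats, body)
```

A caller would read john's output line by line and look at mark first:
only lines with mark 0 get fields and formats filled in, the others keep
just the raw text.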

--show=types respects --bare-always-valid=[YN] (and the related setting in
john.conf), which seems convenient. Though, as I wrote before,
--bare-always-valid=Y and the setting work only if the hash on the first
line is bare.


John does not have such a thing as the original ciphertext. The ciphertext
variable contains the original ciphertext only part of the time and not for
all file formats: file formats like pwdump are supported through
formats' prepare() methods (like nt's).

NT's prepare() method returns a hash constructed from fields[3] if
fields[1] is not a hash and fields[2] is an LM hash (32 hex digits). It
does not return fields[3] as is: it adds the $NT$ tag!
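
As a toy model of that behaviour (Python, not john's C code), the sketch
below mimics the rule for a pwdump-style line split on ':'. The function
name nt_prepare_model is hypothetical. Two details are assumptions of
mine: that "is a hash" for fields[1] means 32 hex digits with or without
the $NT$ tag, and that fields[3] must also be 32 hex digits.

```python
import re

HEX32 = re.compile(r"[0-9A-Fa-f]{32}")
NT_TAG = "$NT$"


def _is_hex32(s):
    return HEX32.fullmatch(s) is not None


def nt_prepare_model(fields):
    """Model of the described rule; returns the string that would be tried."""
    f1 = fields[1] if len(fields) > 1 else ""
    # Assumption: what counts as "fields[1] is a hash" for this model.
    f1_is_hash = _is_hex32(f1) or (
        f1.startswith(NT_TAG) and _is_hex32(f1[len(NT_TAG):]))
    if (not f1_is_hash and len(fields) > 3
            and _is_hex32(fields[2]) and _is_hex32(fields[3])):
        # pwdump layout: fields[2] is LM, fields[3] is NT; the tag is
        # added, so the result differs from every input field.
        return NT_TAG + fields[3]
    return f1
```

With a pwdump line, fields[1] is the uid, so the model returns the tagged
fields[3], which is not equal to any single field of the input line.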

So there are two observations:

1) if the prepared hash is not equal to the contents of the ciphertext
variable, then the printed information is about the input line, not about
a particular field. A parser may find the input line by number. I hope I
print 1 line for each input line (I am not sure though).

The field telling whether *ciphertext eq prepared is there to simplify a
possible parser, or at least its initial implementation:
  a) initially the complex case may produce a load error,
  b) there are several formats that work (login:hash:... and just a hash
  on the line, and maybe others).
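
A minimal sketch of matching by line number, in Python. It relies on the
assumption stated above (exactly 1 output line per input line) and
refuses to pair anything if the counts differ. The function name
pair_by_line_number is hypothetical.

```python
def pair_by_line_number(input_lines, output_lines):
    """Return (line number, input line, output line) triples, 1-based."""
    if len(input_lines) != len(output_lines):
        # The one-output-line-per-input-line assumption does not hold.
        raise ValueError("got %d input lines but %d output lines"
                         % (len(input_lines), len(output_lines)))
    return [(number, inp, out)
            for number, (inp, out)
            in enumerate(zip(input_lines, output_lines), start=1)]
```

A parser would fall back to these pairs only for formats that do not use
the ciphertext field as is.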

2) --show=types tries a _line_ against formats, not the ciphertext
variable. So if a line is in pwdump format, fields[2] is tried as LM,
fields[3] is tried as NT and nothing more, because all(?) other formats
use fields[1], which is the uid in pwdump format. A similar problem may
exist for l0phtcrack-style files and other formats that keep the
ciphertext not (only) in fields[1].

It seems to be ok: pwdump is special. Other formats are special too.

Maybe --show=types should have a flag for whether *ciphertext eq fields[1].
Or it could just use fields[1] and not the ciphertext variable, to make it
less confusing with pwdump format.

Or maybe that is over-engineering.


I think there are minor problems that do not affect the main usage (like
field sep char == '\n'). So I'll leave them for future improvements.


Showing the salt and costs for each hash may be interesting for johnny
and other tools, to let the user split hashes mindfully. It seems more
appropriate at a point in the workflow when we know the format of the
hashes. I think another option may be introduced. Anyway, it would be hard
to implement that in the place of --show=types. Maybe it should be a part
of the --no-tty interface: john loads hashes and outputs what was loaded,
what salt ids were used and what the costs are, then john proceeds to
cracking. So --show=types does not show "full meta information about
hashes".

I am going to prepare a pull request. A patch is attached.

Thanks!

-- 
Regards,
Aleksey Cherepanov

View attachment "full.patch" of type "text/x-diff" (12052 bytes)