Message-ID: <20121210015748.GA5100@openwall.com>
Date: Mon, 10 Dec 2012 05:57:48 +0400
From: Solar Designer <solar@...nwall.com>
To: john-dev@...ts.openwall.com
Subject: Re: fixing the valid() methods

On Mon, Dec 10, 2012 at 02:37:41AM +0100, magnum wrote:
> 1. It's definitely suboptimal to do a strdup() before even checking the format tag. Is it not true that when loading e.g. KoreLogic's millions-of-hashes files (and not specifying --format), each and every input line will be sent to the valid() of every one of the 200+ formats? I think that is what happens. Keep this in mind when implementing valid()...
> 
> 2. I dislike strdup/free if you can avoid allocating at all... but I'm not sure it's that much of a problem provided you fix #1. I tend to use strchr/strrchr on the original string instead.

I agree with these comments.

Additionally, during the Openwall meetup in November, Alexander and
Aleksey suggested that if folks (especially Dhiru) tend not to write
proper valid() or to make errors there, maybe this suggests that we
should provide a simpler way.  Specifically, it was mentioned that (as I
understood the suggestion) formats.c could provide a generic function
that would be callable from valid() and that would accept a regexp.
Its use would be something like:

#define CIPHERTEXT_REGEXP "^somethinghere$"

static int valid(char *ciphertext, struct fmt_main *self)
{
	return fmt_generic_valid(ciphertext, CIPHERTEXT_REGEXP);
}

...or maybe we could have a "char *ciphertext_regexp" field, which the
loader would check against before calling valid().  (Not everything can
be represented with a regexp, so support for valid() would need to
remain.)
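
As a rough illustration only, the helper described above might look
something like the sketch below.  fmt_generic_valid() does not exist in
John the Ripper; its signature here, the EXAMPLE_REGEXP pattern, and the
"$example$" tag are all assumptions made up for this example.  It uses
only the POSIX regcomp(3), regexec(3), and regfree(3) calls.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical pattern: a made-up "$example$" tag followed by exactly
 * 32 lowercase hex digits (POSIX extended regular expression syntax). */
#define EXAMPLE_REGEXP "^\\$example\\$[0-9a-f]{32}$"

/* Hypothetical helper, not an existing formats.c function.
 * Returns 1 if ciphertext matches regexp, 0 otherwise. */
static int fmt_generic_valid(const char *ciphertext, const char *regexp)
{
	regex_t re;
	int matched;

	/* REG_NOSUB: only match/no-match is needed, no subexpressions */
	if (regcomp(&re, regexp, REG_EXTENDED | REG_NOSUB))
		return 0; /* treat an uncompilable pattern as "not valid" */

	matched = (regexec(&re, ciphertext, 0, NULL, 0) == 0);
	regfree(&re);

	return matched;
}
```

Note that this sketch recompiles the pattern on every call, which would
likely be too slow when millions of input lines are tried against many
formats; a real implementation would probably want to compile each
pattern once and reuse it.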

Can we afford a dependency on regcomp(3), regexec(3), regfree(3) in
jumbo?  Apparently, these are in POSIX.1-2001.  Yet I am unlikely to
introduce this change to core, so it'd be yet another jumbo thing.

Personally, I don't feel much need to go for regexps - I find it easy
enough to write robust valid() based on str(r)chr() and such - but I
understand that others' preferences may be different.
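
For comparison, here is a minimal sketch of the str(r)chr()-style
approach, which also follows the advice quoted above: it checks the
format tag first and never allocates.  The "$example$salt$hash" layout,
the length limits, and all names are hypothetical, and the
struct fmt_main *self parameter of a real valid() is omitted to keep the
example self-contained.

```c
#include <string.h>

/* All of these are made up for illustration */
#define FORMAT_TAG      "$example$"
#define TAG_LENGTH      (sizeof(FORMAT_TAG) - 1)
#define MAX_SALT_LENGTH 16
#define HASH_HEX_LENGTH 32

/* Returns 1 if the first len characters are lowercase hex digits */
static int is_lower_hex(const char *p, size_t len)
{
	size_t i;

	for (i = 0; i < len; i++)
		if (!((p[i] >= '0' && p[i] <= '9') ||
		    (p[i] >= 'a' && p[i] <= 'f')))
			return 0;
	return 1;
}

/* Accepts "$example$<salt>$<32 lowercase hex digits>" */
static int example_valid(const char *ciphertext)
{
	const char *salt, *sep, *hash;
	size_t salt_len;

	/* Cheapest check first: most input lines fail here */
	if (strncmp(ciphertext, FORMAT_TAG, TAG_LENGTH))
		return 0;

	/* The salt runs up to the next '$' and must not be empty */
	salt = ciphertext + TAG_LENGTH;
	sep = strchr(salt, '$');
	if (!sep)
		return 0;
	salt_len = (size_t)(sep - salt);
	if (salt_len < 1 || salt_len > MAX_SALT_LENGTH)
		return 0;

	/* The rest must be exactly the expected number of hex digits */
	hash = sep + 1;
	if (strlen(hash) != HASH_HEX_LENGTH)
		return 0;
	return is_lower_hex(hash, HASH_HEX_LENGTH);
}
```

The input string is only read, never copied or modified, so there is
nothing to free on any return path.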

Alexander
