Message-ID: <CAHv4kXgzahsk50ufbwCAE5TXB_h5VbC8PaGfBzTbgzNC_EtC4A@mail.gmail.com>
Date: Mon, 24 Jun 2013 08:43:01 +0200
From: Jan Starke <jan.starke@...ofbed.org>
To: john-users@...ts.openwall.com
Subject: Re: Fuzzing with regular expressions

Hello guys,

rexgen now supports UTF-8 input, so for example

rexgen 'M(ü|ö|ue|oe)ller'

generates all 4 variants of this surname.

Additionally, cmake now creates a Visual Studio solution which compiles
rexgen.exe and librexgen-0.1.0.dll natively (assuming you have bison and
flex available). Unfortunately, I couldn't get find_library() to work on
Windows, so the Lua interface is currently not included on Windows.

In order to integrate rexgen into JtR, what are the necessary requirements
for the library? I'm currently thinking of
- state serialization (for "john --restore")
- ... ?

Can you give me a good starting point in john's code, to read how those
features of password generators are invoked by john?

Kind regards,
Jan

2013/5/22 magnum <john.magnum@...hmail.com>

> On 22 May, 2013, at 12:40 , Jan Starke <jan.starke@...ofbed.org> wrote:
> > 2013/5/22 magnum <john.magnum@...hmail.com>
> >> I do not quite understand the section about Unicode. And it does not seem
> >> to work (my terminal is UTF-8):
> >>
> >> $ rexgen "M[üö]ller"
> >> Mller
> >> Mller
> >> Mller
> >> $ rexgen -u8 n "M[üö]ller"
> >> Mller
> >> Mller
> >> Mller
> >>
> >> -DUTF_VARIANT=8 does not change the above, in case it was supposed to.
> >
> > rexgen currently cannot use Unicode strings as input, due to limitations of
> > the lexer (GNU flex). flex ignores any characters which are not known to
> > it. If you want to generate unicode characters, you must specify them with
> > the \uxxxx syntax, e.g.
> >
> > rexgen 'M(ue|oe|\u00fc|\u00f6)ller'
>
> This contradicts the Unicode section on http://code.google.com/p/rexgen/ so
> you might want to revise that. Or better, make the code work like the docs
> say :-)
>
> > The aim of the options u8, u16 and u32 is to enforce the output encoding.
> > To verify this, you could create a hexdump of the output:
> >
> > rexgen 'test' | od -x
>
> OK, I see it now. This also contradicts the web docs: the default is UTF-8
> and not UTF-32. And in this case the actual behavior is better - defaulting
> to UTF-32 would be very odd!
>
> magnum
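
To make the state-serialization point from the top of this message concrete: one possible approach is to derive every generated word from a single integer index, so that the index is the entire state that has to be saved for a later resume. The sketch below is plain Python and only illustrates that idea; it is not rexgen's or john's API. The names `variants`, `PARTS` and `saved_state` are hypothetical, and the alternation is expanded by hand instead of being parsed from the regular expression.

```python
from itertools import islice


def variants(parts, start=0):
    """Yield (index, word) for every combination of alternatives in `parts`.

    `parts` is a list of lists of strings. The index alone is the whole
    generator state, so storing it is enough to resume later.
    """
    total = 1
    for alternatives in parts:
        total *= len(alternatives)

    for index in range(start, total):
        # Decode the index as a mixed-radix number; the last part varies fastest.
        remainder, chosen = index, []
        for alternatives in reversed(parts):
            remainder, digit = divmod(remainder, len(alternatives))
            chosen.append(alternatives[digit])
        yield index, "".join(reversed(chosen))


# Hand-expanded form of 'M(ü|ö|ue|oe)ller' (assumption: no regex parsing here).
PARTS = [["M"], ["ü", "ö", "ue", "oe"], ["ller"]]

# Simulate an interrupted session that produced only two words.
first_run = list(islice(variants(PARTS), 2))
saved_state = first_run[-1][0] + 1  # next index to generate

# A later session continues from the saved index instead of starting over.
resumed = list(variants(PARTS, start=saved_state))
```

Because each word depends only on its index, nothing else needs to be written to disk. Whether this fits rexgen depends on how its iterators are built internally, which this sketch does not assume.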