Message-ID: <CAHv4kXg1VW2Ym+X0GEH4LezMMuREKAJe-DEsGq_xTEdjxYAvrQ@mail.gmail.com>
Date: Wed, 22 May 2013 21:45:56 +0200
From: Jan Starke <jan.starke@...ofbed.org>
To: john-users@...ts.openwall.com
Subject: Re: Fuzzing with regular expressions

>
> This contradicts the Unicode section on http://code.google.com/p/rexgen/ so you might want to revise that. Or better, make the code work like the
> docs say :-)
>

This is a really cool challenge, as flex only supports single-byte
character sets (if not only ASCII). There are some really weird approaches
around the web. Maybe I will take a look at it. Until then, I have changed
the spec to match the code ;-)


> OK, I see it now. This also contradicts the web docs: the default is UTF-8
> and not UTF-32. And in this case the actual behavior is better - defaulting
> to UTF-32 would be very odd!
>

I updated the docs; thank you for the advice. The original approach of
using UTF-32 internally by default was driven by performance concerns:
handling UTF-32 is simpler than handling UTF-8. The current approach,
however, is faster with UTF-8, so that seems to be the better default...
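
To illustrate why UTF-32 is simpler to handle than UTF-8, here is a
minimal Python sketch. It is not rexgen's code; the function names are
made up for this example. Finding the n-th character in UTF-32 is a
fixed-offset lookup, while UTF-8 requires walking the variable-length
sequences from the start.

```python
def utf8_sequence_length(lead_byte: int) -> int:
    """Length in bytes of a UTF-8 sequence, judged by its lead byte."""
    if lead_byte < 0x80:
        return 1          # 0xxxxxxx: ASCII
    if 0xC0 <= lead_byte < 0xE0:
        return 2          # 110xxxxx
    if 0xE0 <= lead_byte < 0xF0:
        return 3          # 1110xxxx
    if 0xF0 <= lead_byte < 0xF8:
        return 4          # 11110xxx
    raise ValueError("not a valid UTF-8 lead byte: 0x%02x" % lead_byte)


def nth_char_utf8(data: bytes, n: int) -> str:
    """Return the n-th character; must skip over n variable-length sequences."""
    pos = 0
    for _ in range(n):
        pos += utf8_sequence_length(data[pos])
    length = utf8_sequence_length(data[pos])
    return data[pos:pos + length].decode("utf-8")


def nth_char_utf32(data: bytes, n: int) -> str:
    """Return the n-th character; every character is exactly 4 bytes."""
    return data[4 * n:4 * n + 4].decode("utf-32-le")


if __name__ == "__main__":
    text = "a\u00e4\u20ac\U0001F600z"   # 1-, 2-, 3-, 4- and 1-byte sequences in UTF-8
    utf8 = text.encode("utf-8")
    utf32 = text.encode("utf-32-le")   # little-endian, no byte order mark
    for i in range(len(text)):
        print(i, nth_char_utf8(utf8, i) == nth_char_utf32(utf32, i))
```

The price of the simpler lookup is memory: UTF-32 always spends four
bytes per character, whereas UTF-8 needs only one for ASCII input.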

Regards, Jan
