Message-ID: <4E466D8E.9050101@bredband.net>
Date: Sat, 13 Aug 2011 14:26:54 +0200
From: magnum <rawsmooth@...dband.net>
To: john-dev@...ts.openwall.com
Subject: Re: Unicode, casing, obtaining data, and some real-world
MSSQL (2000) data.
On 2011-08-12 22:49, jfoug wrote:
>> Do you mean the reinstated "third case" in utf8towcs()?
>
> I believe so. There were a couple of if blocks which printf error codes, and 'tried' to correct their location within the data stream. I commented both of those out, at this time. I know it is not right, and we will have to work through what 'is' right, but it allowed the format to process every data point from U+0 to U+FFFF.
>
> It is likely that I was simply spitting out invalid nonsense data, and the code was correct in 'expecting' another UTF16 character, which was not present. However, I think this is simply garbage avoidance code. We simply have to get it to where it keeps the process image 'safe' and does not output unneeded warnings. Like I said, what I initially publish will likely need some tuning. However, I do not think this would cause anyone's cracking job to have any heartburn at all, right now.

OK, you mean in utf16toutf8_r():

    ...
            } else { /* it's an unpaired high surrogate */
                --source; /* return to the illegal value itself */
                fprintf(stderr, "warning, utf16toutf8 failed (illegal) - this is a bug in JtR\n");
                break;
            }
        } else { /* We don't have the 16 bits following the high surrogate. */
            --source; /* return to the high surrogate */
            fprintf(stderr, "warning, utf16toutf8 failed (no surrogate) - this is a bug in JtR\n");
            break;
        }
    ...

The original code puts in replacement characters and uses return codes
(conversion_fail etc) but since this function is only used for
converting back from UTF-16 that we "know" is correct, I simplified it
and put those error messages in there. If you hit those clauses we
should find out why.
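
For anyone following along without the source open, here is a minimal, self-contained sketch of the same idea: convert UTF-16 to UTF-8 and, instead of emitting replacement characters or return codes, just warn and stop when a surrogate is not properly paired. This is not the actual utf16toutf8_r() from JtR; the function name utf16_to_utf8_sketch, its signature, and the warning texts are made up for illustration, and it assumes NUL-terminated input in host byte order.

```c
#include <stdio.h>
#include <stdint.h>
#include <stddef.h>

/* Hypothetical helper (NOT JtR's utf16toutf8_r): converts a NUL-terminated
 * UTF-16 string to NUL-terminated UTF-8. Returns the number of bytes written,
 * not counting the terminator. Stops with a warning at a bad surrogate. */
static size_t utf16_to_utf8_sketch(uint8_t *dst, size_t dst_size,
                                   const uint16_t *src)
{
    uint8_t *out = dst;
    uint8_t *end;

    if (dst_size == 0)
        return 0;
    end = dst + dst_size - 1; /* keep one byte for the terminator */

    while (*src) {
        uint32_t ch = *src++;
        size_t need;

        if (ch >= 0xD800 && ch <= 0xDBFF) { /* high surrogate */
            uint32_t ch2 = *src; /* 0 if the input ends right here */
            if (ch2 >= 0xDC00 && ch2 <= 0xDFFF) {
                /* valid pair: combine into one code point */
                ch = 0x10000 + ((ch - 0xD800) << 10) + (ch2 - 0xDC00);
                ++src;
            } else if (ch2 != 0) { /* unpaired high surrogate */
                fprintf(stderr, "warning: conversion failed (illegal)\n");
                break;
            } else { /* nothing follows the high surrogate */
                fprintf(stderr, "warning: conversion failed (no surrogate)\n");
                break;
            }
        } else if (ch >= 0xDC00 && ch <= 0xDFFF) { /* stray low surrogate */
            fprintf(stderr, "warning: conversion failed (illegal)\n");
            break;
        }

        need = ch < 0x80 ? 1 : ch < 0x800 ? 2 : ch < 0x10000 ? 3 : 4;
        if ((size_t)(end - out) < need)
            break; /* output buffer full */

        if (need == 1) {
            *out++ = (uint8_t)ch;
        } else if (need == 2) {
            *out++ = (uint8_t)(0xC0 | (ch >> 6));
            *out++ = (uint8_t)(0x80 | (ch & 0x3F));
        } else if (need == 3) {
            *out++ = (uint8_t)(0xE0 | (ch >> 12));
            *out++ = (uint8_t)(0x80 | ((ch >> 6) & 0x3F));
            *out++ = (uint8_t)(0x80 | (ch & 0x3F));
        } else {
            *out++ = (uint8_t)(0xF0 | (ch >> 18));
            *out++ = (uint8_t)(0x80 | ((ch >> 12) & 0x3F));
            *out++ = (uint8_t)(0x80 | ((ch >> 6) & 0x3F));
            *out++ = (uint8_t)(0x80 | (ch & 0x3F));
        }
    }
    *out = 0;
    return (size_t)(out - dst);
}
```

With well-formed input the two warning branches are never reached, which is the point: if they ever fire on data we produced ourselves, something upstream generated bad UTF-16.
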
magnum