Message-ID: <4E466D8E.9050101@bredband.net>
Date: Sat, 13 Aug 2011 14:26:54 +0200
From: magnum <rawsmooth@...dband.net>
To: john-dev@...ts.openwall.com
Subject: Re: Unicode, casing, obtaining data, and some real-world MSSQL (2000) data.

On 2011-08-12 22:49, jfoug wrote:
>> Do you mean the reinstated "third case" in utf8towcs()?
>
> I believe so. There were a couple of if blocks which printf error
> codes, and 'tried' to correct their location within the data stream.
> I commented both of those out, at this time. I know it is not right,
> and we will have to work through what 'is' right, but it allowed the
> format to process every data point from U+0 to U+FFFF.
>
> It is likely that I was simply spitting out invalid nonsense data,
> and the code was correct, in 'expecting' another UTF16 character,
> which was not present. However, I think this is simply garbage
> avoidance code. We simply have to get it where it keeps the process
> image 'safe', and does not output unneeded warnings. Like I said,
> what I initially publish will likely need some tuning. However, I do
> not think this would cause anyone's cracking job to have any
> heartburn at all, right now.

OK, you mean in utf16toutf8_r()

    ...
            } else { /* it's an unpaired high surrogate */
                --source; /* return to the illegal value itself */
                fprintf(stderr, "warning, utf16toutf8 failed (illegal) - this is a bug in JtR\n");
                break;
            }
        } else { /* We don't have the 16 bits following the high surrogate. */
            --source; /* return to the high surrogate */
            fprintf(stderr, "warning, utf16toutf8 failed (no surrogate) - this is a bug in JtR\n");
            break;
        }
    ...

The original code puts in replacement characters and uses return codes
(conversion_fail etc.), but since this function is only used for
converting back from UTF-16 that we "know" is correct, I simplified it
and put those error messages in there. If you hit those clauses, we
should find out why.

magnum
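
For illustration only, here is a minimal sketch of the other approach mentioned above: substituting a replacement character and reporting the problem through a return code rather than printing a warning. It is not the JtR code. The function name utf16_to_utf8_sketch, the conv_status enum and its values are all hypothetical, and the sketch assumes a NUL-terminated UTF-16 input in host byte order.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical status codes; these are not the names used in JtR. */
enum conv_status { CONV_OK = 0, CONV_ILLEGAL = 1, CONV_TRUNCATED = 2 };

/*
 * Convert a NUL-terminated UTF-16 string (host byte order) to UTF-8.
 * At most dst_size bytes are written, including the terminating NUL.
 * Unpaired surrogates become U+FFFD and are reported via the return
 * value; nothing is printed to stderr.
 */
static enum conv_status utf16_to_utf8_sketch(const uint16_t *src,
                                             unsigned char *dst,
                                             size_t dst_size)
{
    enum conv_status status = CONV_OK;
    size_t out = 0;

    if (dst_size == 0)
        return CONV_TRUNCATED;

    while (*src) {
        uint32_t cp = *src++;
        size_t need;

        if (cp >= 0xD800 && cp <= 0xDBFF) {
            /* High surrogate: a low surrogate must follow. */
            if (*src >= 0xDC00 && *src <= 0xDFFF) {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (*src++ - 0xDC00);
            } else {
                /* Unpaired high surrogate, or input ended early. */
                cp = 0xFFFD;
                status = CONV_ILLEGAL;
            }
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            /* Low surrogate without a preceding high surrogate. */
            cp = 0xFFFD;
            status = CONV_ILLEGAL;
        }

        need = cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
        if (out + need >= dst_size) {   /* keep one byte for the NUL */
            status = CONV_TRUNCATED;
            break;
        }

        switch (need) {
        case 1:
            dst[out++] = (unsigned char)cp;
            break;
        case 2:
            dst[out++] = (unsigned char)(0xC0 | (cp >> 6));
            dst[out++] = (unsigned char)(0x80 | (cp & 0x3F));
            break;
        case 3:
            dst[out++] = (unsigned char)(0xE0 | (cp >> 12));
            dst[out++] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
            dst[out++] = (unsigned char)(0x80 | (cp & 0x3F));
            break;
        default:
            dst[out++] = (unsigned char)(0xF0 | (cp >> 18));
            dst[out++] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
            dst[out++] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
            dst[out++] = (unsigned char)(0x80 | (cp & 0x3F));
            break;
        }
    }

    dst[out] = 0;
    return status;
}
```

A caller that "knows" its UTF-16 is valid could treat any status other than CONV_OK as a bug to investigate, which matches the intent of the warnings in the excerpt while leaving the decision about logging to the caller.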