|
Message-ID: <6641e407a08ffc060f38e2efdab7e4bf@smtp.hushmail.com> Date: Wed, 09 May 2012 02:13:11 +0200 From: magnum <john.magnum@...hmail.com> To: john-dev@...ts.openwall.com Subject: Re: New RAR OpenCL kernel OK, then I might have got equivalent code now. I made 326 test rar files and used a threshold of 3. All but 9 was cracked. Raised threshold to 4 and got 4 more. Raised to 5 and got all but one. That last one need a threshold of 10. All these were pretty small files (*.[ch] from john's src dir, the worst file was md5_eq.c rar'ed with -ptest) magnum On 05/08/2012 09:26 AM, Milen Rangelov wrote: > I've never seen more than 3 of them in a row for a correct stream, yet > having 2 or 3 in a row is quite common. I am doing all my checks in > unpack29 though. > > On Mon, May 7, 2012 at 10:00 PM, magnum <john.magnum@...hmail.com> wrote: > >> On 05/07/2012 09:27 AM, Milen Rangelov wrote: >>> I think we can narrow the case further (that's what I did, still not sure >>> it's completely safe). >>> >>> To understand why a bad stream returns long series of same values, you >> need >>> to have a look at the end of rar_decode_number() function. You can do one >>> more stricter check: if rar_decode_number() returns repeatedly >>> decode->DecodeNum[0], then it's "likely" a bad stream. Still I think it >>> might happen for some valid stream too. But I guess there must be some >>> threshold that can be considered relatively safe. >> >> I totally missed that one. Even if decode->DecodeNum[0] is valid several >> times, I would *guess* this code is safe: >> >> @@ -387,7 +387,7 @@ int rar_decode_number(unpack_data_t *unpack_data, >> struct Decode *decode) >> rar_addbits(unpack_data, bits); >> >> >> n=decode->DecodePos[bits]+((bit_field-decode->DecodeLen[bits-1])>>(16-bits)); >> if (n >= decode->MaxNum) { >> - n=0; >> + return -1; >> } >> rar_dbgmsg("rar_decode_number return(%d)\n", decode->DecodeNum[n]); >> >> ...and add code in read_tables() and rar_unpack29() so this -1 >> propagates to an immediate bailout. Seems to work fine but this alone >> doesn't help performance as much as the threshold for repeated numbers. >> BTW forget what I wrote seeing 26 of them, I did not reset the counter >> properly. Besides, it depends on how/where you count them. I had the >> check in read_tables() but it should be inside rar_decode_number() >> itself (again returning -1 for bailout) because there are more callers. >> >> magnum >> >> >>> I think with ZIP we have somewhat similar situation - it is possible >>> (though very unlikely) that for a small file the verifier matches, file >> is >>> inflated correctly and checksum matches, yet password candidate is wrong. >>> >>> >>> >>> On Mon, May 7, 2012 at 2:59 AM, magnum <john.magnum@...hmail.com> wrote: >>> >>>> On 05/05/2012 02:09 AM, Milen Rangelov wrote: >>>>> Well, there are two cases here, a PPM block or a LZ block. The PPM case >>>>> would very quickly give you an error if stream is invallid and unpack29 >>>>> would switch back to the pure LZSS case. >>>> >>>> Do you mean you modified something in unpack29 for this case? When I >>>> debugged it I got the impression it already rejects properly in this >> case. >>>> >>>>> Then for Lempel-Ziv it's highly >>>>> unlikely that you'd get the same result from rar_decode_number() >> several >>>>> times at a row unless the stream is bad. The key point here being >>>> "several" >>>>> and that's where my concerns are. >>>> >>>> Nice. I tested a little and I've seen the same result at most 26 times >>>> in a row for a valid stream while bad streams can have several hundreds. >>>> The question is what threshold would be safe... >>>> >>>> I think I'll have a look at Jim's pkzip code again, maybe something in >>>> there is usable for the LZ case. >>>> >>>> magnum >>>> >>>> >>>> >>>>> On Sat, May 5, 2012 at 2:45 AM, magnum <john.magnum@...hmail.com> >> wrote: >>>>> >>>>>> On 05/04/2012 08:19 PM, Milen Rangelov wrote: >>>>>>> OK, my tests are all successful until now. Speed is relatively the >>>> same, >>>>>>> doesn't matter whether the compressed file is a 4KB text file, or a >>>> 50MB >>>>>>> avi movie :) Unfortunately there is still overhead as compared to the >>>> HE >>>>>>> case, speed being somewhat 20% slower. I am still not 100% whether >> this >>>>>>> would hold true in all cases and if I am wrong, it might happen that >>>> for >>>>>>> some archives we can get false negatives. >>>>>>> >>>>>>> I am not yet 100% convinced I am correct yet though. The approach is >>>>>> based >>>>>>> on 2 assumptions, first one can easily be proved, the second one is >>>>>> related >>>>>>> to the LZ algorithm and I cannot yet prove what I check is safe and >>>> can't >>>>>>> occur in a valid LZ stream. >>>>>> >>>>>> This is good news Milen, just what I hoped for. I am SURE there are >> such >>>>>> short-cuts because cRARk runs full speed no matter what you throw at >> it. >>>>>> Please do share your findings, you know I would. >>>>>> >>>>>> magnum
Powered by blists - more mailing lists
Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.