|
Message-ID: <CABh=JRHqTjYd8fxdx+MCd-voOwkgPAoZtfpbNzvUN=fCewB0Zg@mail.gmail.com>
Date: Tue, 8 May 2012 10:26:48 +0300
From: Milen Rangelov <gat3way@...il.com>
To: john-dev@...ts.openwall.com
Subject: Re: New RAR OpenCL kernel
I've never seen more than 3 of them in a row for a correct stream, yet
having 2 or 3 in a row is quite common. I am doing all my checks in
unpack29 though.
On Mon, May 7, 2012 at 10:00 PM, magnum <john.magnum@...hmail.com> wrote:
> On 05/07/2012 09:27 AM, Milen Rangelov wrote:
> > I think we can narrow the case further (that's what I did, still not sure
> > it's completely safe).
> >
> > To understand why a bad stream returns long series of same values, you
> need
> > to have a look at the end of rar_decode_number() function. You can do one
> > more stricter check: if rar_decode_number() returns repeatedly
> > decode->DecodeNum[0], then it's "likely" a bad stream. Still I think it
> > might happen for some valid stream too. But I guess there must be some
> > threshold that can be considered relatively safe.
>
> I totally missed that one. Even if decode->DecodeNum[0] is valid several
> times, I would *guess* this code is safe:
>
> @@ -387,7 +387,7 @@ int rar_decode_number(unpack_data_t *unpack_data,
> struct Decode *decode)
> rar_addbits(unpack_data, bits);
>
>
> n=decode->DecodePos[bits]+((bit_field-decode->DecodeLen[bits-1])>>(16-bits));
> if (n >= decode->MaxNum) {
> - n=0;
> + return -1;
> }
> rar_dbgmsg("rar_decode_number return(%d)\n", decode->DecodeNum[n]);
>
> ...and add code in read_tables() and rar_unpack29() so this -1
> propagates to an immediate bailout. Seems to work fine but this alone
> doesn't help performance as much as the threshold for repeated numbers.
> BTW forget what I wrote seeing 26 of them, I did not reset the counter
> properly. Besides, it depends on how/where you count them. I had the
> check in read_tables() but it should be inside rar_decode_number()
> itself (again returning -1 for bailout) because there are more callers.
>
> magnum
>
>
> > I think with ZIP we have somewhat similar situation - it is possible
> > (though very unlikely) that for a small file the verifier matches, file
> is
> > inflated correctly and checksum matches, yet password candidate is wrong.
> >
> >
> >
> > On Mon, May 7, 2012 at 2:59 AM, magnum <john.magnum@...hmail.com> wrote:
> >
> >> On 05/05/2012 02:09 AM, Milen Rangelov wrote:
> >>> Well, there are two cases here, a PPM block or a LZ block. The PPM case
> >>> would very quickly give you an error if stream is invallid and unpack29
> >>> would switch back to the pure LZSS case.
> >>
> >> Do you mean you modified something in unpack29 for this case? When I
> >> debugged it I got the impression it already rejects properly in this
> case.
> >>
> >>> Then for Lempel-Ziv it's highly
> >>> unlikely that you'd get the same result from rar_decode_number()
> several
> >>> times at a row unless the stream is bad. The key point here being
> >> "several"
> >>> and that's where my concerns are.
> >>
> >> Nice. I tested a little and I've seen the same result at most 26 times
> >> in a row for a valid stream while bad streams can have several hundreds.
> >> The question is what threshold would be safe...
> >>
> >> I think I'll have a look at Jim's pkzip code again, maybe something in
> >> there is usable for the LZ case.
> >>
> >> magnum
> >>
> >>
> >>
> >>> On Sat, May 5, 2012 at 2:45 AM, magnum <john.magnum@...hmail.com>
> wrote:
> >>>
> >>>> On 05/04/2012 08:19 PM, Milen Rangelov wrote:
> >>>>> OK, my tests are all successful until now. Speed is relatively the
> >> same,
> >>>>> doesn't matter whether the compressed file is a 4KB text file, or a
> >> 50MB
> >>>>> avi movie :) Unfortunately there is still overhead as compared to the
> >> HE
> >>>>> case, speed being somewhat 20% slower. I am still not 100% whether
> this
> >>>>> would hold true in all cases and if I am wrong, it might happen that
> >> for
> >>>>> some archives we can get false negatives.
> >>>>>
> >>>>> I am not yet 100% convinced I am correct yet though. The approach is
> >>>> based
> >>>>> on 2 assumptions, first one can easily be proved, the second one is
> >>>> related
> >>>>> to the LZ algorithm and I cannot yet prove what I check is safe and
> >> can't
> >>>>> occur in a valid LZ stream.
> >>>>
> >>>> This is good news Milen, just what I hoped for. I am SURE there are
> such
> >>>> short-cuts because cRARk runs full speed no matter what you throw at
> it.
> >>>> Please do share your findings, you know I would.
> >>>>
> >>>> magnum
> >>>>
> >>>>
> >>>
> >>
> >>
> >>
> >
>
>
>
Content of type "text/html" skipped
Powered by blists - more mailing lists
Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.