Message-ID: <20150906135244.GA2258@openwall.com>
Date: Sun, 6 Sep 2015 16:52:44 +0300
From: Solar Designer <solar@...nwall.com>
To: john-dev@...ts.openwall.com
Subject: Re: md5crypt-opencl

On Fri, Sep 04, 2015 at 10:43:54AM +0300, Solar Designer wrote:
> Another guess was that byte-sized accesses were causing the array to be
> placed in global memory, due to possible unavailability of such access
> modes for VGPRs (I don't recall whether this is the case or not).
> However, I've also since ruled this out (at least as the only cause), by
> changing the kernel such that there was no longer a single byte-sized
> access left in the generated ISA code.

Despite the above, now that I am playing with a heavily cut-down (and thus
non-working) kernel, I got it to a point where it uses scratch memory with:

	uint * string;
	[...]
	for (i = 0; i < 16; i++)
		ctx->buffer[i] = 0;
	for (i = 3; i < 64; i += 4)
		ctx->buffer[i / 4] |= (string[i / 4] >> ((i & 3) * 8)) & 0xff;

and implements the last of these loops with stores/loads.  However, it
does not use scratch memory with that loop changed to:

	for (i = 0; i < 64; i++)
		ctx->buffer[i / 4] |= (string[i / 4] >> ((i & 3) * 8)) & 0xff;

and implements it with shifts and masks.
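
For reference, here is a minimal host-side sketch in plain C of what the
two loop variants above compute.  The struct md5_ctx_sketch and both
function names are hypothetical stand-ins, not taken from
cryptmd5_kernel.cl, and running this on a CPU says nothing about where a
GPU compiler would place the array; it only makes the arithmetic of the
cut-down (non-working) loops explicit.

```c
#include <stdint.h>

/* Hypothetical stand-in for the kernel's context: only the field used above. */
typedef struct {
	uint32_t buffer[16];
} md5_ctx_sketch;

/* First variant from the post: visits only byte indexes 3, 7, 11, ..., 63. */
static void fill_stride4(md5_ctx_sketch *ctx, const uint32_t *string)
{
	unsigned int i;

	for (i = 0; i < 16; i++)
		ctx->buffer[i] = 0;
	for (i = 3; i < 64; i += 4)
		ctx->buffer[i / 4] |= (string[i / 4] >> ((i & 3) * 8)) & 0xff;
}

/* Second variant from the post: visits every byte index 0..63. */
static void fill_every_byte(md5_ctx_sketch *ctx, const uint32_t *string)
{
	unsigned int i;

	for (i = 0; i < 16; i++)
		ctx->buffer[i] = 0;
	for (i = 0; i < 64; i++)
		ctx->buffer[i / 4] |= (string[i / 4] >> ((i & 3) * 8)) & 0xff;
}
```

With string[0] = 0x04030201, the first variant leaves 0x04 (the top byte)
in buffer[0], while the second leaves 0x07, the OR of all four bytes in
the low byte position.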

Testing on a separate, even more heavily cut-down kernel, I determined
that it is in fact possible to have an array of almost 1 KB in VGPRs,
with no uses of scratch memory.  Problems arise when we do anything that
looks like byte-sized accesses, even when those are written with shifts
and masks in the source.
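
To illustrate what "byte-sized accesses written with shifts and masks"
can look like at the source level, here is a small sketch in plain C.
The helper names get_byte_sketch and put_byte_sketch are made up for
this example and are not the macros used in our kernels; byte i is
assumed to live at bits (i & 3) * 8 of word i / 4.

```c
#include <stdint.h>

/* Read byte i of a 32-bit word array using only a shift and a mask. */
static uint32_t get_byte_sketch(const uint32_t *buf, unsigned int i)
{
	return (buf[i / 4] >> ((i & 3) * 8)) & 0xff;
}

/* Write byte i of a 32-bit word array: clear the old byte, then OR in the new one. */
static void put_byte_sketch(uint32_t *buf, unsigned int i, uint32_t val)
{
	unsigned int shift = (i & 3) * 8;

	buf[i / 4] = (buf[i / 4] & ~((uint32_t)0xff << shift)) |
	    ((val & 0xff) << shift);
}
```

No byte-typed pointer appears anywhere here, yet the access pattern is
still per-byte, which is the kind of code the observation above refers to.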

Perhaps in our cryptmd5_kernel.cl there are multiple reasons why the
compiler prefers to use scratch memory.

Alexander
