Message-ID: <20120917030241.GH254@brightrain.aerifal.cx>
Date: Sun, 16 Sep 2012 23:02:41 -0400
From: Rich Felker <dalias@...ifal.cx>
To: musl@...ts.openwall.com
Subject: Re: musl 0.9.5 release and new website

On Sun, Sep 16, 2012 at 11:42:08PM +0200, Szabolcs Nagy wrote:
> * Rich Felker <dalias@...ifal.cx> [2012-09-15 23:29:31 -0400]:
> > On Sat, Sep 15, 2012 at 03:53:41PM +0200, Szabolcs Nagy wrote:
> > > 	while (i < 16) {
> > > 		FF(a,b,c,d, W[i],  7, tab[i]); i++;
> > > 		FF(d,a,b,c, W[i], 12, tab[i]); i++;
> > > 		FF(c,d,a,b, W[i], 17, tab[i]); i++;
> > > 		FF(b,c,d,a, W[i], 22, tab[i]); i++;
> > > 	}
> > 
> > This is more of the same old ugly manual unrolling. The file is small
> > as-is, but I think it could be a lot smaller with -Os (and same speed
> > as now with -O3) if the manual unrolling were removed.
> > 
> 
> ok i removed the unrolling, the difference is about 200 bytes

Thanks. Unfortunately it's 10% slower at -O3 (and about 20% slower at
-Os), but since there doesn't seem to be any way to configure rounds,
crypt_md5 performance is probably mostly irrelevant.

> is the 30K key limit reasonable?

I don't know; can you explain the motivation?

> -#define FF(a,b,c,d,w,s,t) a += F(b,c,d) + w + t; a = rol(a,s) + b
> -#define GG(a,b,c,d,w,s,t) a += G(b,c,d) + w + t; a = rol(a,s) + b
> -#define HH(a,b,c,d,w,s,t) a += H(b,c,d) + w + t; a = rol(a,s) + b
> -#define II(a,b,c,d,w,s,t) a += I(b,c,d) + w + t; a = rol(a,s) + b
> +#define FF(a,b,c,d,w,r,t) a += F(b,c,d) + w + t; a = rol(a,r) + b
> +#define GG(a,b,c,d,w,r,t) a += G(b,c,d) + w + t; a = rol(a,r) + b
> +#define HH(a,b,c,d,w,r,t) a += H(b,c,d) + w + t; a = rol(a,r) + b
> +#define II(a,b,c,d,w,r,t) a += I(b,c,d) + w + t; a = rol(a,r) + b

Is this changing anything but the argument name? Why the change?

> +static const uint8_t idx[64] = {
> +0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,
> +1,6,11,0,5,10,15,4,9,14,3,8,13,2,7,12,
> +5,8,11,14,1,4,7,10,13,0,3,6,9,12,15,2,
> +0,7,14,5,12,3,10,1,8,15,6,13,4,11,2,9
> +};
> +static const uint8_t rot[64] = {
> +7,12,17,22,7,12,17,22,7,12,17,22,7,12,17,22,
> +5,9,14,20,5,9,14,20,5,9,14,20,5,9,14,20,
> +4,11,16,23,4,11,16,23,4,11,16,23,4,11,16,23,
> +6,10,15,21,6,10,15,21,6,10,15,21,6,10,15,21

It would be nice if these could be done without tables. As-is, I'm not
really sure the de-unrolled code is all that much cleaner than the
original, but at least it's slightly smaller...
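
A rough sketch of one way the two 64-entry tables might be avoided,
not a tested replacement for the patch. The names md5_idx, md5_rot and
check_against_tables are made up for illustration. The message index
in each round follows the usual MD5 schedule (i, 5i+1, 3i+5, 7i, all
mod 16), and the rotation amounts repeat with period 4 within a round,
so 16 bytes are enough for them. This still uses small tables, so it
shrinks the data rather than removing it, and it says nothing about
whether the result is faster or slower.

```c
#include <assert.h>
#include <stdint.h>

/* Message-word index for step i (0..63). Round r = i/16 uses
 * (mul[r]*i + add[r]) mod 16. */
static unsigned md5_idx(unsigned i)
{
	static const uint8_t mul[4] = { 1, 5, 3, 7 };
	static const uint8_t add[4] = { 0, 1, 5, 0 };
	unsigned r = i / 16;
	return (mul[r] * i + add[r]) % 16;
}

/* Rotation amount for step i (0..63): four values per round,
 * repeating every four steps. */
static unsigned md5_rot(unsigned i)
{
	static const uint8_t rot16[16] = {
		7, 12, 17, 22,
		5,  9, 14, 20,
		4, 11, 16, 23,
		6, 10, 15, 21
	};
	return rot16[(i / 16) * 4 + i % 4];
}

/* Compare against the full tables quoted from the patch above. */
static void check_against_tables(void)
{
	static const uint8_t idx[64] = {
	0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,
	1,6,11,0,5,10,15,4,9,14,3,8,13,2,7,12,
	5,8,11,14,1,4,7,10,13,0,3,6,9,12,15,2,
	0,7,14,5,12,3,10,1,8,15,6,13,4,11,2,9
	};
	static const uint8_t rot[64] = {
	7,12,17,22,7,12,17,22,7,12,17,22,7,12,17,22,
	5,9,14,20,5,9,14,20,5,9,14,20,5,9,14,20,
	4,11,16,23,4,11,16,23,4,11,16,23,4,11,16,23,
	6,10,15,21,6,10,15,21,6,10,15,21,6,10,15,21
	};
	unsigned i;
	for (i = 0; i < 64; i++) {
		assert(md5_idx(i) == idx[i]);
		assert(md5_rot(i) == rot[i]);
	}
}
```

In the loop, W[idx[i]] and rot[i] would become W[md5_idx(i)] and
md5_rot(i). Whether the extra arithmetic is worth the ~100 bytes of
data saved would have to be measured.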

Rich
