Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20120809182204.GS27715@brightrain.aerifal.cx>
Date: Thu, 9 Aug 2012 14:22:05 -0400
From: Rich Felker <dalias@...ifal.cx>
To: musl@...ts.openwall.com
Subject: Re: crypt* files in crypt directory

On Thu, Aug 09, 2012 at 08:43:55PM +0400, Solar Designer wrote:
> On Thu, Aug 09, 2012 at 01:58:12PM +0200, Szabolcs Nagy wrote:
> > it's scary how these crypto codes mix various
> > signed and unsigned integer types
> 
> Yes, there's room for improvement here.  I'd probably write this
> differently if I were writing it from scratch now.  Perhaps a cleanup is
> in order, similarly to what we did for the FreeSec code recently.
> Specifically, we could have a higher-level wrapper function accept char *
> like crypt() is defined to accept, but pass these to a function accepting
> unsigned char * and go with that from this point.

I actually like this idea a lot. It should go a long way to clarify
the code and make it obvious to readers that no further signed-char
sign extension bugs exist.

> > > typedef unsigned int BF_word;
> > > typedef signed int BF_word_signed;
> > 
> > i don't like these typedefs
> > it seems the code assumes 32bit unsigned BF_word
> 
> Yes.  As you suggested while we were discussed FreeSec, in musl's
> revision of the code we should use <stdint.h> types here.  I made those
> changes to that code, but not yet to crypt_blowfish code - I think Rich
> will do that.

Yes, I think this change should be made, for clarity if nothing else.
It's non-obvious reading the code that BF_word is intended to be a
fixed-size 32-bit type (i.e. that arithmetic on it is intended to be
modulo 2^32).

> > tmp1 = ctx->S[3][L & 0xff];
> > tmp2 = ctx->S[2][L>>8 & 0xff];
> > tmp3 = ctx->S[1][L>>16 & 0xff];
> > tmp4 = ctx->S[0][L>>24 & 0xff];
> > R ^= ctx->P[N+1];
> > R ^= ((tmp3 + tmp4) ^ tmp2) + tmp1;
> 
> I think this rewrite is likely to cause a slowdown with some compilers.

When we're done with everything else, I'd like to see a comparison
with my version of the code (no explicit scheduling, straightforward
expression with no temps) versus the original on modern gcc (and
possibly clang). I would expect them to perform identically, in which
case I'd choose code clarity. I like C that reads like C rather than
reading like decompiled asm.. :-)

> > > 	do {
> > > 		ptr += 2;
> > > 		L ^= ctx->s.P[0];
> > > 		BF_ROUND(L, R, 0);
> > > 		BF_ROUND(R, L, 1);
> > > 		BF_ROUND(L, R, 2);
> > > 		BF_ROUND(R, L, 3);
> > > 		BF_ROUND(L, R, 4);
> > > 		BF_ROUND(R, L, 5);
> > > 		BF_ROUND(L, R, 6);
> > > 		BF_ROUND(R, L, 7);
> > > 		BF_ROUND(L, R, 8);
> > > 		BF_ROUND(R, L, 9);
> > > 		BF_ROUND(L, R, 10);
> > > 		BF_ROUND(R, L, 11);
> > > 		BF_ROUND(L, R, 12);
> > > 		BF_ROUND(R, L, 13);
> > > 		BF_ROUND(L, R, 14);
> > > 		BF_ROUND(R, L, 15);
> > > 		tmp4 = R;
> > > 		R = L;
> > > 		L = tmp4 ^ ctx->s.P[BF_N + 1];
> > > 		*(ptr - 1) = R;
> > > 		*(ptr - 2) = L;
> > > 	} while (ptr < end);
> > 
> > why increase ptr at the begining?
> 
> To hide the latency.  Indeed, a smart compiler should do this on its
> own, but not all do it well.

I suspect your version will generate worse code (both larger and
slower) on cpus like arm that can increment the index/pointer register
and dereference it at the same time.

> > > 			tmp[1] |= (BF_word_signed)(signed char)*ptr; /* bug */
> > 
> > i don't think the BF_word_signed cast helps here
> > signed char is promoted to int and then
> > it will be converted to BF_word
> 
> Yes, I just like it explicit to show what exactly is happening, because
> it matters a lot.

The cast is not the same as what's happening implicitly.

XXX

> 
> > > 	if (setting[0] != '$' ||
> > > 	    setting[1] != '2' ||
> > > 	    setting[2] < 'a' || setting[2] > 'z' ||
> > > 	    !flags_by_subtype[(unsigned int)(unsigned char)setting[2] - 'a'] ||
> > 
> > setting[2] is already range checked so
> > the casts are not necessary
> > (assuming 'a' < 'z')
> 
> Casting chars to (unsigned int)(unsigned char) for array indices is an
> idiom for me.  Now, we have a promotion to (int) due to "- 'a'", but if
> this is ever compiled with a C++ compiler, the 'a' constant would be of
> type char rather than int, IIRC.  So better safe than sorry.

No, all arithmetic and logical operators are subject to default
promotions in both C and C++. 'b'-'a' has type int in both languages
despite the fact that 'a' and 'b' have type char in C++ (and int in
C). The only possible function the (unsigned int) cast can have is
forcing promotion of another value in the expressione from int to
unsigned int.

> Yes, the range checking should have helped in this case, but I recall
> seeing weird behavior with in-range (implies non-negative) char
> subscripts on Alpha in 1990s before I started using this idiom.  IIRC,
> instructions were generated such that high bytes of the array index
> register were not cleared before the lookup.  Compiler bug?  But it was
> seen with different compilers.

I suspect you're thinking of something different. There's definitely
no way char types could be usable at all if the high bytes of
registers loaded from chars were full of random junk. This compiler
bug sounds too serious to be plausible.

> > i'd use for (;;) for infinite loops
> 
> I saw Rich do that, but I kept the infinite loops in this file I posted
> consistent with those I use elsewhere for now.  Rich can musl'ify this
> code in this way if desired.

I just did it when removing the while condition from do/while loops,
not gratuitously changing existing infinite loops. I didn't even think
of the possibility of leaving it as do/while(1) because I rarely/never
use that idiom for infinite loops. I don't care which is used though.

> > for (count = 0; count < 64; count++)
> > 
> > i assume it will be optimized to the same thing
> > (count is not used later)
> 
> I hope/expect that it will be optimized to the same thing, but I am not
> sure it'd happen with all relevant compilers all the time.  I like
> writing performance-critical code closer to the desired assembly code.

This is a simple strength reduction optimization and every gcc version
since 2.7.2 or so has done it, if I'm not mistaken. I don't care which
way it's written though; either is clear and idiomatic C.

> > if it does not fail then errno is unchanged i guess
> 
> Yes, this should be the case currently.  I just did not want to have a
> pitfall in there in case of future changes to BF_crypt() where errno
> could potentially be altered on a successful call.

In principle this could happen if arbitrary library functions are
being called, but in practice it shouldn't. Anyway EINVAL is the only
error we support here, so it could just be set just before return if
an error was encountered. Or if we return fake hashes that can never
match instead of null on error, there's no sense in ever setting
errno to begin with.

> > > 	memcpy(buf.s, test_setting, sizeof(buf.s));
> > 
> > sizeof buf.s
> > without ()
> 
> I am consistently using sizeof() with the braces.  Maybe musl's coding
> style is different.  If so, this may be changed indeed.

I don't care if it's inconsistent in code that was written outside of
musl, but the idiom I use is to only include parentheses for types.
Aside from my dislike for gratuitous parens, this is to make it clear
that we're taking the size of an object rather than the size of a
type.

> > > 	    test_hash[(unsigned int)(unsigned char)buf.s[2] & 1],
> > 
> > the extra (unsigned int) cast is unnecessary
> 
> Yes, we could break the idiom and rely on the promotion to int here.
> (Without such promotion or cast, IIRC, things would sometimes fail on
> Alpha.)

I smell cargo culting...

Rich

Powered by blists - more mailing lists

Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.