1994-09-14 - Re: alleged-RC4

Header Data

From: Bill Sommerfeld <sommerfeld@orchard.medford.ma.us>
To: Hal <hfinney@shell.portal.com>
Message Hash: 00a94b0db10c46941302d6101f7dbf8f490bdebd501224cf0ac5d20487d282c1
Message ID: <199409141503.LAA00499@orchard.medford.ma.us>
Reply To: <199409131806.LAA05147@jobe.shell.portal.com>
UTC Datetime: 1994-09-14 15:17:12 UTC
Raw Date: Wed, 14 Sep 94 08:17:12 PDT

Raw message

From: Bill Sommerfeld <sommerfeld@orchard.medford.ma.us>
Date: Wed, 14 Sep 94 08:17:12 PDT
To: Hal <hfinney@shell.portal.com>
Subject: Re: alleged-RC4
In-Reply-To: <199409131806.LAA05147@jobe.shell.portal.com>
Message-ID: <199409141503.LAA00499@orchard.medford.ma.us>
MIME-Version: 1.0
Content-Type: text/plain


Actually, in looking at the assembly code generated by three different
compilers (GCC on i386, GCC on PA, and HP's PA compiler), strangely
enough, the `% 256' should be `& 0xff' (it shaves a few instructions
off the inner loop for some reason which isn't immediately apparent to
me..).
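
One possible explanation, offered as a guess that the message does not
confirm: if the index being reduced is a signed `int', `% 256' has to
produce a negative result for negative inputs, so the compiler emits
extra fix-up instructions, while `& 0xff' is always a single mask. With
an unsigned operand the two are the same function. A minimal sketch,
with hypothetical helper names:

```c
#include <stdint.h>

/* Hypothetical helpers for illustration; not taken from the posted code. */

/* Signed operand: % truncates toward zero (C99 and most older compilers),
   so negative inputs give negative results and a plain mask is not enough. */
int wrap_mod_signed(int x)
{
    return x % 256;
}

/* Unsigned operand: % 256 and & 0xff compute the same value. */
unsigned wrap_mod_unsigned(unsigned x)
{
    return x % 256;
}

/* The replacement suggested in the message: a single AND. */
unsigned wrap_mask(unsigned x)
{
    return x & 0xff;
}
```

Whether this accounts for the instructions saved depends on how the
original variables were declared, which the message does not say.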

On the PA, I got a ~30% speedup by unrolling the inner loop 4x,
assembling the pad into an `unsigned long', and doing one 4-byte-wide
XOR with the user data.  I think most of the speedup comes from giving
the instruction scheduler more instructions to reorder to avoid
load-store conflicts.  Your mileage will vary on other architectures.
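
The unrolling described above could look roughly like the sketch below.
It is not the code that was measured: all names (`rc4_state',
`rc4_init', `rc4_crypt_unrolled') are hypothetical, `uint32_t' stands in
for the 32-bit `unsigned long' mentioned in the message, and `memcpy' is
used for the word load and store so that the sketch makes no alignment
or byte-order assumptions.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical state layout for the alleged-RC4 cipher. */
typedef struct {
    uint8_t s[256];
    uint8_t i, j;
} rc4_state;

/* Key setup: fill the table, then mix in the key. */
void rc4_init(rc4_state *st, const uint8_t *key, size_t keylen)
{
    unsigned i, j = 0;
    for (i = 0; i < 256; i++)
        st->s[i] = (uint8_t)i;
    for (i = 0; i < 256; i++) {
        uint8_t t = st->s[i];
        j = (j + t + key[i % keylen]) & 0xff;
        st->s[i] = st->s[j];
        st->s[j] = t;
    }
    st->i = st->j = 0;
}

/* One keystream byte; indices wrap with & 0xff rather than % 256. */
static uint8_t rc4_next(rc4_state *st)
{
    unsigned i = (st->i + 1u) & 0xff;
    uint8_t a = st->s[i];
    unsigned j = (st->j + a) & 0xff;
    uint8_t b = st->s[j];
    st->s[i] = b;               /* swap s[i] and s[j] */
    st->s[j] = a;
    st->i = (uint8_t)i;
    st->j = (uint8_t)j;
    return st->s[(a + b) & 0xff];
}

/* Straightforward byte-at-a-time version, kept for comparison. */
void rc4_crypt_bytewise(rc4_state *st, uint8_t *buf, size_t len)
{
    size_t n;
    for (n = 0; n < len; n++)
        buf[n] ^= rc4_next(st);
}

/* Unrolled 4x: build four pad bytes, then do one 4-byte-wide XOR. */
void rc4_crypt_unrolled(rc4_state *st, uint8_t *buf, size_t len)
{
    while (len >= 4) {
        uint8_t pad[4];
        uint32_t k, w;
        pad[0] = rc4_next(st);
        pad[1] = rc4_next(st);
        pad[2] = rc4_next(st);
        pad[3] = rc4_next(st);
        memcpy(&k, pad, 4);     /* assemble the pad into one word */
        memcpy(&w, buf, 4);     /* load four bytes of user data */
        w ^= k;
        memcpy(buf, &w, 4);
        buf += 4;
        len -= 4;
    }
    while (len-- > 0)           /* leftover tail, one byte at a time */
        *buf++ ^= rc4_next(st);
}
```

Both routines produce identical output for the same key and data; any
speed difference between them depends on the compiler and the target
machine.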

					- Bill





Thread