1996-02-01 - Re: parallel encryption

Header Data

From: Matthew Ghio <ghio@netcom.com>
To: cypherpunks@toad.com
Message Hash: b8759b95fac7b1ba02d81481610cdd86078cf918d82a75fbdbfff701d806146b
Message ID: <199602010608.WAA03210@myriad>
Reply To: <Pine.SUN.3.91.960131051351.13875A-100000@tipper.oit.unc.edu>
UTC Datetime: 1996-02-01 06:26:43 UTC
Raw Date: Thu, 1 Feb 1996 14:26:43 +0800

Raw message

From: Matthew Ghio <ghio@netcom.com>
Date: Thu, 1 Feb 1996 14:26:43 +0800
To: cypherpunks@toad.com
Subject: Re: parallel encryption
In-Reply-To: <Pine.SUN.3.91.960131051351.13875A-100000@tipper.oit.unc.edu>
Message-ID: <199602010608.WAA03210@myriad>
MIME-Version: 1.0
Content-Type: text/plain


ses@tipper.oit.unc.edu (Simon Spero) wrote:
> Ok, so I've got my BeBox and so finally have an SMP of my own again;
> anyone want to suggest any cool crypto stuff that parallelises well?
> Rogaways hashs look interesting, and nDES offers an obvious process
> network for pipelining, but what about things like running multiple
> interleaved CBC streams in parallel, with each stream starting off from a
> different IV? I can't think of any practical ways of speeding up a single
> RSA operation, although twice as many processors obviously gives twice
> the thruput.

One way to speed up RSA is to compute the series m^2, m^4, m^8, m^16...
on one processor and then multiply together the values for each one bit
in the decryption exponent on the other processor.  It's only about a 33%
speedup tho.  The other possibility is to compute the two 'halves' on
seperate processors when doing decryption.  I don't know of any way to
parallelize it to more than two processors for encrypting or more than
four for decrypting.

Discrete log systems are a bit more interesting in this respect - you can
precompute the series g^2, g^4, g^8...  (I think cryptolib does this)
then the initial parameter in a Diffie-Hellman exchange is simply the 
product of some elements of that series.  The multiplications can be
carried out in parallel in a hierarchial fashion which can be completed
in O(log(log(m))) time, where m is the modulus (assuming you have enough
processors).  However, for the second half of the exchange, you can't
precompute anything so you are stuck with the same problem as with RSA.





Thread