1996-10-18 - [CORRECTION] Re: DES cracker.

Header Data

From: Adam Back <aba@dcs.ex.ac.uk>
To: aba@dcs.ex.ac.uk
Message Hash: 8d0ceb18407da823f1ce31e2b3363c5d5e3539051078e21d89386ee3345dce62
Message ID: <199610180227.DAA00352@server.test.net>
Reply To: <199610171533.QAA00477@server.test.net>
UTC Datetime: 1996-10-18 17:09:23 UTC
Raw Date: Fri, 18 Oct 1996 10:09:23 -0700 (PDT)

Raw message

From: Adam Back <aba@dcs.ex.ac.uk>
Date: Fri, 18 Oct 1996 10:09:23 -0700 (PDT)
To: aba@dcs.ex.ac.uk
Subject: [CORRECTION] Re: DES cracker.
In-Reply-To: <199610171533.QAA00477@server.test.net>
Message-ID: <199610180227.DAA00352@server.test.net>
MIME-Version: 1.0
Content-Type: text/plain



Eric Young pointed out to me a factor of 8 error in my DES CBC block
timing calculations (I used megabytes, Eric's figures are in
megabytes, but Peter's are in mega _bits_; this fact escaped my
notice, though I must say I was very impressed with Peter's
optimisations :-).

Peter has since posted his cool keysched optimisations, which change
things also.

So here's the table revisited (extrapolated to a 133MHz Pentium from
Peter's 90MHz figures, to keep it in line with the previously posted
table):

keysched = 3.4us
DES cbc  = 7.3us per 8-byte block (3 blocks per key test = 21.9us)

n keys    |  key   |   DES  | elapsed for | elapsed per |
at a time |  sched |   cbc  | n key tests | key test    | keys/sec
-----------------------------------------------------------------------
     1       3.4us    21.9us      25.3us         25.3us    39.5k keys/s
     2       3.4us    43.8us      47.2us         23.6us    42.4k keys/s
     4       3.4us    87.6us      91.0us         22.7us    44.0k keys/s
     8       3.4us   175.2us     178.6us         22.3us    44.8k keys/s
    16       3.4us   350.4us     353.8us         22.1us    45.2k keys/s
    32       3.4us   700.8us     704.2us         22.0us    45.4k keys/s
    64       3.4us  1401.6us    1405.0us         22.0us    45.6k keys/s
   128       3.4us  2803.2us    2806.6us         21.9us    45.6k keys/s

So as you can see, this greatly reduces the gains to be made from
testing multiple keys at a time.  It's not worth doing more than 64
keys, and even 64 keys buys only a 15% increase in keys/sec.
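
For the curious, here's a small C fragment that reproduces the table,
on the assumption the table embodies: one keysched charge per batch of
n keys, plus 3 CBC blocks (3 x 7.3us = 21.9us) per key test.  This is
just my arithmetic spelled out, not anyone's actual search loop:

#include <stdio.h>

/* Reproduce the timing table above: one key schedule charge per
 * batch of n keys, plus 3 DES CBC blocks per key tested.  Figures
 * are the extrapolated 133MHz Pentium timings. */
int main(void)
{
    const double keysched_us = 3.4;      /* key schedule, per batch   */
    const double cbc_us      = 3 * 7.3;  /* 3 CBC blocks per key test */

    for (int n = 1; n <= 128; n *= 2) {
        double elapsed = keysched_us + n * cbc_us;  /* us per batch */
        double per_key = elapsed / n;               /* us per key   */
        printf("%4d keys: %7.1fus elapsed, %5.1fus/key, %5.1fk keys/s\n",
               n, elapsed, per_key, 1000.0 / per_key);
    }
    return 0;
}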

When Peter adds what he calls the "glue" code in his paper (extra code
to move data, compare results, etc.), the advantage of multiple keys
may go down further.

Also, if the extra code for testing multiple keys pushes the code size
over the 8k L1 code cache, or the extra data pushes the working set
over the 8k L1 data cache, this may lose more than is gained.  The
extra code complexity will add a small amount of overhead too.

The data requirements for multiple keys aren't that high.  (The extra
data is number of keys x blocks required per test x 8-byte block size
= 64 x 3 x 8 = 1.5k, or 384 bytes if you restrict yourself to 16 keys
at once and lose the last 1% gain.)
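
A quick check that the block data stays well inside the Pentium's 8k
L1 data cache (counting only the test blocks at 3 x 8 bytes per key;
key schedules and code are extra, so this is an under-count):

#include <stdio.h>

/* Working set for the test data alone: n keys x 3 blocks x 8 bytes.
 * Key schedules and code are not counted here. */
int main(void)
{
    const int l1_data = 8 * 1024;  /* Pentium L1 data cache */

    for (int n = 16; n <= 128; n *= 2) {
        int bytes = n * 3 * 8;
        printf("%4d keys: %5d bytes of block data (%s 8k D-cache)\n",
               n, bytes, bytes < l1_data ? "fits in" : "exceeds");
    }
    return 0;
}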

Adam
--
print pack"C*",split/\D+/,`echo "16iII*o\U@{$/=$z;[(pop,pop,unpack"H*",<>
)]}\EsMsKsN0[lN*1lK[d2%Sa2/d0<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<J]dsJxp"|dc`




