1996-06-29 - rsync and md4

Header Data

From: Andrew Tridgell <tridge@arvidsjaur.anu.edu.au>
To: cypherpunks@toad.com
Message Hash: 593b8a98419ded1d39b512f671c841fa63633ddcd912201fbd7fbdd6c7c3c0d8
Message ID: <96Jun29.135037+1000est.65075-20848+164@arvidsjaur.anu.edu.au>
Reply To: N/A
UTC Datetime: 1996-06-29 06:45:31 UTC
Raw Date: Sat, 29 Jun 1996 14:45:31 +0800

Raw message

From: Andrew Tridgell <tridge@arvidsjaur.anu.edu.au>
Date: Sat, 29 Jun 1996 14:45:31 +0800
To: cypherpunks@toad.com
Subject: rsync and md4
Message-ID: <96Jun29.135037+1000est.65075-20848+164@arvidsjaur.anu.edu.au>
MIME-Version: 1.0
Content-Type: text/plain


I've recently released a package called rsync that uses a checksum
search to provide very efficient file update over a slow link.  (see
ftp://samba.anu.edu.au/pub/rsync if you are interested)

Now I'd like to calculate some probabilities of failure of the
algorithm. The fundamental thing I need to know to do the calculation
is the probability of a random piece of data of length n having the
same md4 checksum as another given piece of data of the same length.

A first guess might be 2^-128 but I know that this sort of thing is
rarely that simple. Is md4 that good?

Note that I am not interested in "attacks" on md4 as such as the
source of the random data is just another file provided by the same
user, so it won't have been specially designed to defeat md4. 

If the probability is within a few orders of magnitude of 2^-128 then
can I also be sure that if I only use the first b bits of a md4
checksum it will be within a few orders of magnitude of 2^-b ? There
is an option in rsync to use a shorter checksum by truncating
md4. This saves some bytes on the link at the risk of lowering the
confidence. 

Why md4? I chose md4 because it seemed to be the fastest of the
reputedly strong, publicly available checksum algorithms. Suggestions
for alternative algorithms are welcome.

Cheers, Andrew





Thread