1995-01-20 - Re: traffic analyzing Chaum’s digital mix

Header Data

From: Hal <hfinney@shell.portal.com>
To: cypherpunks@toad.com
Message Hash: b25ff75b171f02de54f8c5c48d19e60e6323605f75d285e055242ea60ba05b34
Message ID: <199501201624.IAA13926@jobe.shell.portal.com>
Reply To: N/A
UTC Datetime: 1995-01-20 16:24:38 UTC
Raw Date: Fri, 20 Jan 95 08:24:38 PST

Raw message

From: Hal <hfinney@shell.portal.com>
Date: Fri, 20 Jan 95 08:24:38 PST
To: cypherpunks@toad.com
Subject: Re:  traffic analyzing Chaum's digital mix
Message-ID: <199501201624.IAA13926@jobe.shell.portal.com>
MIME-Version: 1.0
Content-Type: text/plain


-----BEGIN PGP SIGNED MESSAGE-----

From: Wei Dai <weidai@eskimo.com>
> I have been thinking about the problem of traffic analysis of a 
> remailer.
> [...]
> The basic approach is to use this raw traffic information to calculate a 
> SCORE for each user of the remailer with respect to Alice, where the 
> user with the highest SCORE is the person Alice is most probably 
> communicating with.  The idea is that with a Chaumian mix, every time 
> Alice sends a message to Bob there is always a pattern of Alice sending 
> a message to the mix, followed by Bob receiving a message from the mix 
> during the next batch.  By counting the number of such correlations for 
> each user over a period of time, and taking into account the fact that 
> users who receive more messages from the mix will have higher numbers
> of coincidental correlations, a SCORE can be calculated so that it would 
> be a good indication over the long run of the probability that a particular 
> user is communicating with Alice.

This sounds like a good idea.  It was very interesting to see your
earlier result on the impact of dummy messages on this approach.  Even a
relatively small number of batches without dummy messages allows
continual accumulation of incriminating information.

I know that the Eurocrypt 89 proceedings had some articles on
cryptanalyzing Chaum's mixes.  My library has an excellent crypto
selection but is missing this volume.  Can anyone who has read this say
whether there is anything in those papers that isn't obvious?

Another interesting aspect of your analysis is the possible role of
latency.  Earlier I had thought of latency as primarily a way of doing
mixing, an alternative or addition to batching which mixes messages
without holding them up quite as much.  But in terms of this in/out
analysis latency could play a part in blurring the batch boundaries,
adding more uncertainty and making the job of the analyst harder so he
would need more data to establish his scores.

Hal

-----BEGIN PGP SIGNATURE-----
Version: 2.6

iQBVAwUBLx/jixnMLJtOy9MBAQGFzwH/diYW0NSddacKyXGvsBc53FsR47R+4BSS
pVprHz2LfpVl7U2FFAePMjZIGr5w24hA6nxn1brAO9v6JkVzgUabvA==
=Vehs
-----END PGP SIGNATURE-----





Thread