1996-05-29 - Statistical analysis of anonymous databases

Header Data

From: Clay.Olbon@dynetics.com (Clay Olbon II)
To: cypherpunks@toad.com
Message Hash: c8debcd2dffb20aeeb49e3f1435c6bab105b0887a19b9b5d8df0601afba21c71
Message ID: <v01540b02add1fc6e4658@[193.239.225.200]>
Reply To: N/A
UTC Datetime: 1996-05-29 18:19:05 UTC
Raw Date: Thu, 30 May 1996 02:19:05 +0800

Raw message

From: Clay.Olbon@dynetics.com (Clay Olbon II)
Date: Thu, 30 May 1996 02:19:05 +0800
To: cypherpunks@toad.com
Subject: Statistical analysis of anonymous databases
Message-ID: <v01540b02add1fc6e4658@[193.239.225.200]>
MIME-Version: 1.0
Content-Type: text/plain


I ran across an interesting problem on the STAT-L mailing list.  I came up
with an initial solution, but it didn't fully solve the problem.  I will
summarize:

In medical research (this particular application - there are others I am
sure) it is desirable to have a large database of individual medical
histories available to search for correlations, risk factors, etc.  The
problem, of course, is that many individuals want their medical histories
kept private.  It is therefore necessary to maintain a database that is not
traceable back to individuals.  A further requirement is that people
must be able to add new information to their records as it becomes
available.  The researcher who initially posed the question suggested
adding random data to "encrypt anonymity".

My first-cut solution was to hash the individual's name (perhaps including
some other info or random info to thwart dictionary attacks) and send the
records in under the hashed name.  If done correctly, this should keep the
record's label from revealing whose it is.  The problem is that, with the
volume of data available in a medical record, the contents alone would very
probably be enough to tie the record to a specific person.
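
A minimal sketch of the hashed-name idea and of the linkage problem it leaves
open.  Everything here is an assumption for illustration, not part of the
original proposal: the "random info" is modelled as a secret held by each
patient and used as an HMAC-SHA256 key, the database is an in-memory dict,
and the names `pseudonym`, `submit`, and `matching_records` as well as the
sample fields are hypothetical.

```python
import hashlib
import hmac
from collections import defaultdict


def pseudonym(name, secret):
    """Keyed hash of a normalized name.

    Assumption: `secret` is random data known only to the patient, so an
    attacker cannot rebuild the label from a list of candidate names.
    """
    normalized = " ".join(name.lower().split())
    return hmac.new(secret, normalized.encode("utf-8"), hashlib.sha256).hexdigest()


# Hypothetical research database: pseudonym -> list of submitted entries.
database = defaultdict(list)


def submit(name, secret, entry):
    """File an entry under the patient's pseudonym and return that pseudonym."""
    key = pseudonym(name, secret)
    database[key].append(entry)
    return key


def matching_records(known):
    """Return the pseudonyms whose combined entries match every known attribute."""
    hits = []
    for key, entries in database.items():
        merged = {}
        for entry in entries:
            merged.update(entry)
        if all(merged.get(field) == value for field, value in known.items()):
            hits.append(key)
    return hits


# The same name and secret always yield the same label, so later
# information lands in the same record without the name being stored.
alice_key = submit("Alice Example", b"alice-secret",
                   {"birth_year": 1950, "zip": "48083", "sex": "F"})
submit("alice  example", b"alice-secret", {"diagnosis": "asthma"})
submit("Bob Example", b"bob-secret",
       {"birth_year": 1950, "zip": "48083", "sex": "M"})
submit("Carol Example", b"carol-secret",
       {"birth_year": 1962, "zip": "48083", "sex": "F"})

# Someone who already knows a few facts about a person never needs to
# invert the hash: the record contents narrow the candidates directly.
hits = matching_records({"birth_year": 1950, "zip": "48083", "sex": "F"})
```

In this toy data the three known attributes match a single record, which is
the weakness described above: the label is opaque, but the contents are not.
A real medical history holds far more attributes than this example.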

Does anyone have any insights into this problem?  <disclaimer> This is of
purely academic interest to me; I don't know the person who asked the
initial question (other than through email).  It just sounds like a neat
problem. </disclaimer>

        Clay

---------------------------------------------------------------------------
Clay Olbon II            | Clay.Olbon@dynetics.com
Systems Engineer         | ph: (810) 589-9930 fax 9934
Dynetics, Inc., Ste 302  | http://www.msen.com/~olbon/olbon.html
550 Stephenson Hwy       | PGP262 public key: on web page
Troy, MI 48083-1109      | pgp print: B97397AD50233C77523FD058BD1BB7C0
                     TANSTAAFL
---------------------------------------------------------------------------