1993-10-29 - ID of anonymous posters via word analysis?

Header Data

From: jpinson@fcdarwin.org.ec
To: cypherpunks@toad.com
Message Hash: 9906c0b9fd6c5b4ad0c02ce7befb4b6370428af45eea3e7a31e3ccc469d69345
Message ID: <9310290952.aa20334@pay.ecua.net.ec>
Reply To: N/A
UTC Datetime: 1993-10-29 16:18:24 UTC
Raw Date: Fri, 29 Oct 93 09:18:24 PDT

Raw message

From: jpinson@fcdarwin.org.ec
Date: Fri, 29 Oct 93 09:18:24 PDT
To: cypherpunks@toad.com
Subject: ID of anonymous posters via word analysis?
Message-ID: <9310290952.aa20334@pay.ecua.net.ec>
MIME-Version: 1.0
Content-Type: text/plain


All the talk recently about multiple fake identities reminded me
of a research project I read about a few years ago. A team set
out to determine whether Shakespeare was really one person or
actually several people.

The researchers analyzed the frequency distribution of words
found in the works of Shakespeare and compared it with those of
other writers of the day. I don't recall the results of the
project, but that kind of research would have implications for
anonymous postings.

It is not too difficult to see how certain spelling errors, word
frequency (how often do you say 'I'? :-), choice of wording, and
the working vocabulary of an individual could allow you to
identify an anonymous poster. This would be particularly easy if
the individual also posted under their real name.
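
As a rough illustration of the word-frequency idea, here is a
minimal Python sketch. It is not the method of the Shakespeare
study, whose details I don't have. The short function-word list,
the names profile, cosine_similarity and closest_author, and the
choice of cosine similarity as the comparison measure are all
assumptions made for the example.

```python
import math
import re
from collections import Counter

# Hypothetical, very short list of common function words.
# A serious analysis would use a much longer list and more text.
FUNCTION_WORDS = ["i", "the", "a", "of", "and", "to", "that", "it", "is", "would"]


def profile(text):
    """Return the relative frequency of each function word in text."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    total = len(words) or 1  # avoid division by zero on empty input
    return [counts[w] / total for w in FUNCTION_WORDS]


def cosine_similarity(p, q):
    """Cosine of the angle between two frequency vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return dot / norm if norm else 0.0


def closest_author(anonymous_text, known_texts):
    """Return the name in known_texts whose profile is most similar to the anonymous text."""
    anon = profile(anonymous_text)
    return max(
        known_texts,
        key=lambda name: cosine_similarity(anon, profile(known_texts[name])),
    )


if __name__ == "__main__":
    # Toy samples invented for this example.
    known = {
        "alice": "I think that I would go, and I know that it is so.",
        "bob": "The cat of the house and the dog of the yard.",
    }
    print(closest_author("I would say that I know it.", known))
```

With samples this small the result means very little; the point
is only that a poster's signed messages give a profile against
which an unsigned message can be scored.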

I suspect that the government has done research on this topic. It
would be useful to identify which terrorist made which (written)
threat.

This brings up the subject of how one can post without
leaving an "ASCII fingerprint". I suspect the use of a spelling
checker and grammar checker would help. Perhaps running your
text through a language translator (say, English to French)
and then back would remove many identifying characteristics.



Jim Pinson                     Galapagos Islands
PGP key available by finger    jpinson@fcdarwin.org.ec





