1993-10-14 - Generating random numbers from english text

Header Data

From: baldwin@LAT.COM (Bob Baldwin)
To: cypherpunks@toad.com
Message Hash: 5788db0a5dbda0dc07433d5784ed569c7146b5af038239f4258b9f602046cb85
Message ID: <9310141629.AA15178@LAT.COM>
Reply To: N/A
UTC Datetime: 1993-10-14 16:46:58 UTC
Raw Date: Thu, 14 Oct 93 09:46:58 PDT

Raw message

From: baldwin@LAT.COM (Bob Baldwin)
Date: Thu, 14 Oct 93 09:46:58 PDT
To: cypherpunks@toad.com
Subject: Generating random numbers from english text
Message-ID: <9310141629.AA15178@LAT.COM>
MIME-Version: 1.0
Content-Type: text/plain


Hello,
	With the current random number discussion going on, I thought
I would point out one convenient way to generate random numbers in
the situation where you do not need to generate them frequently.  I believe
that Claude Shannon (Dr. Information Theory) proposed this:

1. Ask a person to speak about any topic for a paragraph or two.
   Instruct them to generate original sentences, not just repeats of
   some passage of a written work.  Write down what they say.

2. The English language has an entropy of 1.0 to 1.5 bits per letter,
   so it is safe to extract one bit per character.  If you read Shannon's
   work and its modern interpretation, you will understand that this
   bit per character is truly random.  It is uncorrelated with all
   the other bits.  The reason for avoiding a pre-existing written
   work (poem, article, story, etc.) is that an attacker could otherwise
   recover the text by a brute force search of existing works.

	The modern version of this algorithm is to compute the
MD5 digest of a long text string.  For a 128-bit digest, you need
at least 128 characters of source language.  For a larger random number,
you can concatenate multiple MD5 values from multiple pieces of source text.
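
	As a rough illustration of the digest step, here is a minimal
Python sketch.  The function names, the choice to count only alphabetic
letters toward the 128-character minimum, and the UTF-8 encoding are
my assumptions, not part of the scheme described above.  MD5 is used
only because it is the digest named here.

```python
import hashlib

DIGEST_BITS = 128    # size of an MD5 digest
BITS_PER_LETTER = 1  # conservative extraction rate quoted in the message


def digest_from_text(text: str) -> bytes:
    """Return the 16-byte MD5 digest of one piece of original text.

    Assumption: only alphabetic letters count toward the minimum
    length, since the entropy estimate is stated per letter.
    """
    letters = sum(1 for ch in text if ch.isalpha())
    needed = DIGEST_BITS // BITS_PER_LETTER
    if letters < needed:
        raise ValueError(f"need at least {needed} letters, got {letters}")
    return hashlib.md5(text.encode("utf-8")).digest()


def random_bytes_from_texts(texts) -> bytes:
    """Concatenate one digest per independent piece of source text."""
    return b"".join(digest_from_text(t) for t in texts)
```

Two separate paragraphs of original speech passed to
random_bytes_from_texts() would give 32 bytes (256 bits).  Each piece
has to meet the length minimum on its own.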

	If the random numbers form the basis of crypto keys, then it
is important to make sure no one can uncover the original source text.

		--Bob





