From: Fred <admin@dcwill.com>
To: cypherpunks@toad.com
Message Hash: 48e1e381bc7b4b276d93cd563a2237e2ce3f4dc898315b7509c25aa28b2c2c4a
Message ID: <199603270200.SAA18549@python.ee.unr.edu>
Reply To: N/A
UTC Datetime: 1996-03-28 04:35:52 UTC
Raw Date: Thu, 28 Mar 1996 12:35:52 +0800
From: Fred <admin@dcwill.com>
Date: Thu, 28 Mar 1996 12:35:52 +0800
To: cypherpunks@toad.com
Subject: Random Number Testing
Message-ID: <199603270200.SAA18549@python.ee.unr.edu>
MIME-Version: 1.0
Content-Type: text/plain
JonWienke@aol.com writes:
> > The "randomness" of the output of my sieve is obviously not perfect.
> > However, it will remove bytes from a data stream that "randomness" tests
> > describe as "nonrandom".
And Perry writes:
> You are displaying a profound ignorance of the notion of "random"
> here. An individual byte cannot be random or nonrandom. Random is a
> statistical property, and therefore can only apply to a mass of
> data.
The limitation seems to be in the testing process.
An inviolable rule of test and measurement is that the testing process
must have a higher degree of accuracy and precision than is expected from
the item under test. If all you have is a non-graduated stick one meter
long, you can't measure increments of less than one meter unless you add
some additional resolution to the system (such as a human's judgment
that the object is four sticks, plus about another third of a stick,
long).
Our ability to synthesize pseudo-random data, and the degree to which
it approaches true random data, is limited by our ability to discern
between the two. If the best testing process available yields the
same results for the output of a true RNG and a particular PRNG, what
can be done to improve the randomness of the PRNG? How could we tell
the difference based on the data itself?
About all you could do would be to critique the PRNG or sieving process
from an empirical perspective and hope that you weren't introducing some
other non-random bias to the system. I'm sure that some perfectly random
data would be discarded as being non-random, especially for a small sample
size. The sequence HHHHTTTT doesn't look random, but this outcome of 8
flips is equally probable for a "perfect" coin. Do you keep this data or
toss it because it doesn't "look" random? If you toss it, aren't you
removing entropy from the sample?
It may be (relatively) easy to say that radioactive decay is a truly
random process and a sieve which discards certain re-occuring values
is not. Improving the testing process, and increasing the size of the
data used for that testing, will improve the sieve output. However,
it looks like an asymptotic situation: the PRNG can approach, but never
quite reach, the true RNG case. As long as human analysis and discretion
is involved, there will always be a certain amount of non-randomness
in the output of a PRNG.
My question: if you don't know the source of the data, how can you
_really_ determine if it's random enough for crypto use? Given any
PRNG string of a certain length, does the fact that I might be able to
find that exact same data string buried in a mountain of truly random
data mean that it's suitable for crypto use? If you knew that I created
the data from the clock of my workstation, then maybe not. But otherwise?
Fred <admin@dcwill.com>
Return to March 1996
Return to “Fred <admin@dcwill.com>”
1996-03-28 (Thu, 28 Mar 1996 12:35:52 +0800) - Random Number Testing - Fred <admin@dcwill.com>