1994-01-25 - Provability and Randomness

Header Data

From: collins@newton.apple.com (Scott Collins)
To: cypherpunks@toad.com
Message Hash: 1dd397db40f7e98db737a8ce6cf32cdf57ac182052b80302e6cb0151d092dc90
Message ID: <9401252141.AA15906@newton.apple.com>
Reply To: N/A
UTC Datetime: 1994-01-25 21:56:53 UTC
Raw Date: Tue, 25 Jan 94 13:56:53 PST

Raw message

From: collins@newton.apple.com (Scott Collins)
Date: Tue, 25 Jan 94 13:56:53 PST
To: cypherpunks@toad.com
Subject: Provability and Randomness
Message-ID: <9401252141.AA15906@newton.apple.com>
MIME-Version: 1.0
Content-Type: text/plain


Entropy is relative.

A string is `random' (with respect to an observer) when the probability of
correctly predicting the next symbol of the string is arbitrarily low
(e.g., size_of_the_alphabet^-1).  Entropy, and therefore `randomness' can
only be considered in the presence of symbol probabilities... and therefore
prejudicial knowledge i.e., a context (algorithms, models, history,
whatever).

Different contexts --> different probabilities --> different quality of
randomness.

 * Absent a context, there is no such thing as `randomness'.

Posit two identical contexts, sender and receiver.  Sender transmits a
`random' string to receiver.  (Beeeeeeeeeeep.  Sorry, that was the warning
that sounds whenever I fib).  The sender can only send a random string if
the reciever doesn't already `know' that string or doesn't know which
string the sender will transmit.  If the sender knows something that the
reciever doesn't then the contexts are not identical.

 * Absent differing contexts, there is no such thing as randomness.

A fair coin toss can be random because you_before_the_toss and
you_after_the_toss are different contexts (reciever and sender,
respectively; one of whom knows the outcome).

Posit two disjoint contexts, A and B.  A transmits a message to B, who has
no information in common with A.  B has no context with which to predict
the first symbol that will appear and thus it is always random.  As symbols
appear, B builds a model of A... and thus acquires knowledge of A (i.e., a
shared context).  By the end of the message, B might be predicting quite
well.  If B can't build any model of A's behavior at all, then B will share
no context with A; won't be able to predict characters; the string will
remain random.

 * Absent shared knowledge (overlapping context), all information is random.

Imagine that B's shared knowledge with A takes the form of a program to
output a prediction of the next symbol A will transmit.  This
program---however large it is and however it might work inside---is nothing
more than B's model of A.  When B has no knowledge of A, this program is
essentially `empty'.  It contains no information, and can make no
predictions better some arbitrary limit (e.g., size_of_the_alphabet^-1). 
The program learns from each symbol transmitted by A, thus a good (and
portable) measure of the `size' of the program is how many symbols it has
seen.  Let us say that this program sees every symbol A ever transmits to B
(numbered from 1..n), and thus during it's life it will actually be n+1
different programs (numbered B0..Bn of size 0..n, respectively).

Imagine that you can ask any one of these programs to predict any symbol
from A.  Thus you could ask B3 to predict symbol 4 (exemplary of the normal
case) or you could ask B5 to predict symbol 1 (which it could, of course,
do perfectly, having already seen symbol 1).

Now we have a new definition of randomness.  A string is random with
respect to B if no program of B shorter than the string can predict it with
success greater than our arbitrary threshhold (which is typically defined
by the performance of B0).

If A is sending a passage from a well known book, and B `discovers' this
after receiving symbol 20 and can access the text of that book, B20
suddenly becomes a very good predictor of many future symbols.  The string
is not random.  But it _was_ random to B0, and B1 and perhaps less for each
successive symbol.  B20 is a different context than B0.  It has different
knowledge, different probabilities and therefore perceives a different
quality of randomness in A's message.  B20 is still only a program of
`size' 20 (i.e., you don't count the size of the book in B).  This is
easily demonstrated if you imagine what happens when A sends a message that
is a deterministic algorithm for producing a an infinite stream of symbols,
followed by the stream it generates.  If this algorithm requires i symbols
to express, then Bi is a perfect predictor for all subsequent symbols.  Bi
is clearly of size i (there is no external book for us to add to the size
of Bi).

In fact, no matter what message A sends, B considers it an algorithm for
generating predictions of future symbols.  Thus A is actually sending B a
sequence of programs (each a prefix of the next, and thus not
re-transmitted) B1, B2, ... Bn (but remember, these programs execute in the
context of B's knowledge... thus their predictions are not `universal'). 
This just brings our notion of programs, program length and prediction
around to the other side and lets us summarize:

 * A string is random with respect to B if the string itself is the
shortest program with which B can generate that string.

... or qualitatively

 * The randomness of a string Bn with respect to B is an inverse of the
quality of the predictions B can make of Bn from the strings B0...Bn-1.

We rely on the `relativity' of entropy.  Codes and cyphers can't function
without it.  The difference between your context and that of an attacker
(you know the key or codebook) is what makes the message meaningful only to
you (hopefully it will still have _some_ information you couldn't guess
before reading it).

Randomness is relative, thus there is no universal randomness measure for a
string, thus there can be no proof that a string is universally random. 
You can easily measure the exact entropy of a string with respect to a very
formally defined context (one where you can produce exact predictions). 
This is useful, but reveals nothing about the quality of the predictions a
different, even similar, context might make (Just one symbol is the
difference between B19 and B20 above; the string was random to one but not
the other), It reveals nothing about models we can't describe so perfectly
(like human thought).

 * There is no algorithm for deciding if a string is universally random.

In a less obvious leap, it is only by comparing the predictions of Bi with
Bk that a string of length j (i < j <= k) can be shown to be random with
respect to Bi.  Thus:

 * There is no algorithm shorter than the string itself for determining if
a string is random with respect to a given context.


Not exactly Q.E.D. but close enough for rock `n roll.


Scott Collins         | "Few people realize what tremendous power there
                      |  is in one of these things."     -- Willy Wonka
......................|................................................
BUSINESS.   voice:408.862.0540  fax:974.6094   collins@newton.apple.com
Apple Computer, Inc.   5 Infinite Loop, MS 305-2B   Cupertino, CA 95014
.......................................................................
PERSONAL.   voice/fax:408.257.1746    1024:669687   catalyst@netcom.com







Thread