1994-01-26 - Re: Randomness of a bit string

Header Data

From: tcmay@netcom.com (Timothy C. May)
To: cypherpunks@toad.com
Message Hash: d7c19a421a51a02ad53f01851c92127f288a34b350f2637de6d48beb87a9f3e7
Message ID: <199401260428.UAA04099@mail.netcom.com>
Reply To: <9401251931.AA19525@churchy.gnu.ai.mit.edu>
UTC Datetime: 1994-01-26 04:46:52 UTC
Raw Date: Tue, 25 Jan 94 20:46:52 PST

Raw message

From: tcmay@netcom.com (Timothy C. May)
Date: Tue, 25 Jan 94 20:46:52 PST
To: cypherpunks@toad.com
Subject: Re: Randomness of a bit string
In-Reply-To: <9401251931.AA19525@churchy.gnu.ai.mit.edu>
Message-ID: <199401260428.UAA04099@mail.netcom.com>
MIME-Version: 1.0
Content-Type: text/plain


(I'm gonna take a breather on this "randomness of a bit string" thread
after sending this post off. I agree with what many folks have
written, and was especially glad to see Scott Collins' nice summary
earlier today about the difficulties in describing randomness. It's a
fascinating topic, with even some practical consequences for
Cypherpunks....maybe.) 

Ray Cromwell writes:

> Tim writes:
> >But can he ever say "I can prove the number is random"? No. There's
> >always some chance an even-cleverer puzzle solver will find the
> >pattern, the key that unlocks the randomness. For example, most
> >ciphertexts pass nearly all statistical tests for randomness, "look"
> >random, and even _act_ like random numbers (recall the Blum-Blum-Shub
> >pseudorandom number generator and how good it is). But simple
> >application of the key turns the seemingly random
> >"100010001010110010101" into "ATTACK."
> 
>    But can we say that "100010001010110010101" has been ``compressed''
> into "ATTACK"? How do we know? Let IC(x) stand for the amount of information

Let me first point out that _any_ string can be "compressed" into
"ATTACK" with the right mapping. My house could be stormed by Reno's
Raiders and the number 100010001010110010101 subjected to thorough
scrutiny at the Fort. Lo and behold, they could find a string which,
when applied to my string (by some process), outputs "ATTACK."
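
(To make that concrete: a minimal sketch, in Python and with a made-up
ciphertext, of the construction. Under a plain XOR "decryption" rule, a
key that maps any given string to "ATTACK" always exists and can be
built after the fact.)

    # Minimal sketch, hypothetical values throughout: under XOR
    # "decryption", a key mapping ANY ciphertext to "ATTACK" can be
    # constructed after the fact -- it is just ciphertext XOR plaintext.
    ciphertext = bytes.fromhex("8a31f04cd2e7")   # stand-in for the seized bits
    plaintext = b"ATTACK"

    key = bytes(c ^ p for c, p in zip(ciphertext, plaintext))

    recovered = bytes(c ^ k for c, k in zip(ciphertext, key))
    assert recovered == plaintext   # succeeds by construction, for any ciphertext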

There are some subtle issues of "relevance" that need to be addressed.
As an example, if a number written down somewhere in my house produces
the transformation into "ATTACK," that's presumably of more relevance
than if the NSA finds some number lying around (and of course they can
_construct_ such a number easily). I'm sure cryptanalysts take such
things into account, but formal theories don't seem to have addressed
this (though I may just be unaware of papers along these lines). And
certainly the courts have yet to touch on this issue, so far as I
know.

Scott Collins nicely summarized the difficulties in calling any number
random (echoing the points I was making, perhaps less formally), and
Phil Karn was right when he said "Randomness is in the eye of the
beholder." (He may've been making an ironic point about my arguments,
but he was still right.)

Back to Ray's point:

> storage used by x. Is 
> 
>      IC(100010001010110010101) > IC(ATTACK) + IC(key) + IC(algorithm)?
> 
>    It is not at all clear that this relationship would hold. (in fact,
> I don't think it will even begin to work out unless the cyphertext
> is much longer than the plaintext) So in fact, cryptorandom numbers
> can be considered incompressible if you take into account the algorithm
> required to perform the operation -- just as if I had used a 100 terabyte 
> dictionary to compress via lookup, or better yet, a one time pad.

Yeah, but the complexity of the algorithm, and the "CPU effort" needed
to mount the analysis, are not considered part of "Kolmogorov
complexity." That's just the formalism. Since the effort is indeed
important (the complexity of DNA strings, for example, gives evidence
that billions of years of compression, massaging, more compression,
etc. happened), others have developed measures of complexity which
take into account the effort, the CPU cycles, if you will.
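
(As a rough illustration -- my own, not anything from Ray's post:
Kolmogorov complexity is uncomputable, but an off-the-shelf compressor
gives an upper bound on description length. Random-looking bytes,
ciphertext included, typically don't shrink at all, while structured
text shrinks a lot, and neither number says anything about how much
computation went into producing the data.)

    # Crude sketch: zlib stands in for "the best program we happen to
    # have", giving only an upper bound on description length.
    # Ciphertext-like random bytes typically do not compress at all,
    # while structured text does, and neither figure reflects the CPU
    # effort that produced the data.
    import os
    import zlib

    structured = b"ATTACK AT DAWN. " * 64          # highly patterned input
    random_looking = os.urandom(len(structured))   # stand-in for ciphertext

    for label, data in (("structured", structured),
                        ("random-looking", random_looking)):
        print(label, len(data), "bytes ->", len(zlib.compress(data, 9)), "bytes")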

Greg Chaitin first looked at this in 1966, but it was left to fellow
IBM researcher Charles Bennett (whom Cypherpunks may know as the
coinventor with Gilles Brassard of "quantum cryptography," and also a
pioneer in reversible computation) to label the idea "logical depth"
and explore the ramifications more deeply (pun intended).

Logical depth addresses the issues Ray is raising. A good summary is
in "The Turing Machine: A Half-Century Survey," edited by Rolf Herken,
and published in about 1991.
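
(A toy way to see the distinction, purely my own illustration: the
little program below has a very short description -- low Kolmogorov
complexity -- but producing its output takes a million hashing steps.
Description length ignores that cost; logical depth is meant to count
it.)

    # Toy illustration of logical depth (my example, not Bennett's): the
    # program is short, so its output has a short description, yet
    # computing that output takes a large number of steps.
    import hashlib

    def deep_bytes(seed: bytes, rounds: int = 1_000_000) -> bytes:
        h = seed
        for _ in range(rounds):   # the cost is in the running, not the writing
            h = hashlib.sha256(h).digest()
        return h

    print(deep_bytes(b"ATTACK").hex())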

> All of this is meaningless anyway. Information theory was proven wrong
> by WEB technologies when they invented a compression program that can
> recursively compress any input data down to 64k. Harddrives are now
> obsolete.

Yes, as Perry Metzger once showed on this list, even the longest of
posts can be compressed into the period at the end of this sentence.


--Tim May


-- 
..........................................................................
Timothy C. May         | Crypto Anarchy: encryption, digital money,  
tcmay@netcom.com       | anonymous networks, digital pseudonyms, zero
408-688-5409           | knowledge, reputations, information markets, 
W.A.S.T.E.: Aptos, CA  | black markets, collapse of governments.
Higher Power:2**859433 | Public Key: PGP and MailSafe available.



