1996-04-17 - Re: why compression doesn’t perfectly even out entropy

Header Data

From: abostick@netcom.com (Alan Bostick)
To: hoz@univel.telescan.com
Message Hash: faa4e50728edd9a45a6612ec052737c4489f4ab791e77481e8d5864c831ccdf0
Message ID: <l3Qdx8m9L0IU085yn@netcom.com>
Reply To: <199604161843.LAA19508@toad.com>
UTC Datetime: 1996-04-17 22:41:39 UTC
Raw Date: Thu, 18 Apr 1996 06:41:39 +0800

Raw message

From: abostick@netcom.com (Alan Bostick)
Date: Thu, 18 Apr 1996 06:41:39 +0800
To: hoz@univel.telescan.com
Subject: Re: why compression doesn't perfectly even out entropy
In-Reply-To: <199604161843.LAA19508@toad.com>
Message-ID: <l3Qdx8m9L0IU085yn@netcom.com>
MIME-Version: 1.0
Content-Type: text/plain


In article <199604161843.LAA19508@toad.com>,
rick hoselton <hoz@univel.telescan.com> wrote:

> At 10:07 AM 4/16/96 -0400, Perry E. Metzger wrote:
> >...Usually, random
> >sequences are non-compressable, but it is possible (though very
> >improbable) for Hamlet to appear out of a random number generator,
> >and it is of course quite compressable...
> But even if it came from a completely random source, it would 
> still make a bad one-time pad.  When people say "compressable" 
> or "algorithmic complexity" or "random", a context is always implied.  
> In the context of "fair coin flips" the text of Hamlet is NOT compressible.
> Because no string is more likely than any other.  Any algorithm that could 
> compress that string, will, on the average INCREASE the length of 
> "fair coin flip" strings it tries to compress.
> Under the context of "pads that might be used for cryptographic purposes" the 
> text of Hamlet is quite compressible.  An attacker is much more likely to 
> test for such a stream than one that appears more random.  So, even if you 
> got "Hamlet" from a perfectly random source, you should reject it for crypto 
> purposes.

This thread is becoming isomorphic to one that took place on the
Coderpunks list.  Jonathan Wienke was promoting an idea to make the
output of a PRNG "more" random by throwing away output whose statistics
didn't match the ideal statistics of an ideal RNG.  Critics of this
scheme (including Perry) argued along these lines:

Suppose you think that quotes from Hamlet don't belong in your OTP
keystream, and so you filter them out.  In doing so, you are making your
keystream *less* random, not more, because you are making some bit
sequences more likely than others.

Given that Hamlet quotes aren't very likely, you aren't making your
keystream very much weaker, but you *are* weakening it.  

See the Coderpunks archives for more details on this argument.

- -- 
Alan Bostick               | They say in online country there is no middle way
mailto:abostick@netcom.com | You'll either be a Usenet man or a thug for the CDA
news:alt.grelb             |    Simon Spero (after Tom Glazer)

Version: 2.6.2