1992-10-26 - Re: entropy measures

Header Data

From: Eric Hollander <hh@soda.berkeley.edu>
To: Eric Hughes <hughes@soda.berkeley.edu>
Message Hash: b71072e5de22f0ee2513a7bc057b82b39ad7284a70bf4913906340f5c39b27b3
Message ID: <9210260440.AA00199@soda.berkeley.edu>
Reply To: <9210240620.AA08036@soda.berkeley.edu>
UTC Datetime: 1992-10-26 04:41:19 UTC
Raw Date: Sun, 25 Oct 92 21:41:19 PPE

Raw message

From: Eric Hollander <hh@soda.berkeley.edu>
Date: Sun, 25 Oct 92 21:41:19 PPE
To: Eric Hughes <hughes@soda.berkeley.edu>
Subject: Re: entropy measures
In-Reply-To: <9210240620.AA08036@soda.berkeley.edu>
Message-ID: <9210260440.AA00199@soda.berkeley.edu>
MIME-Version: 1.0
Content-Type: text/plain



>uuencoding will have a slightly lower single-character entropy than
>the ASCII armor PGP uses because just about every line begins with the
>letter 'M'.  This will skew the distribution slightly.  But a better
>way of distinguishing uuencoding and ascii armor is to see that it
>falls in the same entropy class, and then just look at the
>alphabetic subsets used.

It's not that simple.  The entropy of a stream is the average number of bits
needed to represent each character, and it depends on what was encoded.  If
what is uuencoded is extremely repetitive, the entropy will be low, maybe even
less than one bit per character.  On the other hand, if it were random data,
the entropy would be just slightly lower than that of ASCII armor.  Binaries
are somewhat repetitive, so they have somewhat less entropy than random data.
English has a lot of redundancy, so it has low entropy.
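
A rough way to check this is to measure single-character entropy directly.
The sketch below is only an illustration, not anything from the thread: the
helper names (char_entropy, uuencode_body, armor_body) are made up, plain
base64 stands in for the body of PGP's ASCII armor (headers and checksum line
ignored), and line breaks are stripped before counting.

```python
import base64
import binascii
import math
import random
from collections import Counter


def char_entropy(data: bytes) -> float:
    """Shannon entropy in bits per character, from single-byte frequencies."""
    if not data:
        return 0.0
    n = len(data)
    counts = Counter(data)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def uuencode_body(data: bytes) -> bytes:
    """uuencode data 45 bytes per line (full lines start with 'M'), newlines removed."""
    lines = (binascii.b2a_uu(data[i:i + 45]) for i in range(0, len(data), 45))
    return b"".join(lines).replace(b"\n", b"")


def armor_body(data: bytes) -> bytes:
    """Assumption: plain base64 as a stand-in for the ASCII armor body."""
    return base64.b64encode(data)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    samples = {
        "repetitive": b"\x00" * 4500,
        "random": bytes(rng.getrandbits(8) for _ in range(45000)),
    }
    for name, raw in samples.items():
        print(name,
              "uuencode: %.3f" % char_entropy(uuencode_body(raw)),
              "armor: %.3f" % char_entropy(armor_body(raw)))
```

For random input both encodings should come out near 6 bits per character,
with uuencode a little lower because of the extra 'M' at the start of each
line; for the all-zero input the figure drops well below one bit.  Note that
this only measures single-character frequencies, so it will not see
redundancy that spans several characters.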

e





Thread