1993-12-31 - Benford revisited:

Header Data

From: jpinson@fcdarwin.org.ec
To: cypherpunks@toad.com
Message Hash: 6de2dec089bb72408f0c8aedc524853eb53b643971d4c1362ef05e15787ed038
Message ID: <9312302255.aa24644@pay.ecua.net.ec>
Reply To: N/A
UTC Datetime: 1993-12-31 05:20:44 UTC
Raw Date: Thu, 30 Dec 93 21:20:44 PST

Raw message

From: jpinson@fcdarwin.org.ec
Date: Thu, 30 Dec 93 21:20:44 PST
To: cypherpunks@toad.com
Subject: Benford revisited:
Message-ID: <9312302255.aa24644@pay.ecua.net.ec>
MIME-Version: 1.0
Content-Type: text/plain


In regards to my posting about the New Scientist article on
"Benford's" law (unexpected number ratios in observed data), I 
received many comments along the line of:

> I suggest you look at any slide-rule or log graph paper for the
> simple explanation of this trivial phenomenon.


Actually, the New Scientist article gave a very clear explanation 
of Benford's law, and how it can be a simple artifact of the way
numbers are counted.

The article caught my attention,  when it went on to explain how 
other phenomena, such as alpha particle half-lifes can follow 
Benford's law for completely different reasons.

I won't attempt to retype the entire article, but Buck, Merchant, 
and Perez, reported in the  European Journal of Physics (Vol 14, 
p59) that both expected and observed half-lives followed 
Benford's law.

Quoting New Scientist: "..Why should nature operate in this 
curious manner? Buck and his colleagues point out that in the 
case of alpha decay, there may be a good physical reason.  In 
1928 George Gamow used the newfangled theory of quantum mechanics 
to describe alpha decay.  The alpha particle lives inside a 
"potential well" in which the energy is lower than in an any 
other state that it can normally reach.  It escapes by 
"tunneling" through the potential well.  The probability of 
tunneling occurring within a given time is found by forming a so- 
called "tunneling integral" and raising it to a power that is 
proportional to time.  So tunneling times, and hence half-lives 
naturally correspond to a geometric series, not a arithmetic one. 
If nature chooses the tunneling integral "randomly" with uniform 
probability, then the power-law dependence of alpha-particle 
half-lives on this integral leads to scale-invariant behavior, 
and so to Benford's law.  "

I am not a physicist or a mathematician, so I can't speculate on 
the validity of these statements.

However, all this did make me wonder if advances in fractal 
geometry, or chaos theory, might one day reveal patterns in 
seemingly random numbers derived from real word phenomena.

Since most random numbers used in cryptography are derived from 
real word phenomena,  we should be aware of such potential 
influences.

Actually, we don't have to bring in fractals, chaos, or Benford's 
law to have reason for concern about using real world based 
random numbers.

A method commonly recommended on this list for generating random 
numbers is to take the audio input from a workstation (with no 
mike attached) and then compress the output:

cat /dev/audio | compress - >random-bits-file

First of all, without extensive study of the audio output (from 
the *specific* source) who can say the output is random in the 
first place?  Harmonics from the various electrical circuits in 
the computer could easily produce non-random effects (like a 60 
cycle hum).

The second aspect that concerns me is the use of compression. Unless
you are very familiar with the type of data compression being used,
how can you be sure there aren't various signatures, tables, data
structures, or headers being inserted into the output.

Indeed, I had to go no further that the beginning of a data file
produced by the "Compress" program, to find nonrandom numbers.  All
such files begin with the same three bytes!  If something so obvious
is being overlooked, what about more subtle patterns?

Anyway, it seems to me there can be many factors "contaminating" 
random numbers based on physical phenomena, factors ranging from 
a simple 60 cycle hum to unrecognized fractal patterns.

It is perhaps a good idea to routinely "massage" random numbers 
to remove both recognized and unrecognized contamination.

Xor'ing random numbers derived from two or more different sources 
might be sufficient.  If done properly, such a procedure may 
produce non-Benford, evenly-distributed random numbers, which 
might be considered superior to the original.

Actually, I tend to be rather cautious about such things and am 
always on the lookout for possible "contamination".  I also try 
to keep an open mind to new ideas.

Jim Pinson 





Thread