1995-11-22 - Re: Repeated Words/characters in Password/Phrase

Header Data

From: Scott Brickner <sjb@universe.digex.net>
To: tcmay@got.net (Timothy C. May)
Message Hash: e8a1f5a726f863c6fb8a25df0ac906f6a278a4130feea1d2d8fc6b6b7eb902bf
Message ID: <199511222102.QAA10377@universe.digex.net>
Reply To: <acce4e6b00021004573a@[205.199.118.202]>
UTC Datetime: 1995-11-22 21:43:33 UTC
Raw Date: Thu, 23 Nov 1995 05:43:33 +0800

Raw message

From: Scott Brickner <sjb@universe.digex.net>
Date: Thu, 23 Nov 1995 05:43:33 +0800
To: tcmay@got.net (Timothy C. May)
Subject: Re: Repeated Words/characters in Password/Phrase
In-Reply-To: <acce4e6b00021004573a@[205.199.118.202]>
Message-ID: <199511222102.QAA10377@universe.digex.net>
MIME-Version: 1.0
Content-Type: text/plain


Timothy C. May writes:
>At 11:11 PM 11/14/95, Ted Cabeen wrote:
>>Do repeated words in a PGP passphrase make the pass phrase less secure than
>>a passphrase without any repeated words?  And on the same note, do repeated
>>letters in a UNIX password make that password easier to break? I can't seem
>>to find anything in my books on cryptography that mention this.  Thanks.
>
>More of an information theory question than a crypto question. There are no
>simple answers to this question, but some examples will help:
>
>The password "foo" is not very good, and "foofoo" is only slightly better.
>And "foofoofoo" is slightly better, and so on, to a point. But
>"foofoo....foo" is not N times better than a single "foo," because the
>_pattern_ is simply desribed: "repeat "foo" N times." Thus, the information
>content or entropy of "foofoofoo....foo" is not N times greater than the
>entropy of "foo."
>
>A some dictionary attacks which would trivially find "foo" will not find
>"foofoo," or "foofoofoo," etc., so this could be a great help. More
>sophisticated dictionary attacks may of course take the 30,000 or so most
>common names, words, places, and then do various permutations, reversals,
>repetitions, etc.
>
>So this is why there is not likely to be a simple answer to your question.
>Repeating words in a passphrase can make the passphrase easier to remember
>(such as "thequickquickbrownfox") and make certain kinds of attacks harder,
>but with not as much of an increase in entropy at the increased number of
>raw characters might otherwise suggest.
>
>Other "heuristics" (simple rules of thumb) for passphrases are contained in
>the PGP documents, and in numerous other places: avoid names, add
>nonstandard English keyboard characters liberally (even if using real
>words), etc. The "best" passphrases, it almost goes without saying, are the
>longest and most "unpredictable," so that "7f#qp)djQ10hB%3t+1?U4SVp5" is
>much superior to "%foo%foo".

I don't buy this argument.  The only reason "foofoo" could have less
entropy than "foobar" is if the attacker had some reason to know that
the user tends to choose doubled passwords, or something like that.

If the user has historically chosen passwords with roughly six bits of
entropy per character, then "foofoo" is exactly as likely as "foobar",
and is no "weaker" from an information-theoretic perspective.

In fact, information theory would generally note that discarding the
"foofoo" choice slightly reduces the entropy in the password.

It is also worth noting that any good password algorithm doesn't permit
one to determine if the password is _partly_ right, so entropy
measurements can't really meaningfully be made on a per-character
basis, only on the password as a whole.

It is because the attacker knows that many (if not most) users tend to
prefer passwords that are "easier to remember" that leads him to try
the more memorable combinations *first*.  The information-theoretic
interpretation of this is that such memorable passwords have less
entropy than the others, because the probability that the next account
an attacker tries to guess uses a memorable password is higher than the
probability that it doesn't.

"foobar" occurs as a password less frequently than "foofoo", so it has
more entropy.  The extra entropy didn't come from the use of more
characters, it came from all the more lazy users who like "foofoo"
better.

To use a variant of Tim's example, "7f#qp)djQ10hB%3t+1?U4SVp5" is not
measurably better than "7f#q#)d#Q10h#%#t+1#U4S#p5", even though the
latter uses the "#" character much more frequently than the first.
Both passwords are so far down the list that they probably have never
occurred as passwords.  Both contain effectively the same entropy.

To address the original question:
>>Do repeated words in a PGP passphrase make the pass phrase less secure than
>>a passphrase without any repeated words?

Probably not.  It may even increase security, as "the quick brown fox"
is more frequently used than "the quick quick brown fox" as someone's
password, and should, therefore, be tried first.

>>And on the same note, do repeated
>>letters in a UNIX password make that password easier to break?

Again, probably not.  If the letters are generally chosen at random,
then "abafraa" is just as likely to occur as "abifryu".  If the letters
are chosen less randomly, like from a name, then "anna" is more likely
than "xavier", but less likely than "john".





Thread