1996-07-09 - Re: Word lists for passphrases

Header Data

From: ichudov@algebra.com (Igor Chudov @ home)
To: ncognito@gate.net (Ben Holiday)
Message Hash: f28ac3873478bf0e8675f3003a7515c952b29712f2b9aca82614cbbfb0554077
Message ID: <199607090210.VAA07394@manifold.algebra.com>
Reply To: <Pine.A32.3.93.960708174618.18872A-100000@navajo.gate.net>
UTC Datetime: 1996-07-09 06:32:21 UTC
Raw Date: Tue, 9 Jul 1996 14:32:21 +0800

Raw message

From: ichudov@algebra.com (Igor Chudov @ home)
Date: Tue, 9 Jul 1996 14:32:21 +0800
To: ncognito@gate.net (Ben Holiday)
Subject: Re: Word lists for passphrases
In-Reply-To: <Pine.A32.3.93.960708174618.18872A-100000@navajo.gate.net>
Message-ID: <199607090210.VAA07394@manifold.algebra.com>
MIME-Version: 1.0
Content-Type: text

Ben Holiday wrote:
> If you have access to a shell, and to the news spool, you can generate
> some quick lists by hopping into the directory of any newsgroup that
> interests you and doing:
> cat * | tr -cs A-Za-z '\n' | tr A-Z a-z | sort | uniq > my-big-ol-wordlist
> With most unixes that will generate an alphabetized list of all the unique
> words in your source text, converted to lowercase. I've had some problems
> with tr on a few machines, however. Adding a '-c' after 'uniq' will tell
> you how many times each word occurred (useful for grepping out words that
> appear too infrequently, or too frequently).
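
The frequency filtering mentioned in the quoted text might look roughly like
the sketch below. The function name word_freq_filter and the min/max
thresholds are hypothetical, not something from Ben's message; the sketch
only assumes a POSIX shell with tr, sort, uniq and awk.

```shell
# Hypothetical helper (not from the original message): read text on stdin
# and print each lowercase word whose count lies between min and max.
word_freq_filter() {
    min=$1   # assumed lower bound on occurrences
    max=$2   # assumed upper bound on occurrences
    # Turn every run of non-letters into a single newline, lowercase
    # the result, then count identical lines with 'uniq -c'.
    tr -cs 'A-Za-z' '\n' |
        tr 'A-Z' 'a-z' |
        sort |
        uniq -c |
        # 'uniq -c' prints "count word"; skip the empty "word" that appears
        # when the input starts with a non-letter character.
        awk -v min="$min" -v max="$max" \
            '$2 != "" && $1 >= min && $1 <= max { print $2 }'
}
```

Run in a newsgroup directory, something like
"cat * | word_freq_filter 3 500 > my-big-ol-wordlist" would keep only words
seen at least 3 and at most 500 times; both numbers are arbitrary and would
need tuning for the size of the spool.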

Actually, I am fairly sure that your selection of words will be mediocre
at best. There are words (such as nethermost, insatiable, insufferable)
that are almost never used in news, so a list built this way will miss them.
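
One way to check this would be to compare the generated list against a
reference dictionary. The sketch below is only an illustration: the function
name missing_words and both file arguments are hypothetical, and it assumes
each file holds one word per line.

```shell
# Hypothetical helper (not from the original message): print the words of
# a reference dictionary that do not appear in a generated word list.
#   $1 = dictionary file, one word per line (assumed)
#   $2 = generated word list, one lowercase word per line (assumed)
missing_words() {
    dict=$1
    list=$2
    # Lowercase the dictionary so it matches the lowercased word list,
    # then drop every line that exactly equals some line of the list:
    #   -F fixed strings, -x whole-line match, -v invert, -f patterns file
    tr 'A-Z' 'a-z' < "$dict" | grep -vxFf "$list"
}
```

For example, "missing_words some-dictionary my-big-ol-wordlist | wc -l"
would count how many dictionary words the news spool never produced, where
some-dictionary stands for whatever word file is available locally.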

	- Igor.