1996-01-16 - Spiderspace

Header Data

From: tcmay@got.net (Timothy C. May)
To: cypherpunks@toad.com
Message Hash: 223a84d94b40fa0857aedbb0bd01130588e005338981c16494fc26324ca883bb
Message ID: <ad2127120b02100482d3@[205.199.118.202]>
Reply To: N/A
UTC Datetime: 1996-01-16 18:50:16 UTC
Raw Date: Wed, 17 Jan 1996 02:50:16 +0800

Raw message

From: tcmay@got.net (Timothy C. May)
Date: Wed, 17 Jan 1996 02:50:16 +0800
To: cypherpunks@toad.com
Subject: Spiderspace
Message-ID: <ad2127120b02100482d3@[205.199.118.202]>
MIME-Version: 1.0
Content-Type: text/plain



I've been thinking a lot about the problems and opportunities that are
coming up as more and more "spiders" (Web searchers, crawlers) are indexing
directories and files on systems they can find.

For the sake of this post, the files and whatnot these spiders and
super-spiders can hit constitute a universe I'll call "spiderspace," as it
semi-euphoniously matches cyberspace and cypherspace.

Two things caused me to think more intensely about this:

1. At the Saturday Cypherpunks physical meeting, Marianne Mueller (I think)
was telling me about an experience where an old letter she'd written to
someone showed up in an Alta Vista search. A personal letter, that is. How
this happened was that the letter to her friend was buried several
subdirectories deep in a directory he made accessible to the outside world.
Presto, Alta Vista found it, indexed it, and made it keyword-searchable!

(Humans are pretty bad at doing such meticulous file prep work, but
all-seeing spiders are very good at seeing everything.)

2. Someone on the Cyberia-l list, Mike Godwin in fact, asked if anyone had
a particular post he'd written last summer, a post he'd neglected to save
but that he needed. I had not kept that post, according to my own archives,
but I decided to see what Alta Vista might turn up. (The Cyberia-l list is
not officially archived, and I believe archives of it are discouraged by
the list owner, for various reasons especially worrisome to lawyers and law
professors!)

Sure enough, a search of "Cyberia-l" in Alta Vista showed all sorts of
hits, including what appeared to be several _private archives_ of parts of
the traffic. (By "private" I mean in the sense that they were someone's
personal archives, and not necessarily complete or even semi-officially
sanctioned.)

And a search of "Cyberia-l AND Godwin AND parental AND Ferber" (some of the
keywords in the post he knew he was looking for) produced two hits, most
probably of the post he was seeking. (They were on a Kent Law School
archive site that, I believe, is no longer accessible to the outside...the
Alta Vista spiders must have gotten to it and indexed it before the site
was made less accessible...just a thought.)

This fits with the point made above, that increasing numbers of odd
things--letters, love letters, resumes, job applications, even things like
PGP passwords!--will likely show up by accident in spiderspace.

I've started to look for things like PGP files lying around buried in
subdirectories. I can imagine attacks based on this.
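
As a defensive counterpart, here is a minimal sketch of checking one's own
directory tree for stray key material before a spider finds it. It is not
part of the original post. The function name find_key_material and the
max_bytes cutoff are hypothetical; the armor header with "PRIVATE" is the
OpenPGP spelling, while the "SECRET" variant and the secring.pgp file name
are assumed here to cover older PGP versions.

```python
import os
import re

# Armor header that marks an exported private key. "PRIVATE" is the OpenPGP
# spelling; "SECRET" is an assumption meant to cover older PGP output.
KEY_MARKER = re.compile(rb"-----BEGIN PGP (PRIVATE|SECRET) KEY BLOCK-----")

# Assumed file name of a binary secret keyring, which carries no armor header.
KEYRING_NAMES = {"secring.pgp"}


def find_key_material(root, max_bytes=1_000_000):
    """Return sorted paths under root that look like private key material."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "rb") as fh:
                    # Read only the first max_bytes of each file.
                    data = fh.read(max_bytes)
            except OSError:
                # Unreadable files are skipped rather than reported.
                continue
            if name.lower() in KEYRING_NAMES or KEY_MARKER.search(data):
                hits.append(path)
    return sorted(hits)
```

Calling find_key_material on the top of a publicly exported directory
returns the files worth moving or deleting. It only catches the two
patterns named above, so an empty result is not proof that nothing
sensitive is exposed.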

Declan McCullagh, on the Cyberia-l list, followed up to my post on this
topic by noting that things will really get interesting when the internal
file systems of many sites are made searchable, such as with the Andrew
File System (AFS) at CMU and elsewhere. Apparently most users make their
directories accessible to others.

Implications for Cypherpunks?

First, an alert for you to be very careful about what you make accessible
to the outside world. It's no longer just a matter of people taking the
time to rummage through your subdirectories; it's now trivial to find
things with the new Web search engines.
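
One rough way to act on this alert is to list everything under a directory
that carries the "readable by others" permission bit. The sketch below is
not from the original post and assumes a POSIX file system; the function
name world_readable is hypothetical. The permission bit is only a proxy:
whether a spider can actually reach a file also depends on what the Web or
FTP server exports.

```python
import os
import stat


def world_readable(root):
    """Return sorted paths under root whose mode has the other-read bit set."""
    exposed = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            try:
                # lstat so that symbolic links are examined, not followed.
                mode = os.lstat(path).st_mode
            except OSError:
                continue
            if stat.S_ISLNK(mode):
                # Link permissions say nothing about the target; skip them.
                continue
            if mode & stat.S_IROTH:
                exposed.append(path)
    return sorted(exposed)
```

Running world_readable on a home directory or a public Web directory gives
a list to review by hand. Files buried several subdirectories deep show up
the same as files at the top, which is the point.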

Second, what is out there in spiderspace is incredibly useful for building
dossiers, for compiling correlations, and for doing competitive analyses.

Third, more and more kinds of files are going into spiderspace. This may
include files compiled by others, such as files containing Web accesses!
(All it takes is for someone to keep a record of site accesses,
subscriptions, etc., and then put that record in a searchable place: it then
becomes trivial to search on a name and find out interesting things.)

Fourth...left to your imagination.

--Tim May

We got computers, we're tapping phone lines, we know that that ain't allowed.
---------:---------:---------:---------:---------:---------:---------:----
Timothy C. May              | Crypto Anarchy: encryption, digital money,
tcmay@got.net  408-728-0152 | anonymous networks, digital pseudonyms, zero
W.A.S.T.E.: Corralitos, CA  | knowledge, reputations, information markets,
Higher Power: 2^756839 - 1  | black markets, collapse of governments.
"National borders aren't even speed bumps on the information superhighway."








