1996-01-17 - Re: Spiderspace

Header Data

From: Bill Stewart <stewarts@ix.netcom.com>
To: cypherpunks@toad.com
Message Hash: cb5be211c1681632c052d511c70064cd8c3e10411eda2db889820bfe3d06e8de
Message ID: <199601170613.WAA08830@ix5.ix.netcom.com>
Reply To: N/A
UTC Datetime: 1996-01-17 14:15:44 UTC
Raw Date: Wed, 17 Jan 1996 22:15:44 +0800

Raw message

From: Bill Stewart <stewarts@ix.netcom.com>
Date: Wed, 17 Jan 1996 22:15:44 +0800
To: cypherpunks@toad.com
Subject: Re: Spiderspace
Message-ID: <199601170613.WAA08830@ix5.ix.netcom.com>
MIME-Version: 1.0
Content-Type: text/plain


>... I was under the impression that the only documents that most web crawlers
>will search are documents that are link-accessible.  Are you saying that this
>isn't true?  Are you saying that Alta-Vista will search EVERYTHING that's
>publicly accessible, whether by anonymous FTP or web?

Don't Archie servers already index anonymous FTP sites fairly well?
Also, aside from the no-robots conventions, you can build a CGI program
to serve the files, which might be more effective at blocking searches
while still preserving access.
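
A minimal sketch of such a gatekeeper, assuming the crawler identifies
itself in its User-Agent header. The names ROBOT_MARKERS, is_robot and
handle_request are hypothetical, and the convention that the query string
carries a bare file name is an assumption made for illustration. A crawler
that forges its User-Agent would get through, so this is a courtesy
barrier, not access control.

```python
import os

# Hypothetical list: substrings that mark a User-Agent as a crawler.
ROBOT_MARKERS = ("crawler", "spider", "robot")


def is_robot(user_agent):
    """Return True if the User-Agent string looks like an automated crawler."""
    ua = (user_agent or "").lower()
    return any(marker in ua for marker in ROBOT_MARKERS)


def handle_request(environ, doc_root):
    """Return (status, body) for a CGI-style request described by environ.

    environ uses the standard CGI variable names HTTP_USER_AGENT and
    QUERY_STRING; the query string is assumed to be a bare file name.
    """
    if is_robot(environ.get("HTTP_USER_AGENT")):
        return "403 Forbidden", b"Automated access is not permitted.\n"

    # Keep only the final path component so "../" cannot escape doc_root.
    name = os.path.basename(environ.get("QUERY_STRING", ""))
    path = os.path.join(doc_root, name)
    if not name or not os.path.isfile(path):
        return "404 Not Found", b"No such file.\n"

    with open(path, "rb") as fh:
        return "200 OK", fh.read()
```

Because the files are reached only through the program and not through
static links, a crawler that walks links has nothing to index, while a
person who knows the file name can still fetch it.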

Also, it wouldn't be hard for a web crawler to follow FTP links,
as long as a URL somewhere points to the root of the anon-FTP site.
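
As a rough illustration of how little that takes, the sketch below pulls
the ftp: links out of a fetched HTML page so a crawler could queue them
alongside the http: ones. It uses only the Python standard library; the
names LinkCollector and ftp_links are hypothetical, and fetching the page
or walking the FTP directory itself is left out.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect the absolute target of every <a href=...> in a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page's own URL.
                self.links.append(urljoin(self.base_url, value))


def ftp_links(html, base_url):
    """Return the links in html whose scheme is ftp, in document order."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return [url for url in collector.links if urlparse(url).scheme == "ftp"]
```

A page that links to the root of an anon-FTP site yields that root as a
starting point, and everything below it is then reachable by listing
directories.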
#--
#				Thanks;  Bill
# Bill Stewart, stewarts@ix.netcom.com, Pager/Voicemail 1-408-787-1281
#
# "Eternal vigilance is the price of liberty" used to mean us watching
# the government, not the other way around....





