From: m5@dev.tivoli.com (Mike McNally)
To: ecarp@tssun5.dsccc.com (Ed Carp @ TSSUN5)
Message Hash: ff5e91bcaceceb8cbd0d6e000ff13a488455ce6733b437aa957d2430037b34f1
Message ID: <9601161922.AA13227@alpha>
Reply To: <9601161853.AA13284@tssun5.>
UTC Datetime: 1996-01-16 21:10:51 UTC
Raw Date: Wed, 17 Jan 1996 05:10:51 +0800
Subject: Re: Spiderspace
In-Reply-To: <9601161853.AA13284@tssun5.>
Message-ID: <9601161922.AA13227@alpha>
MIME-Version: 1.0
Content-Type: text/plain

Ed Carp writes:
> ... I was under the impression that the only documents that most web crawlers
> will search are documents that are link-accessible. Are you saying that this
> isn't true? Are you saying that Alta-Vista will search EVERYTHING that's
> publicly accessible, whether by anonymous FTP or web?

Ah, but if it hits a site that's set up with a top-level directory
which *does* contain an "index" page but whose server *doesn't*
recognize the index page name, then when you hit the site you
(probably) get one of those server-generated indices. Those things
generally have *everything* in the directory visible (except those
files blocked by the server configuration, usually stuff like emacs
temp files), and so there you go...
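
To make the mechanism concrete, here is a minimal Python sketch of the
decision a server makes when a directory is requested. Everything in it is
hypothetical and chosen only for illustration: the function name
`resolve_directory`, the file names, and the hidden-file patterns. Real
servers differ in the details.

```python
from fnmatch import fnmatchcase


def resolve_directory(files, index_names, hidden_patterns):
    """Decide what a request for a directory returns.

    files           -- names of the files in the directory
    index_names     -- index page names the server is configured to recognize
    hidden_patterns -- glob patterns the server refuses to list
    """
    # If the directory holds a recognized index page, serve that page.
    for name in index_names:
        if name in files:
            return ("index", name)

    # Otherwise fall back to a generated listing of every file that is
    # not blocked by the server configuration.
    visible = sorted(
        f for f in files
        if not any(fnmatchcase(f, pat) for pat in hidden_patterns)
    )
    return ("listing", visible)


# Hypothetical directory: the author's index page is called home.html.
files = ["home.html", "notes.txt", "draft.txt~", "#scratch#", "private-keys.asc"]

# Hypothetical patterns for emacs backup and autosave files.
hidden = ["*~", "#*#"]

# The server only knows index.html, so it generates a listing instead.
exposed = resolve_directory(files, ["index.html"], hidden)

# A server that also recognizes home.html serves the index page.
served = resolve_directory(files, ["index.html", "home.html"], hidden)
```

In the first call the listing includes private-keys.asc even though no page
links to it, and a crawler that follows the entries in a generated listing
would reach it like any other linked document.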
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
| Nobody's going to listen to you if you just | Mike McNally (m5@tivoli.com) |
| stand there and flap your arms like a fish. | Tivoli Systems, Austin TX |
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~