1996-08-19 - search engine improvement

Header Data

From: bryce@digicash.com
To: cypherpunks@toad.com
Message Hash: 8727343f2d600093bba346218c6db8f53ac595b75af4f92c9d4b00f8ab0c4fb3
Message ID: <199608191931.VAA02971@digicash.com>
Reply To: N/A
UTC Datetime: 1996-08-19 23:39:22 UTC
Raw Date: Tue, 20 Aug 1996 07:39:22 +0800

Raw message

From: bryce@digicash.com
Date: Tue, 20 Aug 1996 07:39:22 +0800
To: cypherpunks@toad.com
Subject: search engine improvement
Message-ID: <199608191931.VAA02971@digicash.com>
MIME-Version: 1.0
Content-Type: text/plain



-----BEGIN PGP SIGNED MESSAGE-----

Keywords: distributed ratings systems, search engines, spiders, 
spiderspace, idea futures, The Shockwave Rider, John Brunner


You know there is a trick that might greatly improve the 
effectiveness of a search engine at almost no cost to the end
user.  It is the well-known heuristic of "If Person A likes X
and Y, and Person B likes X, then Person B probably likes Y.",
combined with passive polling (which is getting information
about people's opinions just by watching their actions, instead
of by asking them).


A first simple implementation would keep a table of the pages
that people choose, keyed from the query that they originally
submitted.  Those pages that people choose most frequently from
the list of matching pages (and/or those pages that people
"stop" on-- that they do _not_ follow by further searching),
would get bumped up a little in the list.


This would be massively expensive in networking, storage, and
computation, giving those hi-tech Alpha clusters at AltaVista
something to do...  :^)


There are plenty of extras and refinements that could be added
(for example, put some keywords identifying your "affiliation"
in a separate field.  It will only consider the results from
other people who entered the same affiliation keywords when
weighting your search results.).  And there are some good topics
for further discussion, such as is it worthwhile to distinguish
between "relevancy" and "value"?


I don't have a comprehensive list of people who are already
working on this area (distributed ratings) (if I did, I might 
have Cc:'ed them), but I know that many people are.  I hope that
they and the search engine people get together and make cool
stuff soon.


There is the interesting issue of whether this will cause
self-reinforcing "degeneration", where people (or an
"affiliation"-keyed group of people) accidentally overlook a
worthy page early in the game, and then, using each other's
behavior to influence their own, reinforce that mistake.


As a final attribution note:  John Brunner thought of this idea
idea in his prophetic novel _The Shockwave Rider_ in the 75.  
There is a wonderful line which I can't find right now, about 
how it turned out to be a flywheel instead of an oracle, merely 
aggregating human mistakes and successes.


Regards,

Bryce

P.S.  ObCryptoRelevance:  Um...  you could get paid for your
ratings using Chaumian ecash, and even have your ratings popped
into the right "affiliation" using Chaumian credentials...

P.P.S.  CryptoRelevance isn't very Ob anymore, is it?  Just as
well, IMESHO.




-----BEGIN PGP SIGNATURE-----
Version: 2.6.2i
Comment: Auto-signed under Unix with 'BAP' Easy-PGP v1.1b2

iQB1AwUBMhjA+EjbHy8sKZitAQGpIQMAyDcdHUgK9/KhNskvUG8AAbourl1Hg6J5
ZIzo7aTnDq3ZGN9RnqKRkBRRmk4hjN1rFFWvQUYtA3XQQl85scE2XVGG/oURBoTW
EU4WwB2oMSsAVGkYHn02B4gFn8gO6hmA
=tZn3
-----END PGP SIGNATURE-----





Thread