1997-08-08 - a decentralised dejanews

Header Data

From: Adam Back <aba@dcs.ex.ac.uk>
To: cypherpunks@cyberpass.net
Message Hash: cf8472ef6c28f01ed9976fe5b8dc1f3e1a7f50089ab85ab8e4d641673e3ddb77
Message ID: <199708081640.RAA00985@server.test.net>
Reply To: N/A
UTC Datetime: 1997-08-08 17:54:46 UTC
Raw Date: Sat, 9 Aug 1997 01:54:46 +0800

Raw message

From: Adam Back <aba@dcs.ex.ac.uk>
Date: Sat, 9 Aug 1997 01:54:46 +0800
To: cypherpunks@cyberpass.net
Subject: a decentralised dejanews
Message-ID: <199708081640.RAA00985@server.test.net>
MIME-Version: 1.0
Content-Type: text/plain




As you'll know if you read some of the previous eternity verbiage I've
generated over the last few days, the design is evolving as ideas are
bounced around.

To recap briefly:

  The eternity implementation at the moment is performing the function
  of a cache of USENET, which also `reformats' messages to extract and
  deliver the content as web documents.

  The cache has no replacement policy yet.  The planned replacement
  policies were to discard the least accessed documents, and/or the
  documents accompanied by the smallest ecash payment.

Obvious problem:

  The cache will fill up and material will be discarded when the
  eternity document store grows too large for one machine to cache.

Solution:

  My last thought on how to fix this was to have collaboration
  between the eternity servers: they ask each other for documents
  they don't have.  Together they act like a distributed raid file
  server with redundancy.
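
  A rough sketch of that lookup logic in Python (the peer list, URL
  scheme, and cache structure are made up for illustration, not from
  any running code):

    import urllib.request

    PEERS = ["eternity1.example.org", "eternity2.example.net"]  # hypothetical

    def fetch_from_peer(peer, doc_id):
        """Ask one collaborating server for a document we lack."""
        try:
            with urllib.request.urlopen("http://%s/eternity/%s" % (peer, doc_id)) as r:
                return r.read()
        except OSError:
            return None

    def lookup(doc_id, local_cache):
        """Serve from the local cache, else fall back to the peers."""
        if doc_id in local_cache:
            return local_cache[doc_id]
        for peer in PEERS:
            doc = fetch_from_peer(peer, doc_id)
            if doc is not None:
                local_cache[doc_id] = doc   # cache it for next time
                return doc
        return None                         # fallen off every cache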

  Clearly with the "least accessed" replacement policy they'll all
  likely cache similar material, and documents will still fall off the
  cache.

  Ecash payment for caching might work reasonably well, if you buy as
  much redundancy as you wish by including cash for several servers.

  Or perhaps a random cache replacement policy is the simplest way to
  keep an even spread of availability of documents.
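
  To make the three candidate policies concrete, a minimal Python
  sketch (the hits/paid bookkeeping is assumed, nothing like it is
  implemented yet):

    import random

    def evict_least_accessed(cache, hits):
        """Drop the document fetched least often."""
        victim = min(cache, key=lambda d: hits.get(d, 0))
        del cache[victim]

    def evict_smallest_payment(cache, paid):
        """Drop the document accompanied by the least ecash."""
        victim = min(cache, key=lambda d: paid.get(d, 0))
        del cache[victim]

    def evict_random(cache):
        """Random replacement: caches don't converge on the same
        documents, so the spread of availability stays even."""
        del cache[random.choice(list(cache))]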

  k of n secret sharing could be used at some overhead so that no one
  site is responsible for serving a given document.
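
  For concreteness, a toy Shamir k-of-n split over a prime field.
  In practice you would share a key and encrypt the document under
  it; this sketch assumes the secret fits in one field element:

    import random

    PRIME = 2**127 - 1   # a Mersenne prime, big enough for a small secret

    def split(secret, k, n):
        """Make n shares of which any k reconstruct the secret."""
        coeffs = [secret] + [random.randrange(PRIME) for _ in range(k - 1)]
        f = lambda x: sum(c * pow(x, i, PRIME)
                          for i, c in enumerate(coeffs)) % PRIME
        return [(x, f(x)) for x in range(1, n + 1)]

    def combine(shares):
        """Lagrange interpolation at x = 0 recovers the secret."""
        secret = 0
        for xi, yi in shares:
            num = den = 1
            for xj, _ in shares:
                if xj != xi:
                    num = num * (-xj) % PRIME
                    den = den * (xi - xj) % PRIME
            secret = (secret + yi * num * pow(den, PRIME - 2, PRIME)) % PRIME
        return secret

  So with k = 3, n = 5, any three servers' shares rebuild the
  document, and no single server holds it.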

  However, it is fairly complex to design and implement such a system
  so that it provides full, undisturbed service with nodes
  dynamically leaving and joining, and it is also difficult to
  protect fully against denial of service attacks.

Recap on old idea of using dejanews and altavista

  You will recall that earlier discussions talked about using
  dejanews or altavista as archives to fetch documents from.  I
  rejected this initial idea because altavista and dejanews take
  steps to reduce their disk space usage, by stripping out UU and
  radix-64 encoded documents.  We could fool them easily enough, but
  that means waging a running battle: increasingly sophisticated
  textual stego on our side versus ever cleverer eternity document
  spotters on theirs.  We might even win long term if we public key
  encrypted the documents for specific eternity servers.  Kind of
  messy.  And old content will disappear if they go back over the
  archive with the new filters.

  However, there is another important reason not to use altavista or
  dejanews: they're not decentralised.  If something sufficiently hot
  gets published, they'll both be taken off line long enough to
  figure out how to disable access to eternity.  Even if they
  ultimately can't do it, because of sufficiently advanced public key
  text steganography, it will disrupt the service.


Viewing the situation more modularly:

  The eternity service with a distributed raid file server formed
  from the respective caches has one problem: acceptability.  Each
  server is actually holding eternity material; it tries to get away
  with this by claiming to be just a cache.  However, each server is
  serving three potentially separable functions:

    - document cache

    - node in a distributed news archival service, but only for
      messages which look like eternity articles

    - a view on the database to allow it to be browsed interactively
      as web documents (fetching and decoding eternity documents to
      serve them as web pages)

  The first function -- a document cache -- could probably just as
  well be served by a standard proxy cache.  Not a big issue, it's
  easy either way.

  The second function makes it hard to claim you are `just caching
  USENET' -- you're also creating a distributed archive of a
  very particular subset of it.

  And the third function, viewing the database, is somewhat like a
  specialised search engine.  The search engine functionality is 100%
  replicatable at practically zero storage overhead, and works well
  as a non-censorable element of the system.  With 100% redundancy of
  service, each node holds nothing that can't be replaced.
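
  That replicability is easy to see: any node can rebuild the entire
  index from the archive itself.  A sketch (the article fields are
  invented for illustration):

    def rebuild_index(archive):
        """Rebuild the eternity-name -> message-id index from
        scratch; the view/search layer holds nothing irreplaceable."""
        index = {}
        for msg_id, article in archive.items():
            name = article.get("eternity_name")
            if name:
                index[name] = msg_id
        return index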

A distributed deja-news replacement

  So after separating the functionality it becomes apparent that we
  could have another model.  Actually try to start a distributed
  deja-news replacement: a complete eternal record of _everything_
  that _ever_ gets posted to news.  Dejanews is kind of doing that,
  except that it's not complete -- they are stripping out uuencoded
  files, etc.

  Also this would be distributed, and harder to coerce.

  So what are the storage requirements?  What does dejanews have in
  their machine room?  (Huge raid server?  How many Gigs?).

  I'm wondering if 1000 academic nodes, small and large ISP nodes,
  and individuals each contributing 1Gb could outperform deja-news.
  That'd be 1Tb.  How long would 1Tb last, being consumed at the rate
  of a full USENET feed?  As disks get cheaper over the next two
  years or so, people can upgrade to 10Gb disks or whatever is cheap
  at the time.
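
  Back of the envelope (the 2Gb/day feed size is my guess, not a
  measured figure):

    nodes         = 1000
    gb_per_node   = 1
    feed_gb_daily = 2.0    # guessed size of a full 1997 USENET feed

    total_gb = nodes * gb_per_node       # 1000Gb = 1Tb
    days = total_gb / feed_gb_daily      # ~500 days before it fills
    print("%dGb lasts about %d days" % (total_gb, days))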

  You would want some redundancy, more redundancy than dejanews needs
  in their raid fileserver, because nodes may come and go, and the
  number of nodes could fluctuate.
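
  To size that redundancy: if each document lives on r independent
  nodes, and any one node is unreachable with probability p, the
  document is unavailable with probability p^r.  For example (the
  p = 0.2 outage rate is pulled out of the air):

    p = 0.2                   # guessed chance a volunteer node is down
    for r in (1, 2, 3, 4):
        print("r=%d: P(unavailable) = %.4f" % (r, p ** r))

  so even a replication factor of 3 or 4 makes loss unlikely.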

On top of distributed news archive, build an eternity service

  So with the eternal news archive, you now have everything you need
  to easily build an eternity server.


How practical is this?

  How much interest is there likely to be in creating a full USENET
  archive as a distributed, net based effort?  It clearly has vastly
  larger storage requirements than storing the full set of eternity
  articles posted to USENET.  However it has wider uses, and has lots
  of innocuous reasons to exist.  I guess it's quite an interesting
  project in its own right.  You could start with just some
  newsgroups, and build up from there as a way to bootstrap it.  But
  it may be harder to generate enough interest to deploy this than to
  deploy enough eternity servers to build a distributed archive of
  just eternity articles.

How soon could it happen?

  Who knows.  What do people reckon the interest would be in a
  distributed, complete USENET archive for its own sake?  For the
  purposes of the archive, cancels would not be applied.  Indeed,
  archive the cancels too, and the control messages, for everyone to
  see in perpetuity who did and said what.

Best way forward?

  Perhaps it would be more realistic to build a distributed archive
  for eternity documents only, first.  If it is designed as a
  separate service, the same software could do either: just restrict
  it to alt.anonymous.messages, or to articles in
  alt.anonymous.messages which pass a filter program.
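
  The restriction itself is a one-screen filter.  Something like this
  (the article layout and the `Eternity:' marker are invented for
  illustration):

    def accept(article):
        """Archive only alt.anonymous.messages traffic that looks
        like an eternity document."""
        groups = article.get("Newsgroups", "").split(",")
        if "alt.anonymous.messages" not in [g.strip() for g in groups]:
            return False
        return article.get("Body", "").lstrip().startswith("Eternity:")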

Comments?

Adam
-- 
Have *you* exported RSA today? --> http://www.dcs.ex.ac.uk/~aba/rsa/

print pack"C*",split/\D+/,`echo "16iII*o\U@{$/=$z;[(pop,pop,unpack"H*",<>
)]}\EsMsKsN0[lN*1lK[d2%Sa2/d0<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<J]dsJxp"|dc`





