1994-11-27 - WWW “remailers”

Header Data

From: Hal <hfinney@shell.portal.com>
To: cypherpunks@toad.com
Message Hash: afa724e409c5109d1bf302fbea7a98fa5e09b3e51416f7ec5d2aebeb40e440e0
Message ID: <199411270704.XAA21510@jobe.shell.portal.com>
Reply To: N/A
UTC Datetime: 1994-11-27 07:04:49 UTC
Raw Date: Sat, 26 Nov 94 23:04:49 PST

Raw message

From: Hal <hfinney@shell.portal.com>
Date: Sat, 26 Nov 94 23:04:49 PST
To: cypherpunks@toad.com
Subject: WWW "remailers"
Message-ID: <199411270704.XAA21510@jobe.shell.portal.com>
MIME-Version: 1.0
Content-Type: text/plain


We have had some discussions here about privacy of accesses on the World
Wide Web.  Presently servers get a variable amount of information about
the people accessing their sites, depending on the particular software
being used and how it is configured.  This is potentially harmful to the
privacy of WWW users in that their access information can be recorded,
etc.  Here are some things you can do to reduce this problem.

First, try connecting to:

   http://www.uiuc.edu/cgi-bin/printenv

This just displays environment variables, which shows what information
about you is being received by servers.  Look particularly at the lines
reading HTTP_FROM and REMOTE_HOST.  These may contain your user name and
computer address.

You may be able to remove your user name information.  Some clients,
including, I am told, NetScape and version 2 of Mosaic for Mac/Windows,
allow you to set your email address, which is handy, but then they send
it along to servers, which is harmful to your privacy.  You might want to
consider not setting this field and using other programs for sending
mail.  Also if people complain about this then perhaps the makers of this
software will add an option to suppress sending the info.

Even if you don't see your name in HTTP_FROM it still may be possible
for somewhat more sophisticated programs to log your access if the
REMOTE_HOST information is correct and you are running on a Unix system
or something similar.  This is done via the identd service if that is
running on your computer.  The server can use this service to ask for
your user name once you are connected.  One way to see if identd is
running on your computer is to telnet to your own computer on port 113
and see if anything is there (telnet <your-computer-name> 113).  If so
then this is potentially another privacy exposure.

I have recently been experimenting with using "proxy servers" to remove
even the REMOTE_HOST information from the server's view.  Proxy servers
are servers which basically receive WWW connections and pass them along.
Then when the data comes from the remote site they pass it back to the
originating user's site.  Because the proxy server is in the middle the
remote site never sees the host name of the originating user.  In this
respect they are somewhat similar to our cypherpunk remailers, hence the
title of this article.

(The purpose of proxy servers has nothing to do with this function; they
are designed to allow easy WWW access from users who are on firewalled
sites.  But they happen to serve our purposes as well.)

Interestingly, the standard nntpd (nntp daemon, the master server which
runs on a site which offers web pages) from CERN includes proxying
capability automatically!  All you have to do is to add a few lines to the
configuration file.  If this idea proves sound, perhaps some people
running nntpd will enable proxies and serve as "remailer operators of the
web".

Normally proxy servers are configured to pass connections only from the
machines they are there to serve (at least, they can be configured that
way; I don't actually know how careful people are about this).  But
luckily I have found that the CERN proxy server itself accepts
connections from anybody (at least, it accepts them from me!).  So this
is useful for doing experiments.

And, the great part is, almost all web clients are set up now for proxy
support.  The way you enable it varies from client to client.  I believe
most of the Mac and Windows clients have a preferences box which allows
you to put in the address of your proxy server.  On Unix, you can set
environment variables.  Here is the suggestion from the web page at
CERN:

    #!/bin/sh
    http_proxy="http://www.cern.ch:911/";   export http_proxy
    ftp_proxy="http://www.cern.ch:911/";    export ftp_proxy
    gopher_proxy="http://www.cern.ch:911/"; export gopher_proxy
    wais_proxy="http://www.cern.ch:911/";   export wais_proxy
    exec Mosaic

This is a little shell script which runs Mosaic, first setting four
environment variables to "http://www.cern.ch:911/", which is the proxy
server I was referring to, the one which accepts connections from the
rest of the world.

For the purpose of the experiment, only http_proxy needs to be set.
Try setting that one and then run lynx or mosaic on your unix
workstation, and connect to the printenv URL above.  Compare the
information that is shown from what you got earlier without the
environment variable.  Similarly, on other machines, try the printenv
test with and without proxy serving enabled using the CERN proxy.
I find that the proxy server does in fact prevent the remote site from
seeing my computer's address, and without that the IDENTD can't be used
to reveal my name.

This technique has many ramifications.  For example, if a US proxy server
were available, ftp could be done via Mosaic to sites which only allowed
connections from American computers.  People have been talking about
writing special IP redirectors for this, but here it turns out the
capability has been around all along.

I got my information about proxies by reading:
http://info.cern.ch/hypertext/WWW/Proxies/.  Specific information on
configuring CERN nntpd as a proxy server is in:
http://info.cern.ch/hypertext/WWW/Daemon/User/Proxies/Proxies.html.

Modifications to the proxy server code would be necessary to provide some
additional features, such as support of encryption between user and proxy
server (via the SHTTP protocol extensions, perhaps; this way you could
get local privacy even when connecting to servers which did not support
encryption), or possibly chaining of proxies.  I think this is a fertile
area for discussion and further work.

Hal





Thread