1997-04-09 - Re: Anti-Spambot: what algorithm should be used?

Header Data

From: dlv@bwalk.dm.com (Dr.Dimitri Vulis KOTM)
To: freedom-knights@jetcafe.org
Message Hash: 78bd5c1a185b4b6e9cd0a32fe2568e74090654ce1aacd607d8cbf558e6b10abc
Message ID: <wVcV5D93w165w@bwalk.dm.com>
Reply To: <199704082129.QAA03372@manifold.algebra.com>
UTC Datetime: 1997-04-09 04:30:35 UTC
Raw Date: Tue, 8 Apr 1997 21:30:35 -0700 (PDT)

Raw message

From: dlv@bwalk.dm.com (Dr.Dimitri Vulis KOTM)
Date: Tue, 8 Apr 1997 21:30:35 -0700 (PDT)
To: freedom-knights@jetcafe.org
Subject: Re: Anti-Spambot: what algorithm should be used?
In-Reply-To: <199704082129.QAA03372@manifold.algebra.com>
Message-ID: <wVcV5D93w165w@bwalk.dm.com>
MIME-Version: 1.0
Content-Type: text/plain


> Another problem: what would prevent an anti-spam zealot from posting
> the following:
>
>     From: antispammer@usenet.cabal.com
>     Newsgroups: rec.sports.phishing,alt.fan.alice
>     Message-ID: <123@bob.server>
>
>     <something phishing-related>
>     --
>     Alice's ID# 123456  Drink Kaka Kola -- Kaka Cola is full of shit,
> 			do not drink it, I am just collecting $$ from
> 			stupid Alice, Hahaha!!!

Yes, Bob could post something like "drink kaka cola and die" and
"elect John M Gubor Sheriff if you want to be arrested".
Hopefully, most Bob's won't.

I guess the solution is for Alice's clients to pick regexps such
that _any_ publicity is good publicity - such as a URL or a 900 number.
If the regexp is "John Gilmore is a cocksucker", what can Bob add to it? :-)

> remember though that it will not be hard to modify newsreader clients to
> automatically killfile all articles matching Alice's regexps, or even to
> grab them automatically from her website. So Alice may discover that her
> business is not as effective.

Killfiles? So many scumbags whine about their inability to use them,
so they try to silence whomever they don't like.

Yes, if Alice just posts her regexps on a public Web page, a clueful
person can automatically fetch them and add them to his killfile or
even forge cancels for Bobs' articles that contain them.

I think this would be a stupid thing to do because Bobs' articles
may contain useful information besides Alice's ad. Also the regexp
might occur in a followup to Bob's article, which may contain
useful information.

But of course a reader is welcome to cut off his own nose to spite
and face and to killfile anything he likes - as long as he doesn't
prevent others from reading what they feel like reading.

Alice can try to keep her list of regexp's "confidential" and only
give it out to Bobs that are the prospective posters. I don't think
this would work. Someone would pose as a potential agent, obtain
the list of regexps, and post it (perhaps via an anonymous remailer).
So Alice might as well make it public.

Another possible candidate for killfile trigger is the "Alice ID".
Alice can make it harder by not using one (as I originally proposed)
but instead by recognizing the address "From:" header.

> marketing with e-cash. Suppose Alice runs an advertizing agency and gets
> paid to publicize certain messages of commercial or political nature.
>
> She sets up a Web page listing the catalog of regexps that she wishes to
> publicize. Something like:
>
> 1. Visit.*teens.*http://www.xxxfoo.bar
> 2. Elect.*John M. Grubor.*sheriff
> 4. For a drooling good time.*900-555-5555
> 3. Drink Kaka Cola
>
> and a price list for each regexp - say, $2 for posting this regexp in a
> moderated newsgroup, 50c for the Big 8, 25c for alt and regionals. I'm
> not sure if cross-posts should be discounted. Assuming there are several
> such Alices, the prices will be set by supply and demand.

Alice doesn't want to pay Bob for posting in alt.i.just.made.this.up.
Therefore Alice should provide the explicit list of newsgroups that
she monitors and for which she pays.

> Suppose Bob, in search of a few quick bucks, comes across Alice's site.
> We'd have to think of a protocol, but Alice assigns Bob an id which he
> must mention together with the regexps in order to get credit. Alice and
> Bob enter a contract. Bob puts one or more of Alice's regexps in his
> Usenet articles - most likely in the .signature. For example:
>
>    From: bob
>    Newsgroups: rec.sports.phishing,alt.fan.alice
>    Message-ID: <123@bob.server>
>    <something phishing-related>
>    --
>    Elect John M. Grubor sheriff!!!            Alice's ID# 123456
>    I am Bob! I am Bob! I am Bob the Poster!   Drink Kaka Kola

On the second thought, Alice may not need a prior contract with Bob,
nor the ID#. She can just list the regexps she's trying to promote
on her web page and go promiscuously by the address in the From: header.

> Alice's bot searches the Usenet feed (like K*bo and S*rd*r) for the
> regexps that she is paid to promote. When she encounters Bob's article,
> it extracts Bob's ID and the Message-Id, counts the distinct regexps, and
> the newsgroups, and credits Bob's account.
>
> If Carol follows up on Bob's article and quotes Bob's regexps and ID#,
> then Bob gets paid again. If somehow an article contains the ID# of two
> or more of Alice's agents, they split the fee.

And if we forego the ID# and just look at the From: header then it's
simpler: if Carol quotes Bob's article and includes a regexp, then Carol
gets an e-mail from Alice with some e-cash, perhaps sent via an
anonymous remailer.

One problem is that a lot of people put fake addresses in the from:
fields. Why waste good e-cash on e-mail to addresses that bounce?

What Alice could do is send an e-mail like
 "Your article with message-id blah posted to blah contains
  a message that I'm paying to promote. Here is a cookie;
  if you return this cookie, I'll send you $n e-cash."
If it bounces, or if the recipient never sends back the cookie - too bad.

> Alice's contract can also specify that if a third party forges a cancel
> within a week for Bob's Usenet article containing the regexp, then Alice
> will pay nothing and let Bob sue the forger for the lost income. (This
> may become moot as more and more ISPs ignore forged cancels.) This gives
> Bob the insentive to spam intelligently - not to trigger any cancelbots
> and not to have his plug pulled by his ISP.

After Alice's bot finds an eligible Usenet article, it should wait
a week or so to see if it was cancelled or superseded before issuing a
payment. If it's cancelled, then instead of payment it should send
a notice saying:
 "Your article with message-id blah posted to blah contains
  a message that I'm paying to promote. I would have paid
  you $n e-cash, but unfortunately a cancel has been issued
  for your article: <quote cancel>"
That should get Bob pretty mad at the 3rd party canceller. :-)

> Alice can also put some reasonable caps on the number of repetitions
> because if Bob posts the same regexp 10,000 times, the marginal exposure
> is less from Bob than from a newbie Carol. Again, if there are several
> such Alices, the market will take care of negotiating such details.

Alice needs to maintain a map of (poster,newsgroup,regexp) to # posts;
when it reaches 100, stop paying this poster. I think 100 mentions by
the same poster is way past saturation.

Ross: yes, unfortunately the spambot is on the back burner right now,
but I definitely will finish it eventually.

---

Dr.Dimitri Vulis KOTM
Brighton Beach Boardwalk BBS, Forest Hills, N.Y.: +1-718-261-2013, 14.4Kbps





Thread