1993-10-06 - Identifying GIFs, was Re: criminal gif upload

Header Data

From: “Pat Farrell” <pfarrell@gmu.edu>
To: ssteele@eff.org
Message Hash: 21f3640247b41926718f79bfa794c79c7e6b432a8d40be8162463b2347c02336
Message ID: <1743.pfarrell@gmu.edu>
Reply To: N/A
UTC Datetime: 1993-10-06 04:29:11 UTC
Raw Date: Tue, 5 Oct 93 21:29:11 PDT

Raw message

From: "Pat Farrell" <pfarrell@gmu.edu>
Date: Tue, 5 Oct 93 21:29:11 PDT
To: ssteele@eff.org
Subject: Identifying GIFs, was Re: criminal gif upload
Message-ID: <1743.pfarrell@gmu.edu>
MIME-Version: 1.0
Content-Type: text/plain


In message Tue,  5 Oct 1993 17:11:17 -0400 (EDT),
  Matthew J Ghio <mg5n+@andrew.cmu.edu>  writes:
>  Seriously tho, just posting a list of MS-DOS filenames is rather
> useless as filenames do get changed.  It is highly likely that a sysop
> or user might have changed the filenames to something else, especially
> if their operating system supported filenames longer than 8 characters.

Doesn't this bring up a fundamental question: when are two files equivalent?
We can easily use MD5 or brik to identify identical files.
But GIFs and other image files (MPEG, JPEG, TIFF, etc.) are subject to both
lossy compression and steganographic coding techniques.
If you change one pixel of the background, the checksums are different, but
it will still show *porm or whatever to a judge who "knows it when he sees
it."
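
The checksum point can be illustrated with a minimal sketch, assuming
Python's standard hashlib module. The byte strings are hypothetical
stand-ins for image data, not real GIF files, and the helper name md5_hex
is made up for this example.

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """Return the MD5 digest of data as a 32-character hex string."""
    return hashlib.md5(data).hexdigest()


# Hypothetical stand-in for the bytes of an image file.
original = bytes(range(256)) * 16

# A byte-for-byte copy, as if the file had only been renamed.
renamed_copy = bytes(original)

# Flip a single bit, roughly like changing one background pixel.
altered = bytearray(original)
altered[100] ^= 0x01
altered = bytes(altered)

same = md5_hex(original) == md5_hex(renamed_copy)
differs = md5_hex(original) != md5_hex(altered)
```

The renamed copy hashes to the same value because the filename is not part
of the hashed data, while the one-bit change yields a different digest even
though a viewer would show essentially the same picture.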

Using strong hash functions, we can show that the chance of two different
files sharing a hash is statistically insignificant. Can we find a way to
statistically prove "looks like" on a numerical basis?
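
One way to put a number on "looks like" is a perceptual hash. The sketch
below uses an "average hash" with a Hamming distance; this is an
illustrative technique chosen for this example, not something proposed in
the message, and it is not a statistical proof. The function names and the
synthetic 8x8 grayscale grids are assumptions; a real image would first
have to be decoded and downscaled.

```python
def average_hash(pixels):
    """Map a small grayscale grid (values 0..255) to a list of bits.

    Each bit records whether a pixel is brighter than the grid's mean.
    """
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    return [1 if p > mean else 0 for p in flat]


def hamming(a, b):
    """Count the positions at which two equal-length bit lists differ."""
    return sum(x != y for x, y in zip(a, b))


# Synthetic 8x8 gradient standing in for a downscaled image (assumption).
image = [[(r * 8 + c) * 4 for c in range(8)] for r in range(8)]

# Same image with one pixel nudged slightly.
tweaked = [row[:] for row in image]
tweaked[0][0] += 3

# A visibly different image: the negative of the original.
inverted = [[255 - p for p in row] for row in image]

d_near = hamming(average_hash(image), average_hash(tweaked))
d_far = hamming(average_hash(image), average_hash(inverted))
```

A small distance suggests the two grids look alike and a large one suggests
they do not; where to draw the line between the two is a judgment call that
this sketch does not settle.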

Pat

Pat Farrell      Grad Student                 pfarrell@cs.gmu.edu
Department of Computer Science    George Mason University, Fairfax, VA
Public key available via finger         #include <standard.disclaimer>





Thread