From: Marc Horowitz <marc@MIT.EDU>
To: cypherpunks@toad.com
Message Hash: 0ba472f7ef23a2410cc70cb7104ac1c5766180b12cf48d6d96bad77b9d728ee3
Message ID: <9311280438.AA13296@steve-dallas.MIT.EDU>
Reply To: N/A
UTC Datetime: 1993-11-28 04:39:19 UTC
Raw Date: Sat, 27 Nov 93 20:39:19 PST
From: Marc Horowitz <marc@MIT.EDU>
Date: Sat, 27 Nov 93 20:39:19 PST
To: cypherpunks@toad.com
Subject: New toy for those of you with real cash flow.
Message-ID: <9311280438.AA13296@steve-dallas.MIT.EDU>
MIME-Version: 1.0
Content-Type: text/plain
------- Forwarded Message
JOURNAL OF RECREATIONAL MATHEMATICS Volume 25, Number 1 -- 1993
Book Reviews, edited by Charles Ashbacher
Note: The following is neither a book nor a software review. It is a
review of a piece of hardware--but it should be of prime interest to
readers of this journal.
The Dubner PC Cruncher-A Microcomputer Coprocessor Card for Doing Integer
Arithmetic, H & R Dubner, 449 Beverly Road, Ridgewood, New Jersey 07450,
Telephone 1-201-652-1825. $2,500.
Need more processor speed for arithmetic operations? Try the Dubner
"Cruncher." This is an add-in board for IBM PC Compatibles (ISA bus) which
is designed to quickly add, subtract, multiply, and divide large numbers.
For example, to check a number of 1000 digits for probable-primality using
Fermat's theorem (a^p = a(mod p) whenever p is prime) takes about two
minutes on a typical 486DX machine, but less than five seconds if this same
machine has a Cruncher installed. The improvement in speed depends on the
size of the numbers. Numbers with 100 digits are multiplied about six
times as fast, numbers with 1000 digits about forty times as fast, and
those with 10,000 digits are multiplied about ninety-eight times as fast.
The Cruncher is currently limited to integer arithmetic. Its speed
comes from the use of a chip designed for real time digital signal
processing (LSI Logic's L64240 MFIR) which can perform 1.28 billion
multiply/add operations per second. Access is provided to this power on
two levels: through a menu interface and through a set of ANSI compatible
C language routines.
The menu-based interface is primitive but workable. It allows you to
enter numbers from the keyboard and then perform all the various Cruncher
operations, including the basics operations, using any of several
factorization methods, access to a primality proving algorithm, and even
the ability to screen for primes of certain forms, such as abc + 1 where
either a or c is incremented during the search.
Also included are the necessary tools for a programmer to access the
Cruncher's power from within their own C programs. This includes a set of
instructions similar to those in an assembly language, for example,
hmult(a,b,c) multiplies the numbers pointed to by b and c and places the
result in the location that a points to. The procedure hpowerm(a,b,c)
raises a to the b-th power modulo c and places the result in a. Functions
are included to do the basic arithmetic operations as well as such things
as calculating n!, ab(mod c), and the integer part of the n-th root of a
number. These instructions are easily used in any C program and make it
unnecessary for the user to know how the hardware actually works. I
learned C just to use the Cruncher in my own research and the speed was
well worth the effort!
The near super-computer speed provided by the Cruncher is very
impressive! For example, I recently used the Cruncher on my 386 machine to
search for the next prime of the form n!-1 . After about five days I found
it: 3507!-1 (with 10,912 digits.) This search would have taken about a
year using a very fast 486! Though it is not necessary, I usually use the
Cruncher from within the Microsoft Windows environment - that way it can
crunch away twenty-four hours a day in the background and I can still do
my word processing and programming in the foreground. However, Windows
does slow the Cruncher down a few percent.
The Cruncher's price, $2,500, is not cheap, but is definitely the
least expensive way to make your machine multiply and divide fifty times
faster. In short: 1) the speed is unbelievable; 2) the menu interface is
workable, but if your interests diverge too much from the Dubners', you
will need to be able to program in C; and 3) if you program in C, the
tools for accessing Cruncher's power are excellent.
Chris Caldwell
Mathematics and Computer Science
University of Tennessee at Martin
Martin, Tennessee 38238
The Dubner PC Cruncher
We can give you the computational power you've only dreamed of.
Dollar for dollar, we believe that we have built the most powerful
hardware available for doing multiple-precision integer arithmetic.
Here is a list of execution times for Fermat testing of numbers
with 1,000 decimal digits:
Vaxstation 3100/38 770 sec
Sun 4/330 377 sec
Decstation 3100 287 sec
IBM RS 6000/320 128 sec
486/33 128 sec
IBM 3090 19 sec
486/33 with Dubner PC Cruncher 4.4 sec
Fujitsu VP 2200/10 2.1 sec
You'll note that the Fujitsu VP 2200/10 can test a 1,000-digit
number faster than the Cruncher. But the Fujitsu is a multi-
million dollar supercomputer comparable to a four-processor Cray
XMP.
The PC Cruncher is an add-in board that plugs into your
IBM-compatible PC/AT. Its cost: $2,500.
Furthermore, the Cruncher becomes more efficient as the numbers
grow larger. At 3,000 digits, the Cruncher is slightly faster than
the Fujitsu. By 10,000 digits, the Cruncher is 1.5 times faster.
The chances are that you already have a PC which you probably turn
off at night. Install the Cruncher, run it during those nighttime
hours, and you can get the same amount of computing done as if you
had access to ten or more hours of supercomputer time.
How can we do this?
We are Harvey Dubner and Bob Dubner. Harvey is an electrical
engineer and computer systems designer with a long-term interest in
mathematical theory, particularly in number theory (with over
twenty published papers.) Bob is an electronic design engineer
with a lot of experience in high-speed digital systems. The PC
Cruncher comes from that combination of talent: Harvey has an
insatiable need for processing power, and Bob likes to design
powerful processors. Over the last ten years we have built and
programmed a succession of high-speed, number-crunching circuits
that augment the ability of personal computers to do multiple-
precision arithmetic.
With the previous version of the hardware, we discovered over half
of all the known prime numbers of more than 2,000 digits. With 632
numbers, we were by far the biggest contributor to that list. The
PC Cruncher is roughly ten times faster than the previous version,
and we are very excited about the contributions to theory that will
be made when many investigators have this kind of computational
power available.
Description of the board: The hardware consists of a full-sized
circuit board that requires a 16-bit slot in an IBM-compatible
PC/AT. At the heart of the design is a 64-tap Finite-Impulse-
Response filtering chip made by the LSI Logic Corporation. It is
this chip's ability to perform 1.28 billion multiply/add operations
per second that gives the PC Cruncher its remarkable performance.
(And it is the chip's $1,100 purchase price that makes the PC
Cruncher as expensive as it is.) The board dissipates about three
or four Watts, and has no special space, power, or cooling
requirements.
The PC Cruncher has 256K of on-board memory for storing operands,
and another 64K of high-speed memory for accumulating intermediate
results. All communication between the host and the Cruncher is
done with I/O ports -- the Cruncher doesn't use up any valuable
memory space.
The software for driving the Cruncher will run fastest on a 386 or
486 executing in protected mode. We use Symantec's Zortech C
compiler, because it easily and conveniently generates 32-bit
protected-mode code. All of the code has been written in strictly
ANSI-compatible C, with some MASM 5.1 assembly language.
When you buy the Cruncher, you will receive the board, complete
schematics, complete documentation for the programmable logic on
the board, complete source-level C and MASM code for driving the
board, and documentation for that code.
Please contact us for additional information.
Order from: ( USmail, Email, Fax, or telephone )
Dubner International, Inc.
13 Westervelt Place
Westwood, NJ 07675
Tel: 201-664-6434
Fax: 201-358-9377
Tel: Harvey Dubner 201-652-1825
Robert Dubner 201-664-6434
E-mail: Harvey Dubner 70372.1170@compuserve.com
Robert Dubner 73247.2334@compuserve.com
PC Cruncher Performance Benchmarking:
Assessing just how fast the PC Cruncher can calculate depends on many
things. The major factor is the size of the operands that are being
operated on.
Consider multiplication. The PC Cruncher multiplies using the schoolboy
algorithm -- it does long multiplication just like you do, except that
instead of multiplying and accumulating 2-digit products, the Cruncher
multiplies and accumulates 308-digit products. And it only takes the
hardware about 6.4 microseconds to multiply-and-accumulate a 308-digit
product. Unfortunately, it can take dozens of microseconds of fiddling
with the PC's sluggish ISA buss to set up the hardware to create that 308-
digit product. The efficiency at that point is relatively low. Even at
1,000 digits, the software is spending about half its time frantically
trying to get the hardware going. At about 4,000 digits, the effects of
overhead are reduced, but they continue to be significant until about
10,000 digits.
To give you some idea of what this means, let's look at the time needed to
square an N-digit number. Here is a comparison of a 486/33 with a PC
Cruncher, compared with a 486/33 running a highly-optimized multiple-
precision multiply routine:
Number of digits Time without Cruncher Time with Cruncher Factor
100 .000220 secs .000037 secs 6
500 .004000 .000165 24
1,000 .015000 .000357 42
5,000 .375000 .004400 85
10,000 1.500000 .015283 98
Division performance is even more complicated. The basic inner loop of
division is more complicated than that of multiplying, with even more
overhead. Dividing a 200-digit number by a 100-digit number is about 6
times slower than squaring a 100-digit number; at 2,000 by 1,000 digits
division is 3 times slower than the corresponding multiply, and at 20,000
by 10,000 digits division is 1.7 times slower. Added to all this is a
600-microsecond pre-division calculation time that must be added to all
randomly-started divisions. Happily, this calculation need only be done
once whenever repeated divisions by the same number are executed, as in
the heavily-used 'A**B MOD C' function.
In summary:
In the 50 to 200 digit range, a 486/33 PC equipped with a PC Cruncher
board will be 3 to 10 times faster than the same machine without the
Cruncher. The actual speed will depend on the size of the numbers and the
function mix. As you go up to 1,000 digits, performance will be 20 to 40
times faster. Past 4,000 or so digits performance will be 50 to 100 times
faster, which will put you in supercomputer territory.
If you start with a slower PC, the relative performance gains are even
more spectacular, since the Cruncher's performance is hindered only a
little by the slower processor once you get past a few hundred digits. At
100 digits, a Cruncher-equipped 386/20 will be about 10 to 30 times faster
than the same machine without the board, and at 4,000 digits and up will
be 150 to 300 times faster.
Dubner PC Cruncher -- Other Possibilities
RSA Encrypting/Decrypting
A 486/33 equipped with a Dubner PC Cruncher board can perform 1,024-bit
A**B MOD C calculations in about 0.473 seconds. You could therefore
decrypt an entire message at the rate of about 270 bytes per second --
ght times faster than an unaided 486/33.
re prepended to a
DES-encrypted messages, can get going in about one-half second instead
body out there have a network that needs a key
Need more speed? Let usere is enough interest, we
can build this same basic hardware onto an EISA slave board, instead of an
ISA board. We have yet to do a complete analysis, but our feeling is we'd
get about a three-fold performance increase in the 1,024-bit range. This
added speed would come from a reduction in overhead when manipulating
these smallish 308-digit numbers.
Need even more speed? Talk to us. It just costs money. The PC Cruncher
design would scale up nicely with two or four of the big FIR chips. For
somewhere between $6,000 and $10,000 you could blow the maintenance panels
off of anything. But we'd need a sponsor.
Used to working with workstations? Well, you could spend about $2,000 on
a 486/33, add in the $2,500 PC Cruncher, write some software and stick it
on the network as a number crunching server, and hardly notice the
difference. Or talk to us -- with sufficient interest we could build a
Cruncher with its own on-board RISC processor, and stick a SCSI port on it
so it'll plug right into a workstation. Not a bad idea -- we could get
rid of a lot of overhead that way, and get real improvements in the 100-
digit and up range. But again, we'd need a sponsor.
------- End of Forwarded Message
Return to November 1993
Return to “Marc Horowitz <marc@MIT.EDU>”
1993-11-28 (Sat, 27 Nov 93 20:39:19 PST) - New toy for those of you with real cash flow. - Marc Horowitz <marc@MIT.EDU>