1997-01-07 - Re: PWL’s how ?

Header Data

From: Neil Tunnicliffe <intaus2@ois.net.au>
To: cypherpunks@toad.com
Message Hash: 56d14761dc71e6c28c31c89e2ce2656c5fee2e80e130897343abb7188ec85b87
Message ID: <199701070519.NAA00657@hawk.ois.net.au>
Reply To: N/A
UTC Datetime: 1997-01-07 05:20:36 UTC
Raw Date: Mon, 6 Jan 1997 21:20:36 -0800 (PST)

Raw message

From: Neil Tunnicliffe <intaus2@ois.net.au>
Date: Mon, 6 Jan 1997 21:20:36 -0800 (PST)
To: cypherpunks@toad.com
Subject: Re: PWL's how ?
Message-ID: <199701070519.NAA00657@hawk.ois.net.au>
MIME-Version: 1.0
Content-Type: text/plain


At 21:34 2/01/97 -0500, you wrote:
>How has the Windows PWL's been de-crytped ?  
>
>Also I don't like to seem too much like a rookie, but was is  a polymorphic
>virus ?
>                   .           Marc Theriault     *       .   *          +
>   *      .    *             Email: WILL@BBSI.NET        *    .   * .
>                   .+                ___            .
> .    *   +             ___....-----'---'-----....___        .    .
>                  =========================================       +  *
>   *           .         ___'---..._______...---'___        * .    .    
>

Read this for an understanding of a polymorhphic virus:

ÉÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍ"
º A GENERAL DESCRIPTION OF THE METHODS BEHIND A POLYMORPH ENGINE º
ÉÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍ1/4

A small glossary of terms:
ÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄ

      ENCRYPT     = Transform from it's original form to an altered form.
      DECRYPT     = Transform from it's altered form to it's original form.
      KEY         = The register or value used to encrypt/decrypt with.
      SLIDING KEY = A KEY value that is INCREASED or DECREASED on each loop.
      COUNT       = The number of bytes in the encrypted code or data.
      INDEX       = A pointer to the encrypted code or data.
      SIGNATURE   = A unique group of bytes that can be used to check against
                    a programs content in the hope of detecting a particular
                    program.
      HEURISTIC   = A set of well defined rules to apply to a problem in the
                    hope of achieving a known result.

Question:  What is a Polymorph?
ÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄ
Answer:    Well, the Longman English Dictionary defines it as:

   "POLYMORPHOUS also POLYMORPHIC adj fml or tech.
     EXISTING IN VARIOUS DIFFERENT FORMS."

In other words, something that has the ability to change it's shape.  Other
ways to describe such a thing might be;  Mutable, Metamorphic, Etc...

Question:  What is a Polymorph Engine?
ÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄ
Answer:    A program with the abilities to encrypt (or jumble up) another
           program or data and provide a unique decryptor for it, it must
           do this in such a way that no two encryptions of the same program
           or data will look alike.

Example:  Take the following ultra-simple decryptor:

               MOV      SI,jumbled_data     ;Point to the jumbled data
               MOV      CX,10               ;Ten bytes to decrypt
main_loop:     XOR      BYTE PTR [SI],55    ;XOR (un_scramble!) a byte
               INC      SI                  ;Next byte
               LOOP     main_loop           ;Loop for the 9 remaining bytes

This small program will XOR the ten bytes at the location pointed to by SI
with the value 55.  Providing the ten bytes were XORed with 55 prior to
running this decryptor the ten bytes will be restored to their original
state.  If you are unsure as to why this is, brush up on your XOR logic!!

Ok, so you might say that if you change the KEY value on each generation it
will become Polymorphic?  Well, yes and no!  If you did that, the encrypted
portion would be Polymorphic, but the decryptor would still remain mostly the
same, the only change begin the KEY value!  So, a signature scanner that
allows WILDCARDS (and most do!) would still be able to find your decryptor!

One way you could fool some signature scanners is to swap around some of the
instructions.  So, with this in mind, the above decryptor might look like:

               MOV      CX,10
               MOV      SI,jumbled_data
main_loop:     XOR      BYTE PTR [SI],55
               INC      SI
               LOOP     main_loop

As you can see, still not much of a change, not really enough to fool some of
the better signature scanners.

"GET TO THE POINT!  WHAT IS A TRUE POLYMORPH?", I hear you cry!
ÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄ

Well, a "true" Polymorph would be a decryptor that looks completely different
on each generation!  Take the following decryptor:

               MOV      CX,10
               NOP
               NOP
               MOV      SI,jumbled_data
               NOP
main_loop:     NOP
               NOP
               XOR      BYTE PTR [SI],55
               NOP
               INC      SI
               NOP
               NOP
               NOP
               NOP
               LOOP     main_loop

This decryptor is the same as the one before it, but it has has a few random
NOP instructions peppered throughout itself.  On each generation you would
vary the amount of NOPs after each instruction.  This is a Polymorph in it's
simplest form.  Still, most of the good signature scanners would have no
problem with such a simple Polymorph.  They would simply skip the NOPs, thus
having a clear view of the decryptor, to which they could apply a signature!

No, a "true" Polymorph has to be far far more complex then this!  Instead of
peppering NOPs throughout the decryptor it would pepper totally random amounts
of totally random 8086 instructions, including JUMPS and CALLS.  It would
also use a different main decryptor (possibly from a selection of pre-coded
ones) and would alter all the registers that the decryptor uses on each
generation, making sure that the JUNK code that it generates doesn't destroy
any of the registers used by the real decryptor!  So, with these rules in
mind, here is our simple decryptor again:

               MOV      DX,10              ;Real part of the decryptor!  
               MOV      SI,1234            ;junk
               AND      AX,[SI+1234]       ;junk
               CLD                         ;junk
               MOV      DI,jumbled_data    ;Real part of the decryptor!
               TEST     [SI+1234],BL       ;junk
               OR       AL,CL              ;junk
main_loop:     ADD      SI,SI              ;junk instruction, real loop!
               XOR      AX,1234            ;junk
               XOR      BYTE PTR [DI],55   ;Real part of the decryptor!
               SUB      SI,123             ;junk
               INC      DI                 ;Real part of the decryptor!
               TEST     DX,1234            ;junk
               AND      AL,[BP+1234]       ;junk
               DEC      DX                 ;Real part of the decryptor!
               NOP                         ;junk
               XOR      AX,DX              ;junk
               SBB      AX,[SI+1234]       ;junk
               AND      DX,DX              ;Real part of the decryptor!
               JNZ      main_loop          ;Real part of the decryptor!

As you should be able to see, quite a mess!!  But, still executable code.
It is essential that any junk code generated by the Polymorph Engine is
executable, as it is going to be peppered throughout the decryptor.  Note, in
this example, that some of the junk instructions use registers that we are
using in the decryptor!  This is fine, providing the values in these
registers aren't destroyed.  Also note, that now we have random registers and
random instructions on each generation it makes signature scanning (even for
the clever signature scanners) impossible!  Instead, an HEURISTIC method must
be used, which can lead to false alarms.

So, a Polymorph Engine can be summed up into three major parts:
ÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄ

  1 .. The random number generator.
  2 .. The junk code generator.
  3 .. The decryptor generator.

There are other discrete parts but these three are the ones where most of the
work goes on!

How does it all work?  Well, SMEG goes about generating random decryptors in
the following way:

  1 .. Chooses a random selection of registers to use for the decryptor.
       Leaving the remaining registers as "junk" registers for the junk code
       generator.

  2 .. Chooses one of the compressed pre-coded decryptors.

  3 .. Goes into a loop generating the real decryptor, peppered with junk
       code.

To understand how the selected registers are slotted into the decryptors and
the junk code you must look at the 8086 instructions from a binary level:

      XOR   AX,AX    =    00110001 11000000
      XOR   AX,CX    =    00110001 11001000
      XOR   AX,DX    =    00110001 11010000
      XOR   AX,BX    =    00110001 11011000

You should be able to see a pattern in the binary code for these four 8086
instructions?  Well, all 8086 instructions follow logical patterns, and it is
these patterns that tell the 8086 processor which registers/addressing mode
to use for a particular instruction.  The total amount of instruction formats
and the precise logic regarding the patterns is too complex to go into here. 
However, all good 8086 tutorials/reference guides will explain in full.

SMEG exploits this pattern logic to generate junk code and decryptors with
random registers, as the patterns directly relate to the registers Etc.

SMEG generates junk code in the following way:
ÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄ

Inside SMEG there is a table of the basic binary patterns for all of the 8086
instruction set, but with one important difference, all the register/address
mode bits are zero.  This is called the SKELETON INSTRUCTION TABLE.  The
table also contains various other bytes used by SMEG to determine the
relevant bit positions to "plug in" the register bit patterns.  These
patterns are plugged in via the logic processes OR and AND.  Using this
method, SMEG can generate endless amounts of random 8086 instructions without
destroying any of the registers used by the decryptor proper.
SMEG also contains some discrete logic for producing false CALLS to dummy
subroutines and also false conditional JMPS around the junk code.

SMEG generates the decryptor proper in the following way:
ÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄ

Inside SMEG there is a table containing a selection of common 8086
instructions used in decryptors, such as XOR [index],reg Etc.  These are,
again, stored in SKELETON FORM with some control bytes used by the decryptor
generator.  Also, inside SMEG, there are several pre-coded decryptors stored
in a compressed form.  On average, a complete decryptor can be described to
the decryptor generator in as few as 11 bytes and adding to the list of
pre-coded decryptors is both painless and economical with space!

SMEG generates the Polymorphed decryptor in the following way:
ÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄ

First it chooses, at random, one of the pre-coded compressed decryptors. 
Next it goes into a loop uncompressing each decryptor instruction, plugging
in the required registers, storing it and then generating (for each real
instruction) a random amount of random instructions.  This loop repeats until
the complete decryptor has been constructed.  The final result is a random
size, random register, random patterned decryptor!

It should also be noted that whenever SMEG generates an INDEXed instruction
it uses either SI, DI or BX at random, also it sometimes uses a random offset.
For example, say the encrypted code started at address 10h, the following
could be used to index this address:

       MOV   SI,10h     ;Start address
       MOV   AL,[SI]    ;Index from initial address

But sometimes SMEG will generate something like the following, again based on
the encrypted code starting at address 10h:

       MOV   DI,0BFAAh      ;Indirect start address
       MOV   AL,[DI+4066h)  ;4066h + 0BFAAh = 10010h (and FFFF = 10h)!!

These indexed and initial values are picked at complete random, and the
examples of 0BFAAh and 4066h are valid, but next time they will be completely
different!

The following are two decryptors that were generated with my SMEG Polymorph
Engine.  It should be noted that I generated 4000 examples with no two alike!
Unfortunately I ran out of hard drive space!  But it is fairly safe to say
that the total number of decryptor combinations would run into the BILLIONS!

All the lines marked with ";junk" in the following listings indicate random
junk instructions that were inserted throughout the actual decryptor, note
that SMEG has the ability to generate junk CALLS to false SUBROUTINES, as
well as general junk conditional jumps!  All lines marked with a * indicate
an actual part of the decryptor proper.  I chose the two generations shown
because their sizes were similar, 386 and 480 bytes.  SMEG produces
decryptors ranging in size from as little as 288 to as much as 1536 bytes.
Even if two decryptors are generated that are the same size the chances of
them being the same are, literally, billions to one!


Neil Tunnicliffe.






Thread