Chapter 3

OllyMachine Assembly Language

Welcome to the real world!!!

OllyMachine Script is an assembly-like language, and it is the most important interface between OllyMachine and the programmer. Of course, you could write bytecode directly, but I don't think that's worth doing. ^_^ Actually, writing an assembler is much harder than writing a Virtual Machine; I spent more than two weeks on this assembler.

3.1  Basic element

Every OllyMachine assembly source file can be broken down into a series of statements. Every statement occupies a whole line, and there is no maximum line length. A statement takes one of two forms, instruction or comment, described in 3.1.1 and 3.1.2.

Attention please: all instructions in OllyMachine assembly language are case-insensitive.

3.1.1  Instruction

An instruction consists of an opcode and zero or more operands.

Instruction = Opcode [ Operand1 [Operand2 ... ] ]

3.1.1.1  Opcode

An opcode is a symbol that defines the instruction's operation and the format of the whole instruction. Each opcode defined for OllyMachine assembly language fully determines the number of operands and their types.

Please see the two instructions below:

add reg00, 0x1234
add reg01, reg00

The "add" opcode requires a general register, a comma, and then an immediate digit or another general register. It also defines the operation: add 0x1234 to reg00, or add the value in reg00 to reg01. So the "add" opcode gives not only the instruction's operation but also the whole format of the instruction.

3.1.1.2  Operand

An operand supplies the data an instruction operates on. OllyMachine assembly language recognizes three forms of operand, shown below:

Operand Form    Example
Register        reg00, eax
Identifier      loop1, _continue
Digit           -100, 0x100
Figure 5:  OllyMachine assembly-language's operand forms

Register

The OllyMachine Virtual Machine has 83 general registers and 3 hidden registers. Among the general registers:

reg00, reg01, reg02 ... reg64, FreeBufferReg, FreeBufferSizeReg

The 67 registers above can be used freely by programmers. FreeBufferReg and FreeBufferSizeReg have particular uses:

FreeBufferReg points to a 4 KB buffer used for the script's decoding and for temporary data storage. FreeBufferSizeReg holds the current size of FreeBuffer. These two registers are very useful when you want to get a string.

Attention!

Programmers should avoid operating directly on FreeBufferReg and FreeBufferSizeReg wherever possible, because they are generally handled by the corresponding APIs. Modifying their values directly may cause strange results.

Another 9 registers are:

eax, ecx, edx, ebx, esp, ebp, esi, edi, eip

Programmers should operate on these nine registers very carefully, because every operation on them is applied to the process OllyDbg is currently debugging. For example:

mov reg00, eax

This puts the value of the eax register of the process being debugged into reg00.

mov eax, 0x12345678

This puts 0x12345678 into the eax register of the process being debugged.

The remaining 7 registers are:

CF, PF, AF, ZF, SF, DF, OF

These 7 registers correspond to the flags of OllyDbg's eflags register. For example:

not cf
mov zf, 0
mov pf, 1

Identifier

An identifier is a sequence of characters used to name a LABEL. The first character must be a letter (a-z or A-Z) or an underscore ("_"). The remaining characters can be letters, underscores or digits.

Here are some valid identifiers:

_1continue
exit0
loop1

Here are some invalid identifiers:

1continue
.exit0
my_#_loop2
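The identifier rule above can be written as a single regular expression. The following is a minimal Python sketch, not the check used by the OllyMachine assembler; the names IDENTIFIER_RE and is_valid_identifier are hypothetical. It tests only the lexical rule, so a register name such as eax would also pass.

```python
import re

# Rule from the text: the first character is a letter or "_",
# the remaining characters are letters, "_" or digits.
IDENTIFIER_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_valid_identifier(name: str) -> bool:
    """Return True if name satisfies the identifier rule described above."""
    return IDENTIFIER_RE.fullmatch(name) is not None


if __name__ == "__main__":
    for name in ["_1continue", "exit0", "loop1", "1continue", ".exit0", "my_#_loop2"]:
        print(name, is_valid_identifier(name))
```

Run on the six names listed above, the sketch accepts the first three and rejects the last three.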

Digit

There are two forms of digit: decimal and hex.

Decimal digits needn't have any prefixes. For example:

100, -1234

Hex digits must have prefix "0x" or "0X". For example:

0x100, -0x1234

Both decimal and hex digits can be positive or negative. A positive digit must not carry a "+" prefix (the assembler rejects it); a negative digit needs the "-" prefix. Attention: digits cannot be greater than 0xFFFFFFFF.
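To make the digit rules concrete, here is a minimal Python sketch of a parser that follows them. It is an illustration only, not the assembler's real code; parse_digit and MAX_MAGNITUDE are hypothetical names. One assumption is labeled in the code: the 0xFFFFFFFF limit is applied to the magnitude before the sign, which the text does not spell out.

```python
import re

MAX_MAGNITUDE = 0xFFFFFFFF  # upper limit stated in the text

# Optional "-", then either "0x"/"0X" plus hex digits, or plain decimal digits.
# A leading "+" does not match, mirroring the assembler's rejection of it.
DIGIT_RE = re.compile(r"(-?)(?:0[xX]([0-9A-Fa-f]+)|([0-9]+))")


def parse_digit(text: str) -> int:
    """Parse a decimal or hex digit literal and return it as a signed int."""
    match = DIGIT_RE.fullmatch(text)
    if match is None:
        raise ValueError(f"not a valid digit: {text!r}")
    sign, hex_part, dec_part = match.groups()
    magnitude = int(hex_part, 16) if hex_part is not None else int(dec_part, 10)
    # Assumption: the limit applies to the magnitude, before the sign.
    if magnitude > MAX_MAGNITUDE:
        raise ValueError(f"digit out of range: {text!r}")
    return -magnitude if sign else magnitude


if __name__ == "__main__":
    for literal in ["100", "-1234", "0x100", "-0x1234"]:
        print(literal, parse_digit(literal))
```

Literals such as "+100" or "0x100000000" raise ValueError in this sketch.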

3.1.2  Comment

A comment is text that the assembler ignores. Comments make the source code more readable. There are two forms of comment:

Line comment:

// line comment 1.
; line comment 2.

Block comment:

/*
   this is a block comment.
*/
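As a rough illustration of how both comment forms could be removed before assembling, here is a minimal Python sketch. It is not OllyMachine's assembler, and the function name strip_comments is hypothetical. The sketch is deliberately naive: it treats every ";" or "//" as the start of a comment, even inside a quoted string, which a real assembler would have to handle.

```python
import re


def strip_comments(source: str) -> str:
    """Remove block comments and line comments from script source text."""
    # Remove /* ... */ block comments first; they may span several lines.
    source = re.sub(r"/\*.*?\*/", "", source, flags=re.DOTALL)
    # Then drop everything from "//" or ";" to the end of each line.
    lines = [
        re.split(r"//|;", line, maxsplit=1)[0].rstrip()
        for line in source.splitlines()
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    script = "mov reg00, 1 // set\n/* block\n comment */\nnop ; idle"
    print(strip_comments(script))
```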

3.2  LABEL

A LABEL is defined by writing an identifier followed by a colon, for example:

Error0:

After a LABEL is defined, jump instructions can use it to control the program's flow, for example:

jmp Error0
// instructions ...
Error0:
// other instructions ...

One thing to know: if a LABEL is not referenced by any instruction, the assembler gives you a warning.

3.3  Data Transfer instructions

The data transfer instructions copy immediate values and register contents into registers, exchange two registers, load strings, and push to or pop from the stack.

3.3.1  MOV

Format: MOV DST, SRC
Operation: (DST) <- (SRC)

Two forms:

Example:

MOV reg00, 0x100
MOV reg00, reg01

3.3.2  XCHG

Format: XCHG OPR1, OPR2
Operation: (OPR1) <-> (OPR2)

Only one form:

Example:

mov reg01, 1
mov reg02, 2
XCHG reg01, reg02   // now reg01 == 2, reg02 == 1

XCHG instruction will not affect eflags.

3.3.3  LDS

Format: LDS reg, "string"

Example:

LDS reg00, "Hello World!"

We can remember LDS as "LoaD String". Much easier now, isn't it? ^_^

3.3.4  PUSH

Format: PUSH SRC
Operation: (ESP) <- (ESP) - 4
           ((ESP) + 4, (ESP)) <- (SRC)

Two forms:

Example:

PUSH 0x100
PUSH reg00

3.3.5  POP

Format: POP DST
Operation: (DST) <- ((ESP) + 4, (ESP))
           (ESP) <- (ESP) + 4

Only one form:

Example:

POP reg00
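The PUSH and POP operations above can be pictured with a toy model. The Python sketch below is an assumption-laden illustration, not OllyMachine's implementation: the class name Stack32, the dictionary used as memory, and the starting ESP value 0x1000 are all hypothetical. It only shows that PUSH decrements ESP by 4 before storing, and POP reads the value before incrementing ESP by 4.

```python
MASK32 = 0xFFFFFFFF


class Stack32:
    """Toy model of a 32-bit stack that grows toward lower addresses."""

    def __init__(self, esp: int = 0x1000):
        self.esp = esp      # hypothetical initial stack pointer
        self.memory = {}    # address -> 32-bit value

    def push(self, src: int) -> None:
        # (ESP) <- (ESP) - 4, then store SRC at the new top of stack.
        self.esp = (self.esp - 4) & MASK32
        self.memory[self.esp] = src & MASK32

    def pop(self) -> int:
        # Read the value at the top of stack, then (ESP) <- (ESP) + 4.
        value = self.memory[self.esp]
        self.esp = (self.esp + 4) & MASK32
        return value


if __name__ == "__main__":
    stack = Stack32()
    stack.push(0x100)
    stack.push(0x200)
    print(hex(stack.pop()), hex(stack.pop()), hex(stack.esp))
```

Values come back in last-in, first-out order, and ESP returns to its starting value once every pushed value has been popped.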

3.4  Arithmetic Instructions

The arithmetic instructions perform basic integer computations on registers and immediate values.

3.4.1  Addition Instructions

3.4.1.1  ADD

Format: ADD DST, SRC
Operation: (DST) <- (SRC) + (DST)

Two forms:

Example:

ADD reg00, 0x100
ADD reg00, reg01

Attention: the ADD instruction affects the CF flag.

3.4.1.2  INC

Format: INC DST
Operation: (DST) <- (DST) + 1

Only one form:

Example:

INC reg00

INC instruction will not affect eflags.

3.4.2  Subtraction Instructions

3.4.2.1  SUB

Format: SUB DST, SRC
Operation: (DST) <- (DST) - (SRC)

Two forms:

Example:

SUB reg00, 0x100
SUB reg00, reg01

Attention: the SUB instruction affects the CF and ZF flags.

3.4.2.2  DEC

Format: DEC DST
Operation: (DST) <- (DST) - 1

Only one form:

Example:

DEC reg00

DEC instruction will not affect CF flag, but will affect ZF flag.

3.4.2.3  CMP

Format: CMP OPR1, OPR2
Operation: (OPR1) - (OPR2)

Two forms:

Example:

CMP reg00, 0x100
CMP reg00, reg01

Attention: the CMP instruction computes the difference between two integer operands and updates the CF and ZF flags according to the result. The source operands are not modified, nor is the result saved. The CMP instruction is commonly used in conjunction with a Jcc(jump) instruction, with the latter instructions performing an action based on the result of a CMP instruction.
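The flag update performed by CMP can be sketched in a few lines of Python. This is a model under an assumption, not the Virtual Machine's code: it assumes CMP follows the usual x86 convention for unsigned 32-bit operands, where CF is set when OPR1 is below OPR2 and ZF is set when they are equal. The function name cmp_flags is hypothetical.

```python
MASK32 = 0xFFFFFFFF


def cmp_flags(opr1: int, opr2: int) -> dict:
    """Return the CF and ZF values produced by comparing opr1 with opr2."""
    a, b = opr1 & MASK32, opr2 & MASK32
    # Assumption (x86 convention): CF marks an unsigned borrow, ZF marks equality.
    return {"CF": int(a < b), "ZF": int(a == b)}


if __name__ == "__main__":
    print(cmp_flags(0x100, 0x100))
    print(cmp_flags(0x001, 0x100))
    print(cmp_flags(0x200, 0x100))
```

Neither operand is modified; only the two flags change, which is what the conditional jumps then test.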

3.4.3  Multiplication Instructions

3.4.3.1  MUL

Format: MUL DST, SRC
Operation: (DST) <- (DST) * (SRC)

Two forms:

Example:

MUL reg00, 0x100
MUL reg00, reg01

MUL instruction will not affect eflags.

3.4.4  Division Instructions

3.4.4.1  DIV

Format: DIV DST, SRC
Operation: (DST) <- (DST) / (SRC)

Two forms:

Example:

DIV reg00, 0x100
DIV reg00, reg01

DIV instruction will not affect eflags.

Attention: the divisor must not be ZERO!!!

3.5  Logical Instructions

The logical instructions AND, OR, XOR(exclusive or), and NOT perform the standard Boolean operations for which they are named. The AND, OR, and XOR instructions require two operands; the NOT instruction operates on a single operand.

3.5.1  Logical Operation Instructions

3.5.1.1  AND

Format: AND DST, SRC
Operation: (DST) <- (DST) & (SRC)

Two forms:

Example:

AND reg00, 0x100
AND reg00, reg01

Attention: the AND instruction affects the ZF flag and clears the CF flag to 0.

3.5.1.2  OR

Format: OR DST, SRC
Operation: (DST) <- (DST) | (SRC)

Two forms:

Example:

OR reg00, 0x100
OR reg00, reg01

Attention: the OR instruction affects the ZF flag and clears the CF flag to 0.

3.5.1.3  NOT

Format: NOT DST
Operation: (DST) <- !(DST)

Only one form:

Example:

NOT reg00

NOT instruction will not affect eflags.

3.5.1.4  XOR

Format: XOR DST, SRC
Operation: (DST) <- (DST) ^ (SRC)

Two forms:

Example:

XOR reg00, 0x100
XOR reg00, reg01

Attention: the XOR instruction affects the ZF flag and clears the CF flag to 0.
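The flag behaviour shared by AND, OR and XOR can be modelled as follows. This Python sketch is illustrative only; the function name logical is hypothetical, and it assumes ZF is set exactly when the 32-bit result is zero (the usual x86 meaning), which the text implies but does not state.

```python
MASK32 = 0xFFFFFFFF


def logical(op: str, dst: int, src: int):
    """Apply AND, OR or XOR to 32-bit operands; return (result, flags)."""
    a, b = dst & MASK32, src & MASK32
    name = op.upper()  # opcodes are case-insensitive
    if name == "AND":
        result = a & b
    elif name == "OR":
        result = a | b
    elif name == "XOR":
        result = a ^ b
    else:
        raise ValueError(f"unsupported opcode: {op!r}")
    # Assumption: ZF is 1 when the result is zero; CF is always cleared.
    return result, {"ZF": int(result == 0), "CF": 0}


if __name__ == "__main__":
    print(logical("xor", 0x1234, 0x1234))
    print(logical("AND", 0x1FF, 0x100))
```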

3.5.2  Shift Instructions

The SHL (shift logical left) and SHR (shift logical right) instructions perform a logical shift of the bits.

3.5.2.1  SHL

Format: SHL DST, SRC

Two forms:

Example:

MOV reg00, 0x10
SHL reg00, 8        // reg00 is now 0x1000

MOV reg00, 0x10
MOV reg01, 8
SHL reg00, reg01    // reg00 is now 0x1000

SHL instruction will not affect eflags.

3.5.2.2  SHR

Format: SHR DST, SRC

Two forms:

Example:

MOV reg00, 0x1000
SHR reg00, 8        // reg00 is now 0x10

MOV reg00, 0x1000
MOV reg01, 8
SHR reg00, reg01    // reg00 is now 0x10

SHR instruction will not affect eflags.
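The two shift examples above can be checked with a small Python model. It is a sketch under assumptions, not the Virtual Machine's code: registers are assumed to be 32 bits wide, so bits shifted out on the left are discarded, and shift counts outside 0..31 are rejected because the text does not say how they behave. The names shl and shr are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def _check_count(count: int) -> None:
    # The text does not define counts of 32 or more, so this model rejects them.
    if not 0 <= count < 32:
        raise ValueError(f"shift count out of modelled range: {count}")


def shl(dst: int, count: int) -> int:
    """Logical left shift; bits shifted past bit 31 are dropped."""
    _check_count(count)
    return (dst << count) & MASK32


def shr(dst: int, count: int) -> int:
    """Logical right shift; zeros are shifted in from the left."""
    _check_count(count)
    return (dst & MASK32) >> count


if __name__ == "__main__":
    print(hex(shl(0x10, 8)))
    print(hex(shr(0x1000, 8)))
```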

3.6  Control Transfer Instructions

OllyMachine provides both conditional and unconditional control transfer instructions to direct the flow of program execution. Conditional transfers are taken only for specified states of the status flags in the EFLAGS register. Unconditional control transfers are always executed.

3.6.1  Unconditional Transfer Instructions

The unconditional transfer instructions transfer program control to another location(destination address) in the instruction stream.

3.6.1.1  JMP

Format: JMP label
Operation: (EIP) <- (EIP) + 32-bit offset

Example:

jmp Error0
mov reg01, 0x200
Error0:
mov reg00, 0x100

3.6.2  Conditional Transfer Instructions

The conditional transfer instructions execute jumps that transfer program control to another instruction in the instruction stream if specified conditions are met. The conditions for control transfer are specified with a set of condition codes that define various states of the status flags (CF, ZF) in the EFLAGS register.

3.6.2.1  JE

Equal/zero then transfer.

Condition: ZF = 1

3.6.2.2  JNE

Not equal/not zero then transfer.

Condition: ZF = 0

3.6.2.3  JB

Below/not above or equal then transfer.

Condition: CF = 1

3.6.2.4  JNAE

The same as JB.

3.6.2.5  JNB

Above or equal/not below then transfer.

Condition: CF = 0

3.6.2.6  JAE

The same as JNB.

3.6.2.7  JBE

Below or equal/not above then transfer.

Condition: CF = 1 or ZF = 1

3.6.2.8  JNA

The same as JBE.

3.6.2.9  JNBE

Above/not below or equal then transfer.

Condition: CF = 0 and ZF = 0

3.6.2.10  JA

The same as JNBE.
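The ten conditional jumps above reduce to six conditions on CF and ZF plus four aliases. The Python sketch below tabulates them exactly as listed in this section; it is a lookup table for illustration, not the Virtual Machine's dispatcher, and the names JUMP_CONDITIONS, ALIASES and jump_taken are hypothetical.

```python
# Conditions copied from sections 3.6.2.1 through 3.6.2.10.
JUMP_CONDITIONS = {
    "JE": lambda cf, zf: zf == 1,
    "JNE": lambda cf, zf: zf == 0,
    "JB": lambda cf, zf: cf == 1,
    "JNB": lambda cf, zf: cf == 0,
    "JBE": lambda cf, zf: cf == 1 or zf == 1,
    "JNBE": lambda cf, zf: cf == 0 and zf == 0,
}

# Mnemonics documented as "the same as" another one.
ALIASES = {"JNAE": "JB", "JAE": "JNB", "JNA": "JBE", "JA": "JNBE"}


def jump_taken(mnemonic: str, cf: int, zf: int) -> bool:
    """Return True if the given conditional jump transfers control."""
    name = mnemonic.upper()  # opcodes are case-insensitive
    name = ALIASES.get(name, name)
    return JUMP_CONDITIONS[name](cf, zf)


if __name__ == "__main__":
    # Flags as they would be after comparing equal values: CF = 0, ZF = 1.
    for mnemonic in ["je", "jne", "jb", "jae", "jbe", "ja"]:
        print(mnemonic, jump_taken(mnemonic, cf=0, zf=1))
```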

3.7  Misc Instructions

Misc instructions, because I don't know which group to put them in:

3.7.1  INCLUDE

Format: INCLUDE "filename.oms"

Includes another source file in the current source file so that the two are assembled together.

3.7.2  NOP

Format: NOP

NOP instruction performs no operation. This instruction is a one-byte instruction that takes up space in the instruction stream but does not affect the machine context, except the EIP register.

3.7.3  PAUSE

Format: PAUSE

The PAUSE instruction makes the Virtual Machine pause. To make the Virtual Machine continue running, choose "OllyMachine -> Resume" from the plugins menu.

3.7.4  HALT

Format: HALT

HALT instruction will make the machine halt... all over.

3.7.5  INVOKE

Format: INVOKE Api_Name, parameter1, parameter2, ...

The INVOKE macro invokes an API, similar to MASM32's invoke macro.