# This guide assumes you know x86_64 assembly
If you do not know x86_64 assembly, there are many guides online, or you could read chapter -2 and chapter 2 of my guide on exploitation.
## Introduction to some terms
A disassembler is a program that takes a program's raw machine code and translates it back into assembly so you can work out what the program will do. A disassembler also shows the flow of the program in assembly. For example, a disassembler might show assembly code that jumps based on a condition like this:
![[binja_screenshot.webp]]
Static analysis is the process of analyzing a program without executing it, and dynamic analysis is the process of analyzing a program while it is running. With a disassembler you mainly look at the assembly of a program without actually running it, so most of the time you are statically analyzing it.
Sometimes the assembly code of a program has anti reverse engineering techniques applied to it that make it harder to understand. Anti reverse engineering, also called anti RE, is a group of techniques designed to make the analysis of a program harder.
Obfuscation is a technique used to transform data without losing any of its meaning, and deobfuscation is restoring the obfuscated data to its original state.
A packer is a program that takes a program and creates a new program that contains an obfuscated version of the original program, along with code to deobfuscate it and transfer execution to it. The program that is created is called a packed program. The stub is the part of the packed program that deobfuscates the obfuscated program. The instruction in the packed program's code that transfers execution to the deobfuscated program is called the tail jump. The start of the deobfuscated program is called the original entry point, or OEP for short. Packers can obfuscate the code, data, imports, etc. of the program they contain, which makes statically analyzing a packed program very hard.
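The relationship between the obfuscated program and the stub can be sketched in Python. This is only a toy model and not a real packer: the single-byte XOR, the key value, and the names `pack` and `stub` are all assumptions made for illustration.

```python
# Toy model of a packer. The "obfuscation" is a single-byte XOR, which is an
# assumption chosen for brevity; real packers use compression or encryption.

def pack(program: bytes, key: int) -> bytes:
    """Return the obfuscated copy that would be stored in the packed program."""
    return bytes(b ^ key for b in program)

def stub(packed: bytes, key: int) -> bytes:
    """Model of the stub: restore the original bytes before running them."""
    return bytes(b ^ key for b in packed)

original = bytes.fromhex("48c7c007000000c3")  # mov rax, 7 ; ret
packed = pack(original, 0x5A)                 # what a disassembler would see
restored = stub(packed, 0x5A)                 # what actually executes
```

A disassembler that only looks at `packed` sees bytes that no longer decode to the original instructions, which is why packed programs are hard to analyze statically.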
A control flow graph, or CFG for short, is a graph of all the possible execution paths in a program: each node is a basic block and each edge is a possible transfer of control between two basic blocks. Most anti RE techniques aim to make the CFG messy. A basic block is a sequence of instructions that has a single start and a single end. Basic blocks typically end in a call, a ret, a syscall, an interrupt, or a jump.
A dispatcher is a basic block that determines what basic block to execute next.
A state variable is used by the dispatcher to decide which basic block executes next.
A virtual machine, or VM for short, is a custom execution environment with its own state. A VM has its own instructions that can modify its state. For example, say we have a VM with 3 registers and one instruction that adds the values of the first two registers to the third one. Bytecode is the custom instruction format used by the VM to represent logic, and it is not directly executed by the CPU. Instead, the program decodes each bytecode instruction and runs it. An opcode is the part of every VM instruction that specifies which VM instruction it is. A handler is a basic block of assembly that implements the logic of one VM instruction. VMs have an interpreter loop which fetches the bytecode instructions and transfers execution to the correct handler.
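The three-register VM described above could look roughly like the following Python sketch. The opcode value `0x01`, the one-byte instruction format, and the function names are assumptions for this example, and a protected program would implement the same idea in assembly.

```python
# Toy VM with three registers and a single instruction.
OP_ADD = 0x01  # hypothetical opcode value

def handle_add(regs):
    """Handler for OP_ADD: add the first two registers to the third one."""
    regs[2] = (regs[2] + regs[0] + regs[1]) & 0xFFFFFFFFFFFFFFFF

# Maps each opcode to the handler that implements it.
HANDLERS = {OP_ADD: handle_add}

def run(bytecode: bytes, regs):
    """Interpreter loop: fetch an opcode, then dispatch to its handler."""
    pc = 0  # index of the next bytecode instruction
    while pc < len(bytecode):
        opcode = bytecode[pc]
        HANDLERS[opcode](regs)
        pc += 1
    return regs

# Two ADD instructions, starting with registers 7, 8 and 0.
state = run(bytes([OP_ADD, OP_ADD]), [7, 8, 0])
```

The CPU never executes the bytecode itself. It only executes the interpreter loop and the handlers, so a disassembler shows the loop and the handlers but not the logic encoded in the bytecode.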
A layer is one anti RE technique applied to a program.
A protector is a system that applies multiple layers to a program.
A stage is a phase of execution where a specific layer is active.
A trampoline is used to redirect the flow of execution.
There are a lot of anti RE techniques. The rest of this guide goes through them.
## Indirect calls and jumps
An indirect call/jump is a call or jump whose target comes from a register or from a pointer in memory instead of being encoded in the instruction. The ``ret`` instruction also performs an indirect jump, since it pops whatever is on the top of the stack and jumps to it. For example:
```
global _start
section .text
add_two_numbers:
mov rax, rdi
add rax, rsi
ret
_start:
add rsp, 8
mov edi, 7
mov esi, 8
mov rax, add_two_numbers
call rax
mov rax, _start
push rax
jmp [rsp]
```
You can also do the following:
```
global _start
section .text
add_two_numbers:
mov rax, rdi
add rax, rsi
ret
_start:
mov edi, 7
mov esi, 8
mov qword [rel some_global], add_two_numbers
call [rel some_global]
mov qword [rel some_global], _start
jmp [rel some_global]
section .data
some_global dq 0
```
With an indirect call/jump, the disassembler may not be able to determine where the next basic block is, which can produce an incorrect CFG, for example:
![[binja_screenshot_04.png]]
## Jump instructions with the same target
A common anti RE technique is where we have conditional jumps that jump to the same place, for example:
```
global _start
section .text
_start:
mov al, 7
cmp al, 77
jz label
jnz label
inc al
jmp _start
label:
jmp _start
```
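The above technique makes it trickier for someone looking at a disassembler to determine what is happening, because the disassembler shows multiple jumps that all go to the same place, which clutters the CFG. Since one of ``jz`` and ``jnz`` is always taken, the pair behaves like a single ``jmp label`` and the ``inc al`` after it never runs. In a disassembler it looks like this: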
![[binja_screenshot_05.png]]
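One way to undo this by hand or in a script is to collapse each pair of opposite conditional jumps with the same target into one unconditional jump. The following is a minimal sketch of that idea. The `(mnemonic, operand)` tuple format and the `COMPLEMENTS` table are assumptions for this example and not the API of any real disassembler.

```python
# Pairs of conditional jumps whose conditions are exact opposites.
COMPLEMENTS = {("jz", "jnz"), ("jnz", "jz"), ("je", "jne"), ("jne", "je")}

def collapse(insns):
    """Replace 'jcc X' followed by the opposite 'jcc X' with one 'jmp X'."""
    out = []
    i = 0
    while i < len(insns):
        cur = insns[i]
        nxt = insns[i + 1] if i + 1 < len(insns) else None
        if nxt and (cur[0], nxt[0]) in COMPLEMENTS and cur[1] == nxt[1]:
            out.append(("jmp", cur[1]))
            i += 2  # both conditional jumps are consumed
        else:
            out.append(cur)
            i += 1
    return out

listing = [("cmp", "al, 77"), ("jz", "label"), ("jnz", "label"), ("inc", "al")]
simplified = collapse(listing)
```

After the pass, `simplified` holds the `cmp`, a single `jmp label`, and the `inc al` that can no longer be reached.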
## Breaking into jumps
A common anti RE technique is to break up a single basic block into multiple basic blocks that jump to the next basic block, for example:
```
global _start
section .text
_start:
xor al, al
xor bl, bl
xor cl, cl
xor dl, dl
label_01:
inc al
jmp label_02
label_04:
inc dl
jmp label_05
label_03:
inc cl
jmp label_04
label_02:
inc bl
jmp label_03
label_05:
jmp _start
```
Separating a single basic block into multiple basic blocks creates a mess on the screen that makes it harder to determine what is happening. We can make it even messier by combining this with the previous technique:
```
global _start
section .text
_start:
xor al, al
xor bl, bl
xor cl, cl
xor dl, dl
label_01:
inc al
jz label_02
jnz label_02
label_04:
inc dl
jz label_05
jnz label_05
label_03:
inc cl
jz label_04
jnz label_04
label_02:
inc bl
jz label_03
jnz label_03
label_05:
jz _start
jnz _start
```
There would be more jumps appearing on the screen and it would be harder to figure out what is happening, for example:
![[binja_screenshot_06.png]]
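To recover the original basic block, you can follow the jumps from the first block and join the bodies in the order they execute. Here is a small sketch of that, using a dictionary that mirrors the first listing in this section. The dictionary layout and the `linearize` name are assumptions, and the final jump back to ``_start`` is left out to keep it short.

```python
# Each block maps to (instructions, name of the block it jumps to).
blocks = {
    "label_01": (["inc al"], "label_02"),
    "label_04": (["inc dl"], "label_05"),
    "label_03": (["inc cl"], "label_04"),
    "label_02": (["inc bl"], "label_03"),
    "label_05": ([], None),
}

def linearize(blocks, entry):
    """Follow the jumps from entry and concatenate the block bodies."""
    order, body = [], []
    name = entry
    while name is not None and name not in order:
        order.append(name)
        insns, successor = blocks[name]
        body.extend(insns)
        name = successor
    return order, body

order, body = linearize(blocks, "label_01")
```

The `name not in order` check stops the walk if the chain loops back on itself, which keeps the sketch from running forever on a cyclic input.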
## Useless code
A common anti RE technique is to have code that is useless, for example:
```
global _start
section .text
_start:
mov al, 7
inc al
dec al
inc al
dec al
add al, 2
sub al, 2
add al, 2
sub al, 3
add al, 4
dec al
dec al
dec al
jmp _start
```
In the code above, al is set to 7 at \_start and is 7 again when the code jumps back to \_start, so the code in between has no effect. The above technique makes it take more time for the person looking at the disassembler to figure out what is happening, because they have to trace everything that happens to al.
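Tracing al by hand is error prone, so it can help to replay the instructions in a small script. The sketch below models only the four mnemonics used in the listing and ignores flags. The `(mnemonic, value)` tuple format is an assumption for this example.

```python
def run(al, ops):
    """Replay inc/dec/add/sub on an 8-bit register and return its final value."""
    for op, value in ops:
        if op == "inc":
            al += 1
        elif op == "dec":
            al -= 1
        elif op == "add":
            al += value
        elif op == "sub":
            al -= value
        al &= 0xFF  # al is 8 bits wide, so wrap around
    return al

# The instructions between "mov al, 7" and "jmp _start" in the listing above.
ops = [("inc", None), ("dec", None), ("inc", None), ("dec", None),
       ("add", 2), ("sub", 2), ("add", 2), ("sub", 3), ("add", 4),
       ("dec", None), ("dec", None), ("dec", None)]
final_al = run(7, ops)
```

Here `final_al` ends up equal to the starting value of 7, which confirms that the whole sequence can be ignored.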
## Clones
A common anti RE technique is to have clones of the same basic block. For example:
```
global _start
section .text
_start:
mov al, 7
cmp al, 7
jz basic_block
inc al
cmp al, 8
jz basic_block
cmp al, 17
jz basic_block_clone
jmp _start
basic_block:
mov al, 77
jmp _start
basic_block_clone:
mov al, 77
jmp _start
```
This technique would make the person looking at the disassembler take more time with their task because they have to look at both basic blocks even though they do the same thing, for example:
![[binja_screenshot_07.png]]
## Always/Never jumping to the target
A common anti RE technique is where we have a conditional jump that always jumps to the target. An example is as follows:
```
global _start
section .text
_start:
xor eax, eax
jz will_always_branch
inc al
will_always_branch:
jmp _start
```
Another common anti RE technique is to include a basic block that never executes, for example:
```
global _start
section .text
_start:
mov al, 7
cmp al, 77
jz label
jmp _start
label:
xor ax, 777
jmp _start
```
This makes the program's intent harder to determine because the disassembler shows both the true and the false branch even though execution never reaches one of them, for example:
![[binja_screenshot_08.png]]
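Both listings in this section come down to a zero flag whose value is already known before the jump. A rough way to check this is to model how ``xor`` and ``cmp`` set the zero flag. The helper names below are assumptions, and only the zero flag is modelled.

```python
def zf_after_xor32(a: int, b: int) -> bool:
    """Zero flag after a 32-bit 'xor a, b'."""
    return ((a ^ b) & 0xFFFFFFFF) == 0

def zf_after_cmp8(a: int, b: int) -> bool:
    """Zero flag after an 8-bit 'cmp a, b', which computes a - b."""
    return ((a - b) & 0xFF) == 0

# "xor eax, eax" followed by jz: taken for every possible value of eax.
always_taken = all(zf_after_xor32(v, v) for v in (0, 1, 0x1234, 0xFFFFFFFF))

# "mov al, 7" / "cmp al, 77" followed by jz: never taken.
never_taken = zf_after_cmp8(7, 77)
```

In the first listing the ``jz`` is always taken, so the ``inc al`` is dead. In the second listing the ``jz`` is never taken, so the block at ``label`` is dead.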
## Call/Ret abuse
The call and ret instructions can be used to confuse disassemblers because disassemblers do not always track the stack correctly, for example:
```
global _start
section .text
_start:
add rsp, 16
push _start
add qword [rsp], 7
inc qword [rsp]
sub qword [rsp], 7
dec qword [rsp]
call [rsp]
```
Or:
```
global _start
section .text
_start:
push _start
add qword [rsp], 1337
inc qword [rsp]
sub qword [rsp], 1338
xchg rax, [rsp]
mov [rsp], rax
ret
```
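The stack can be made even harder to trace by using ret with a number after it. ``ret n`` pops the return address and then adds ``n`` to rsp, which skips ``n`` more bytes of the stack: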
```
global _start
section .text
add_two_numbers:
mov rax, rdi
add rax, rsi
ret 8
add_three_numbers:
mov rax, rdi
add rax, rsi
add rax, rdx
ret
add_four_numbers:
mov rax, rdi
add rax, rsi
add rax, rdx
add rax, rcx
ret 16
_start:
mov edi, 7
mov esi, 7
mov edx, 7
mov ecx, 7
push _start
push 10
push add_three_numbers
push 7
push 9
push add_two_numbers
push _start
push add_four_numbers
ret 8
```
The above code would look like this in a disassembler:
![[binja_screenshot_09.png]]
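To see where execution actually goes in the last listing, the stack can be replayed in a few lines of Python. This is a sketch under a few assumptions: the stack is modelled as a list whose last element is the top, every slot is 8 bytes, and the names `RET_IMM` and `ret` exist only in this example.

```python
# Immediate used by the final ret of each function in the listing above.
RET_IMM = {"add_two_numbers": 8, "add_three_numbers": 0, "add_four_numbers": 16}

def ret(stack, imm):
    """ret imm: pop the return address, then discard imm bytes (8 per slot)."""
    target = stack.pop()
    for _ in range(imm // 8):
        stack.pop()
    return target

# Values pushed by _start, in push order (the last push ends up on top).
stack = ["_start", 10, "add_three_numbers", 7, 9,
         "add_two_numbers", "_start", "add_four_numbers"]

trace = []
target = ret(stack, 8)  # the "ret 8" at the end of _start
while target != "_start":
    trace.append(target)
    target = ret(stack, RET_IMM[target])
```

The functions run in the order stored in `trace`, and none of them is ever reached through a call instruction. The extra values pushed in between are only there to be skipped by the ret immediates.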
## Control flow flattening
Control flow flattening is an anti RE technique that replaces the natural execution flow of the program with a dispatcher that picks which basic block to execute based on a state variable. Each executed basic block updates the state variable and leads back to the dispatcher, unless it exits the program, for example:
```
global _start
section .text
_start:
xor al, al
dispatcher:
test al, al
jz do_thing_01
cmp al, 1
jz do_thing_02
cmp al, 3
jz do_thing_03
jmp dispatcher
do_thing_03:
mov dil, al
mov eax, 60
syscall
do_thing_01:
inc al
jmp dispatcher
do_thing_02:
add al, 2
jmp dispatcher
```
This makes the intent of the program harder to determine because in a disassembler it looks like a single loop with many conditional jumps, and the real order of the basic blocks is hidden in the values of the state variable. The code above would look like this in a disassembler:
![[binja_screenshot_10.png]]
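The real order of the basic blocks can be recovered by following the state variable through the dispatcher. Here is a small Python model of the listing above, where `state` stands in for al. The function name is an assumption for this example.

```python
def run_flattened():
    """Follow the state variable through the dispatcher until the exit block."""
    state = 0  # xor al, al
    visited = []
    while True:
        if state == 0:
            visited.append("do_thing_01")
            state += 1  # inc al
        elif state == 1:
            visited.append("do_thing_02")
            state += 2  # add al, 2
        elif state == 3:
            visited.append("do_thing_03")
            return visited, state  # exit with al as the status
        # Any other state would spin in the dispatcher, like the assembly does.

order, exit_code = run_flattened()
```

Once the order is known, the program reads as three blocks that run one after another, which is what it looked like before flattening.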
## Opaque predicates
An opaque predicate is a conditional check that always has the same result when the program runs, but that result is hard to prove when statically analyzing the program, for example:
```
global _start
section .text
_start:
rdrand eax
lea rbx, [rax + 1]
mul rbx
test rax, 1
jz _start
mov edi, eax
mov eax, 60
syscall
```
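Say we have a number x. No matter what x is, x * (x + 1) is always even, because one of x and x + 1 is even. This means ``test rax, 1`` always sets the zero flag and ``jz _start`` is always taken, so the technique makes the CFG messy by adding a useless basic block that never runs. The code above would look like this in a disassembler: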
![[binja_screenshot_11.png]]
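The claim that the low bit is always 0 still holds when the multiplication wraps around at 64 bits, which can be spot checked with a short script. The sample values and the helper name are assumptions, and a handful of samples is a sanity check and not a proof.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF

def low_bit_of_product(x: int) -> int:
    """Model of: lea rbx, [rax + 1] / mul rbx / test rax, 1."""
    rbx = (x + 1) & MASK64
    rax = (x * rbx) & MASK64  # mul leaves the low 64 bits of the product in rax
    return rax & 1

# rdrand eax produces a 32-bit value, so the samples stay in that range.
samples = [0, 1, 2, 3, 12345, 0x7FFFFFFF, 0xDEADBEEF, 0xFFFFFFFF]
results = [low_bit_of_product(x) for x in samples]
```

Every entry in `results` is 0, which matches the reasoning that the ``jz`` is always taken.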
## Junk code/bytes
In the x86_64 architecture, instructions are just bytes of varying lengths and there are no explicit instruction boundaries, so static analysis isn't always going to be 100% accurate about the instructions it shows to the person looking at the disassembler. Most of the time it will be accurate, but junk bytes can be inserted between real instructions, leading the disassembler to create a false picture of what is happening. These junk bytes are jumped over and never executed, but the disassembler may still decode them as if they were instructions, for example:
```
global _start
section .text
_start:
xor eax, eax
jmp $+4
db 0xff
db 0xeb
test eax, eax
jmp $+5
db 0xeb
db 0xfe
db 0x77
jz _start
db 0xeb
db 0xfe
db 0xfe
db 0xeb
db 0x99
```
A disassembler may display the wrong result, while in reality the code simply sets eax to 0, checks if it is 0, and since it is 0, jumps back to \_start. The same bytes in memory can also represent different instructions depending on where in those bytes the instruction pointer points, for example:
```
byte 0: mov rax, 7
byte 7: ret
```
The bytes would look like:
```
48 c7 c0 07 00 00 00 c3
```
or
```
byte 0: 48
byte 1: c7
byte 2: c0
byte 3: 07
byte 4: 00
byte 5: 00
byte 6: 00
byte 7: c3
```
Say the instruction pointer started at byte 1 instead of byte 0, the resulting code would be:
```
mov eax, 7
ret
```
If we started at byte 2, the resulting code would be:
```
rol byte [rdi], 0
add byte [rax], al
ret
```
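The three decodings above can be reproduced with a toy decoder. To be clear, this is not a real x86_64 decoder: it only recognises the handful of encodings that appear in these eight bytes, and the `decode` function is an assumption made for illustration.

```python
CODE = bytes.fromhex("48c7c007000000c3")

def decode(code: bytes, offset: int):
    """Decode from offset, handling only the encodings present in CODE."""
    out = []
    i = offset
    while i < len(code):
        if code[i:i + 3] == b"\x48\xc7\xc0":   # REX.W C7 /0 id: mov rax, imm32
            imm = int.from_bytes(code[i + 3:i + 7], "little")
            out.append(f"mov rax, {imm}")
            i += 7
        elif code[i:i + 2] == b"\xc7\xc0":     # C7 /0 id: mov eax, imm32
            imm = int.from_bytes(code[i + 2:i + 6], "little")
            out.append(f"mov eax, {imm}")
            i += 6
        elif code[i:i + 2] == b"\xc0\x07":     # C0 /0 ib: rol byte [rdi], imm8
            out.append(f"rol byte [rdi], {code[i + 2]}")
            i += 3
        elif code[i:i + 2] == b"\x00\x00":     # 00 /r: add byte [rax], al
            out.append("add byte [rax], al")
            i += 2
        elif code[i] == 0xC3:                  # C3: ret
            out.append("ret")
            i += 1
        else:                                  # anything else: show the raw byte
            out.append(f"db {code[i]:#04x}")
            i += 1
    return out

from_byte_0 = decode(CODE, 0)
from_byte_1 = decode(CODE, 1)
from_byte_2 = decode(CODE, 2)
```

The same eight bytes give three different listings, and the only thing that changes is the offset where decoding starts.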
As I said before, the same group of bytes may represent different instructions depending on where the instruction pointer points, for example:
```
global _start
section .text
_start:
nop
lea rax, [_start]
xor rax, 0x777
dec rax
rdrand ecx
and ecx, 3
lea rbx, [$+0x21 + rcx]
xor rbx, 0x777
inc rbx
lea rbx, [rbx - 2]
inc rbx
xor rbx, 0x777
jmp rbx
nop
nop
nop
jmp $+4
and r8b, byte [rsp]
sub al, 0x24
inc rax
xor rax, 0x777
push rax
inc qword [rsp]
ret
```
The code above would look like this in a disassembler:
![[binja_screenshot_12.png]]
As you can see in the code above, the instruction pointer is set in between instructions via ``jmp rbx`` and ``jmp 0x401040``. The ``jmp rbx`` instruction jumps to an address from 0x401039 to 0x40103c, both included. Every address in that range is the start of a valid instruction: the first three are nops and 0x40103c is the start of a jmp. The ``jmp 0x401040`` lands two bytes into the ``and r8b, byte [rsp]`` instruction, which means that ``and r8b, byte [rsp]`` won't be executed, but whatever starts two bytes into it will be, assuming it is a valid instruction. The bytes of ``and r8b, byte [rsp]`` are:
```
byte 0: 0x44
byte 1: 0x22
byte 2: 0x04
byte 3: 0x24
```
So the CPU would execute whatever starts two bytes after byte 0, which is:
```
byte 0: 0x04
```
That byte by itself is not a valid instruction, but the following bytes together are a valid instruction:
```
byte 0: 0x04
byte 1: 0x24
```
Those bytes are equal to: ``add al, 0x24``. After that instruction is executed, ``sub al, 0x24`` is executed, which restores al to what it originally was.
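The arithmetic on rbx before ``jmp rbx`` is there to hide the target, but it cancels out. The sketch below replays it in Python, assuming the base address 0x401039 that is mentioned above. The function name is an assumption for this example.

```python
def jmp_rbx_target(rcx: int) -> int:
    """Replay the rbx arithmetic that runs before 'jmp rbx'."""
    rbx = 0x401039 + rcx  # lea rbx, [$+0x21 + rcx]
    rbx ^= 0x777          # xor rbx, 0x777
    rbx += 1              # inc rbx
    rbx -= 2              # lea rbx, [rbx - 2]
    rbx += 1              # inc rbx
    rbx ^= 0x777          # xor rbx, 0x777
    return rbx

# "and ecx, 3" limits rcx to the values 0 through 3.
targets = [jmp_rbx_target(rcx) for rcx in range(4)]
```

The two xors cancel because the inc, lea and inc in between add up to zero, so the target is simply the base address plus rcx.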
## Jump/call tables
A jump/call table is an array of addresses, where each entry holds the start of a basic block or function, and the entry to call or jump through is selected indirectly at runtime. For example:
```
global _start
section .text
func01:
mov edi, 7
ret
func02:
mov edi, 77
ret
func03:
mov edi, 777
ret
func04:
mov edi, 7777
ret
_start:
rdrand eax
and eax, 3
call [table + rax * 8]
jmp _start
section .rdata
table dq func01, func02, func03, func04
```
Or:
```
global _start
section .text
func01:
mov edi, 7
jmp _start
func02:
mov edi, 77
jmp _start
func03:
mov edi, 777
jmp _start
func04:
mov edi, 7777
jmp _start
_start:
rdrand eax
and eax, 3
jmp [table + rax * 8]
section .rdata
table dq func01, func02, func03, func04
```
Since the address to call/jump to is selected at runtime, the disassembler cannot determine which basic block will execute next. Usually a disassembler will list every basic block from the call/jump table that could execute, for example:
![[binja_screenshot_02.png]]
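Listing every possible target comes down to reading the table as an array of 64-bit little-endian addresses. A minimal sketch of that step is shown below. The addresses are hypothetical stand-ins for ``func01`` through ``func04``, and `read_table` is a name made up for this example.

```python
import struct

def read_table(data: bytes, count: int):
    """Interpret data as count little-endian 64-bit addresses."""
    return list(struct.unpack(f"<{count}Q", data[:count * 8]))

# Hypothetical table contents, as the raw bytes would appear in the file.
raw = struct.pack("<4Q", 0x401000, 0x401006, 0x40100C, 0x401012)

# "and eax, 3" limits the index to 0 through 3, so the table has 4 entries.
targets = read_table(raw, 4)
```

The number of entries has to come from somewhere. In the listings above it follows from the ``and eax, 3`` that bounds the index, and when no such bound is visible the size of the table has to be guessed.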
## Computed call/jumps
A computed call/jump is when the program builds the address to call or jump to with math operations at runtime and then calls or jumps to it indirectly. Since the address is derived at runtime, the disassembler may have a hard time figuring out where the address leads to, for example:
```
global _start
section .text
_start:
nop
nop
nop
nop
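; start 70 bytes past label: each of the 10 loop passes subtracts 7
lea rax, [label + 70]
mov cl, 10
loopy_01:
xor ax, 7
inc ax
dec ax
xor ax, 7
dec rax
dec rax
sub rax, 5
dec cl ; count down, otherwise the loop never ends
test cl, cl
jnz loopy_01
rdrand ebx
and ebx, 3
add rax, rbx
jmp rax
label:
nop
nop
nop
nop
; start 70 bytes past _start: each of the 10 loop passes subtracts 7
lea rax, [_start + 70]
mov cl, 10
loopy_02:
xor ax, 7
inc ax
dec ax
xor ax, 7
dec rax
dec rax
sub rax, 5
dec cl ; count down, otherwise the loop never ends
test cl, cl
jnz loopy_02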
rdrand ebx
and ebx, 3
add rax, rbx
jmp rax
```
Like I said before, since the address to call/jump to is computed at runtime, the disassembler cannot determine which basic block will execute next, for example:
![[binja_screenshot_03.png]]
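When you meet a computed jump, it often pays to replay the math and see what is left over. The sketch below models one pass of the loop body from the listing above, including the fact that the ``ax`` instructions only touch the low 16 bits of rax. The helper names are assumptions, and the sample addresses are hypothetical.

```python
MASK64 = (1 << 64) - 1

def with_ax(rax: int, ax: int) -> int:
    """Write a 16-bit value into the low word of rax, keeping the upper bits."""
    return (rax & ~0xFFFF & MASK64) | (ax & 0xFFFF)

def loop_body(rax: int) -> int:
    """One pass of the loop body, without the counter handling."""
    rax = with_ax(rax, (rax & 0xFFFF) ^ 7)  # xor ax, 7
    rax = with_ax(rax, (rax & 0xFFFF) + 1)  # inc ax
    rax = with_ax(rax, (rax & 0xFFFF) - 1)  # dec ax
    rax = with_ax(rax, (rax & 0xFFFF) ^ 7)  # xor ax, 7
    return (rax - 1 - 1 - 5) & MASK64       # dec rax / dec rax / sub rax, 5

# Hypothetical address of label, plus 7.
after_one_pass = loop_body(0x401035 + 7)
```

The four ``ax`` instructions cancel out, so each pass just subtracts 7 from rax. The rest of the loop body is noise that is only there to make the address harder to follow.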