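OLLVM: Principles and Removal

I originally wrote this post last year as study notes on how OLLVM obfuscation works. Revisiting the topic this semester turned up some new findings, so I have expanded the earlier notes and recorded them here.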

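About OLLVM Obfuscation

  • Origin: as the name suggests, OLLVM is LLVM plus an obfuscator. In practice it adds or replaces passes at the LLVM IR level, so that passes which would normally optimize the IR obfuscate it instead.

LLVM is a compiler infrastructure with a three-stage design. A front end performs lexical and syntactic analysis on the source file to build an abstract syntax tree (AST) and lowers it to LLVM's intermediate representation (IR); passes then optimize the IR; and a back end translates the optimized IR into machine code for the target platform. The core idea is to compile source code into a platform-independent IR first, optimize at that layer, and only then generate machine code for a target such as x86 or ARM.

  • Obfuscation modes: instruction substitution, bogus control flow, control flow flattening, and string obfuscation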
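To make the "a pass rewrites the IR" idea concrete, here is a minimal sketch in Python. The tuple-based "IR", the pass name `substitute_add`, and the temporary naming scheme are all made up for illustration and have nothing to do with LLVM's real data structures; the point is only that a pass takes IR in and returns semantically equivalent IR out.

```python
# Toy "IR": each instruction is (dest, op, lhs, rhs). This is NOT real LLVM IR.
def substitute_add(ir):
    """Obfuscation-style pass: rewrite d = a + b as d = a - (-b)."""
    out = []
    for dest, op, lhs, rhs in ir:
        if op == "add":
            tmp = f"{dest}_neg"  # hypothetical temporary name
            out.append((tmp, "neg", rhs, None))
            out.append((dest, "sub", lhs, tmp))
        else:
            out.append((dest, op, lhs, rhs))
    return out

def run_passes(ir, passes):
    """Apply each pass in order, like a very small pass manager."""
    for p in passes:
        ir = p(ir)
    return ir

def evaluate(ir, env):
    """Interpret the toy IR so the two versions can be compared."""
    env = dict(env)
    for dest, op, lhs, rhs in ir:
        if op == "neg":
            env[dest] = -env[lhs]
        elif op == "add":
            env[dest] = env[lhs] + env[rhs]
        elif op == "sub":
            env[dest] = env[lhs] - env[rhs]
    return env

original = [("t", "add", "x", "y"), ("r", "add", "t", "z")]
obfuscated = run_passes(original, [substitute_add])
inputs = {"x": 3, "y": 4, "z": 5}

print(len(original), "->", len(obfuscated), "instructions")
print(evaluate(original, inputs)["r"], evaluate(obfuscated, inputs)["r"])
```

The instruction count doubles while the computed result stays the same, which is the trade every obfuscation pass makes.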

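Instruction Substitution (SUB)

Principle

Replace an original instruction with an equivalent sequence of instructions that is more complex and harder to understand and recover.

For example:

  • Addition: a = b + c => a = -(-b + (-c)), or r = rand(); a = b + r; a = a + c; a = a - r
  • AND: a = b & c => a = (b ^ ~c) & b
  • XOR: a = a ^ b => a = (~a & b) | (a & ~b)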
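The identities above are easy to check mechanically. The following sketch tests them on random 32-bit values; the function names and the 32-bit mask are choices made for this illustration, not anything taken from OLLVM's source.

```python
import random

MASK = 0xFFFFFFFF  # emulate 32-bit registers

def add_neg(b, c):
    # a = -(-b + (-c))
    return (-((-b) + (-c))) & MASK

def add_rand(b, c, r):
    # r = rand(); a = b + r; a = a + c; a = a - r
    a = (b + r) & MASK
    a = (a + c) & MASK
    return (a - r) & MASK

def and_sub(b, c):
    # a = (b ^ ~c) & b
    return (b ^ (~c & MASK)) & b

def xor_sub(a, b):
    # a = (~a & b) | (a & ~b)
    return ((~a & b) | (a & ~b)) & MASK

def check(trials=10000, seed=0):
    rng = random.Random(seed)
    for _ in range(trials):
        b, c, r = (rng.getrandbits(32) for _ in range(3))
        assert add_neg(b, c) == (b + c) & MASK
        assert add_rand(b, c, r) == (b + c) & MASK
        assert and_sub(b, c) == b & c
        assert xor_sub(b, c) == b ^ c
    return True

print(check())  # True
```

Random testing is not a proof, but it is a quick way to confirm a suspected substitution pattern before simplifying it by hand.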

Comparison

Before obfuscation

After obfuscation

Removal

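Bogus Control Flow (BCF)

Principle

Insert code that looks like a branch but never changes the program's real execution path. Extra conditions (usually opaque predicates that are always true or always false) are added to the real control flow, splitting basic blocks into several branch paths. Only one of those paths is ever taken at run time, but the presence of the others scrambles the CFG and makes static analysis harder.

An opaque predicate is a conditional expression whose result is constant at run time but whose truth value a decompiler cannot determine statically.

Example: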

if ((x * x) >= 0) {
    // branch A
} else {
    // branch B (never actually executed)
}

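To a human, x*x >= 0 obviously always holds (leaving integer overflow aside), so branch A is always taken. A decompiler, however, only sees a computation on a variable and does not dare to optimize branch B away.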
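One caveat worth seeing in numbers: with fixed-width machine arithmetic, x*x >= 0 only holds as long as the multiplication does not wrap. The sketch below emulates a signed 32-bit multiply in Python; `to_signed32` and `square_is_nonnegative` are helper names invented for this example.

```python
def to_signed32(value):
    """Interpret the low 32 bits of value as a signed integer."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value

def square_is_nonnegative(x):
    """Evaluate x*x >= 0 the way a wrapping 32-bit multiply would."""
    return to_signed32(x * x) >= 0

print(square_is_nonnegative(1000))   # True
print(square_is_nonnegative(46341))  # False: 46341**2 exceeds 2**31 - 1 and wraps negative
```

This is one reason practical opaque predicates tend to rely on properties that survive wraparound, such as the parity of x * (x - 1).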

Comparison

Before obfuscation

After obfuscation

Removal

  • D810
  • Sections: make the global variables read-only, or assign values to them

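Example

The function contains many conditional expressions with a similar structure.

(x.3_28 * (x.3_28 - 1)) is the product of two consecutive integers, so it is always even, and an even number (lowest bit 0) ANDed with 1 is always 0. Hence ((x.3_28 * (x.3_28 - 1)) & 1) == 0 is always true, and at this point it is already clear that the whole condition is always false.

Looking further up, the value of r9.b depends on the comparison between y.4 and 0xa (s< denotes a signed comparison).

Following the references shows that y.4 and x.3 are both global variables in .bss, initialized to 0.

Therefore y.4 s< 0xa always holds and r9.b is always 1 -> (((x.3_28 * (x.3_28 - 1)) & 1) == 0 | r9.b) is always 1 -> ((((x.3_28 * (x.3_28 - 1)) & 1) == 0 | r9.b) & 1) is always 1 -> (((((x.3_28 * (x.3_28 - 1)) & 1) == 0 | r9.b) & 1) == 0) is always false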
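The chain of reasoning can be replayed on concrete values. This sketch re-implements the condition in Python with 32-bit wraparound; `bcf_condition` and the sample list are illustrative, and the variable names only loosely mirror the ones shown by the decompiler.

```python
MASK32 = 0xFFFFFFFF

def to_signed32(value):
    """Interpret the low 32 bits of value as a signed integer."""
    value &= MASK32
    return value - (1 << 32) if value & 0x80000000 else value

def bcf_condition(x, y):
    """(((x * (x - 1)) & 1) == 0 | (y s< 0xa)) & 1) == 0, on 32-bit values."""
    product = (x * (x - 1)) & MASK32
    is_even = (product & 1) == 0      # always true: consecutive integers
    r9b = to_signed32(y) < 0xA        # true for y == 0
    return ((int(is_even) | int(r9b)) & 1) == 0

samples = [0, 1, 2, 46341, 0x7FFFFFFF, 0x80000000, 0xFFFFFFFF]

# The parity argument holds even when the multiplication wraps.
print(all(((x * (x - 1)) & MASK32 & 1) == 0 for x in samples))  # True
# With y == 0 (its .bss initial value) the guarded branch is never taken.
print(any(bcf_condition(x, 0) for x in samples))                # False
```

Because the parity term alone is already true, the condition stays false for any value of y; the y comparison just adds a second, independent reason.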

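In BN you can assign a value to a variable manually with Set value of variable; set both x.3_28 and r9 to 0.

Once they are set, the branch disappears.

To do this in bulk, write a script that automates the removal.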

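Control Flow Flattening (FLA)

Principle

The core idea is to introduce a main dispatcher. The originally hierarchical control flow is split into many basic blocks and turned into a state machine: the value of state selects the next basic block through a switch or a jump table, and each block updates state when it finishes and returns to the dispatcher.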
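As a rough picture of what the transformation does, here is the same small loop written normally and in flattened form. The state constants are arbitrary values invented for this sketch; real OLLVM output uses random 32-bit constants and a compiled switch, but the shape is the same.

```python
def sum_to_n(n):
    """Original, structured version."""
    total, i = 0, 0
    while i < n:
        total += i
        i += 1
    return total

def sum_to_n_flattened(n):
    """Same logic as a dispatcher-driven state machine."""
    ENTRY, CHECK, BODY, EXIT = 0x3A1F, 0x91C4, 0x5D02, 0xE7B8  # arbitrary state values
    state = ENTRY
    total = i = 0
    while True:                      # main dispatcher
        if state == ENTRY:           # real block: initialization
            total, i = 0, 0
            state = CHECK
        elif state == CHECK:         # real block: loop condition picks the next state
            state = BODY if i < n else EXIT
        elif state == BODY:          # real block: loop body
            total += i
            i += 1
            state = CHECK
        elif state == EXIT:          # real block: return
            return total

print(sum_to_n(10), sum_to_n_flattened(10))  # 45 45
```

Every block now has the dispatcher as its only visible successor, so the original loop structure can no longer be read off the CFG; it lives entirely in the state values.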

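Comparison

Here is the classic control flow flattening diagram

After obfuscation

Visibly ugly...

Removal

  • Find the real blocks
  • Find the relationships between the real blocks
  • Rebuild the control flow

Example

Finding the real blocks

In general, the predecessors of the pre-dispatcher are the real blocks. In other words, the pre-dispatcher can also be called the convergence block, because every real block flows into it.

Identify the address of the pre-dispatcher from the CFG, iterate over the basic blocks, find the pre-dispatcher, and put its predecessors into the list of real blocks.

While iterating over the basic blocks, a block that has no successors can be assumed to be the ret block.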

# Run inside Binary Ninja's Python console, where bv is predefined.
from binaryninja import *

# Address of the pre-dispatcher block, read off the CFG by hand
PRE_BLOCK = 0x402697

# f = current_function
TARGET_FUNC = 0x401e80
f = bv.get_function_at(TARGET_FUNC)

if not f:
    raise Exception("[-] function not found")

def end_addr(bb):
    """Return the address of the last instruction in basic block bb."""
    a = bb.start
    last = a
    while a < bb.end:
        last = a
        info = f.arch.get_instruction_info(bv.read(a, f.arch.max_instr_length), a)
        if not info or info.length == 0:
            break
        a += info.length
    return last

tbs, fbs = set(), set()  # real ("true") blocks and fake blocks, as (start, last instruction)

for bb in f.basic_blocks:
    cur = (bb.start, end_addr(bb))

    if bb.start == PRE_BLOCK:
        fbs.add(cur)
        print("[+] find true blocks")
        for e in bb.incoming_edges:  # predecessors of the pre-dispatcher are real blocks
            p = e.source
            tbs.add((p.start, end_addr(p)))
    elif not bb.outgoing_edges:  # no successors: treat it as the ret block
        print("[+] find ret block")
        tbs.add(cur)
    elif bb.start != TARGET_FUNC:  # everything else except the entry block is fake
        fbs.add(cur)

fbs -= tbs

print("true block:")
print("tbs =", [(hex(s), hex(e)) for s, e in sorted(tbs)])

print("fake block:")
print("fbs =", [(hex(s), hex(e)) for s, e in sorted(fbs)])

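bv is the BinaryView, one of the core objects in BN. It represents the current view of the binary and provides access to bytes, functions, symbols, data variables, and so on.

A basic_block is the smallest unit of execution in a program's control flow: a single entry, a single exit, and no jumps in between.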
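The selection rule itself does not depend on Binary Ninja and can be tried on a toy CFG. In the sketch below the graph is a plain dictionary of successor lists, and `classify_blocks` plus all block names are hypothetical stand-ins for the real basic blocks.

```python
def classify_blocks(successors, pre_dispatcher, entry):
    """Split a flattened CFG into real and fake blocks.

    Real blocks: predecessors of the pre-dispatcher, plus blocks with no successors.
    Fake blocks: everything else except the entry block.
    """
    predecessors = {}
    for src, dsts in successors.items():
        for dst in dsts:
            predecessors.setdefault(dst, set()).add(src)

    real = set(predecessors.get(pre_dispatcher, ()))
    real |= {block for block, dsts in successors.items() if not dsts}
    fake = set(successors) - real - {entry}
    return real, fake

# Toy flattened function: entry -> dispatcher -> comparison chain -> real blocks -> pre-dispatcher
toy_cfg = {
    "entry": ["dispatch"],
    "dispatch": ["cmp1"],
    "cmp1": ["A", "cmp2"],
    "cmp2": ["B", "ret"],
    "A": ["pre"],
    "B": ["pre"],
    "pre": ["dispatch"],
    "ret": [],
}

real, fake = classify_blocks(toy_cfg, pre_dispatcher="pre", entry="entry")
print(sorted(real))  # ['A', 'B', 'ret']
print(sorted(fake))  # ['cmp1', 'cmp2', 'dispatch', 'pre']
```

The heuristic assumes every real block jumps straight back to the pre-dispatcher; a function flattened differently would need a different rule.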

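Relationships between real blocks (emulation)

Use Unicorn to emulate the function and recover the relationships between the real blocks.

Emulation means executing the code in a virtual environment rather than actually running the program, while observing how control flow and data change.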

from unicorn import *
from unicorn.x86_const import *
from capstone import *  # disassembler interface, used to decode machine code into instructions

BASE = 0x400000
CODE = BASE + 0x0  # start address of the code segment
CODE_SIZE = 0x100000  # size of the code segment, set to 1 MB here
STACK = 0x7F00000000
STACK_SIZE = 0x100000

uc = Uc(UC_ARCH_X86, UC_MODE_64)  # create an x86-64 emulator instance
cs = Cs(CS_ARCH_X86, CS_MODE_64)  # create the disassembler

# Real blocks printed by the Binary Ninja script above: (start, last instruction)
tbs_hex = [('0x40223a', '0x40223a'), ('0x40223f', '0x40225a'), ('0x40225f', '0x402295'), ('0x40229a', '0x402332'), ('0x402337', '0x40234d'), ('0x402352', '0x402366'), ('0x40236b', '0x402386'), ('0x40238b', '0x4023a4'), ('0x4023a9', '0x4023c1'), ('0x4023c6', '0x4023e2'), ('0x4023e7', '0x4023fa'), ('0x4023ff', '0x402426'), ('0x40242b', '0x402441'), ('0x402446', '0x402459'), ('0x40245e', '0x402474'), ('0x402479', '0x402483'), ('0x402488', '0x4024a9'), ('0x4024ae', '0x4024d9'), ('0x4024de', '0x402500'), ('0x402505', '0x40251c'), ('0x402521', '0x402537'), ('0x40253c', '0x402550'), ('0x402555', '0x402570'), ('0x402575', '0x40258e'), ('0x402593', '0x4025b5'), ('0x4025ba', '0x4025cd'), ('0x4025d2', '0x4025f9'), ('0x4025fe', '0x402608'), ('0x40260d', '0x402623'), ('0x402628', '0x40263b'), ('0x402640', '0x402656'), ('0x40265b', '0x402665'), ('0x40266a', '0x40268b'), ('0x402690', '0x402696')]
tbs = [(int(s, 16), int(e, 16)) for s, e in tbs_hex]

tb_path = []  # records the execution path
main_addr = 0x401E80
main_end = 0x40269C

def hook_code(uc, address, size, user_data):
    data = CODE_DATA[address - BASE: address - BASE + size]
    for insn in cs.disasm(data, address):
        # print(f"{hex(insn.address)}: {insn.mnemonic} {insn.op_str}")

        if insn.mnemonic == "call":  # do not actually perform the call
            uc.reg_write(UC_X86_REG_RIP, address + size)  # point RIP at the next instruction to skip it
            return

        elif insn.mnemonic == "ret":
            print("find ret block")
            print("emu tb_path:")
            print(tb_path)
            uc.emu_stop()
            return

    for tb in tbs:  # (start, end)
        if address == tb[1]:  # is the current address the last instruction of a real block?
            zf = (uc.reg_read(UC_X86_REG_EFLAGS) >> 6) & 1  # bit 6 of EFLAGS is ZF
            tb_path.append((tb, zf))
            break

def hook_mem_invalid(uc, access, address, size, value, user_data):  # called on access to unmapped memory
    rip = uc.reg_read(UC_X86_REG_RIP)
    print(f"[mem-invalid] rip={hex(rip)} access={access} addr={hex(address)} size={size} value={hex(value) if isinstance(value, int) else value}")
    return False

def hook_intr(uc, intno, user_data):  # called when an interrupt is executed
    rip = uc.reg_read(UC_X86_REG_RIP)
    print(f"[intr] rip={hex(rip)} intno={intno}")
    uc.emu_stop()

def inituc(uc):  # set up the emulator environment
    uc.mem_map(CODE, CODE_SIZE, UC_PROT_ALL)  # readable, writable and executable
    uc.mem_map(STACK, STACK_SIZE, UC_PROT_ALL)

    # The raw file is written at BASE, which assumes file offset 0 is loaded at 0x400000
    uc.mem_write(CODE, CODE_DATA)
    uc.reg_write(UC_X86_REG_RSP, STACK + STACK_SIZE - 0x1000)  # leave one page of stack as a buffer

    uc.hook_add(UC_HOOK_CODE, hook_code)  # call hook_code before every instruction
    uc.hook_add(UC_HOOK_MEM_UNMAPPED, hook_mem_invalid)
    uc.hook_add(UC_HOOK_INTR, hook_intr)

with open('./test-fla', 'rb') as f:
    CODE_DATA = f.read()

inituc(uc)

try:
    uc.emu_start(main_addr, main_end)
except UcError as e:
    print(f"Unicorn error: {e}")

Rebuilding the control flow

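Connect the real blocks in the correct order. The approach here: NOP out the last instruction of every block in fbs, inspect and process each block in tbs, then walk the emulation result tb_path and add a jz/jnz jump according to the value of zf.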

# Run inside Binary Ninja's Python console, where bv and current_function are predefined.
from binaryninja import *

f = current_function
arch = bv.arch

pre_block = 0x401E8B

def insn_len(addr):
    return bv.get_instruction_length(addr)

def disasm(addr):
    return bv.get_disassembly(addr)  # disassembly text of the instruction at addr

def nop(addr):
    bv.convert_to_nop(addr)  # replace the instruction at addr with NOPs of the same length

def write_branch(addr, asm):
    old_len = insn_len(addr)
    code = arch.assemble(asm, addr)  # assembly text -> machine code
    # The size check is left disabled, as in the original script. With it off, an encoding
    # longer than the old instruction overwrites the bytes that follow it.
    # if len(code) > old_len:
    #     return False
    bv.write(addr, code + b"\x90" * (old_len - len(code)))  # write the new bytes, pad the rest with NOPs
    return True

# Emulation trace from the previous step: ((block start, block end), ZF at the block end)
tb_path = [((4203071, 4203098), 0), ((4203103, 4203157), 0), ((4203162, 4203314), 1), ((4203319, 4203341), 1),
((4203346, 4203366), 1), ((4203371, 4203398), 0), ((4203403, 4203428), 0), ((4203433, 4203457), 1),
((4203462, 4203490), 0), ((4203495, 4203514), 1), ((4203519, 4203558), 1), ((4203563, 4203585), 1),
((4203590, 4203609), 0), ((4203614, 4203636), 1), ((4203641, 4203651), 1), ((4203319, 4203341), 1),
((4203346, 4203366), 1), ((4203371, 4203398), 0), ((4203403, 4203428), 0), ((4203433, 4203457), 1),
((4203462, 4203490), 0), ((4203495, 4203514), 1), ((4203519, 4203558), 1), ((4203563, 4203585), 1),
((4203590, 4203609), 0), ((4203614, 4203636), 1), ((4203641, 4203651), 1), ((4203319, 4203341), 1),
((4203346, 4203366), 1), ((4203371, 4203398), 0), ((4203403, 4203428), 0), ((4203433, 4203457), 1),
((4203462, 4203490), 0), ((4203495, 4203514), 1), ((4203519, 4203558), 1), ((4203563, 4203585), 1),
((4203590, 4203609), 0), ((4203614, 4203636), 1), ((4203641, 4203651), 1), ((4203319, 4203341), 1),
((4203346, 4203366), 1), ((4203371, 4203398), 0), ((4203403, 4203428), 0), ((4203433, 4203457), 1),
((4203462, 4203490), 0), ((4203495, 4203514), 1), ((4203519, 4203558), 1), ((4203563, 4203585), 1),
((4203590, 4203609), 0), ((4203614, 4203636), 1), ((4203641, 4203651), 1), ((4203319, 4203341), 1),
((4203346, 4203366), 1), ((4203371, 4203398), 0), ((4203403, 4203428), 0), ((4203433, 4203457), 1),
((4203462, 4203490), 0), ((4203495, 4203514), 1), ((4203519, 4203558), 1), ((4203563, 4203585), 1),
((4203590, 4203609), 0), ((4203614, 4203636), 1), ((4203641, 4203651), 1), ((4203319, 4203341), 1),
((4203346, 4203366), 1), ((4203371, 4203398), 0), ((4203403, 4203428), 0), ((4203433, 4203457), 1),
((4203462, 4203490), 0), ((4203495, 4203514), 1), ((4203519, 4203558), 1), ((4203563, 4203585), 1),
((4203590, 4203609), 0), ((4203614, 4203636), 1), ((4203641, 4203651), 1), ((4203319, 4203341), 1),
((4203346, 4203366), 1), ((4203371, 4203398), 0), ((4203403, 4203428), 0), ((4203433, 4203457), 1),
((4203462, 4203490), 0), ((4203495, 4203514), 1), ((4203519, 4203558), 1), ((4203563, 4203585), 1),
((4203590, 4203609), 0), ((4203614, 4203636), 1), ((4203641, 4203651), 1), ((4203319, 4203341), 1),
((4203346, 4203366), 1), ((4203371, 4203398), 0), ((4203403, 4203428), 0), ((4203433, 4203457), 1),
((4203462, 4203490), 0), ((4203495, 4203514), 1), ((4203519, 4203558), 1), ((4203563, 4203585), 1),
((4203590, 4203609), 0), ((4203614, 4203636), 1), ((4203641, 4203651), 1), ((4203319, 4203341), 1),
((4203346, 4203366), 1), ((4203371, 4203398), 0), ((4203403, 4203428), 0), ((4203433, 4203457), 1),
((4203462, 4203490), 0), ((4203495, 4203514), 1), ((4203519, 4203558), 1), ((4203563, 4203585), 1),
((4203590, 4203609), 0), ((4203614, 4203636), 1), ((4203641, 4203651), 1), ((4203319, 4203341), 1),
((4203346, 4203366), 1), ((4203371, 4203398), 0), ((4203403, 4203428), 0), ((4203433, 4203457), 1),
((4203462, 4203490), 0), ((4203495, 4203514), 1), ((4203519, 4203558), 1), ((4203563, 4203585), 1),
((4203590, 4203609), 0), ((4203614, 4203636), 1), ((4203641, 4203651), 1), ((4203319, 4203341), 1),
((4203346, 4203366), 1), ((4203371, 4203398), 0), ((4203403, 4203428), 0), ((4203433, 4203457), 1),
((4203462, 4203490), 0), ((4203495, 4203514), 1), ((4203519, 4203558), 1), ((4203563, 4203585), 1),
((4203590, 4203609), 0), ((4203614, 4203636), 1), ((4203641, 4203651), 1), ((4203319, 4203341), 1),
((4203346, 4203366), 1), ((4203371, 4203398), 0), ((4203403, 4203428), 0), ((4203433, 4203457), 1),
((4203462, 4203490), 0), ((4203495, 4203514), 1), ((4203519, 4203558), 1), ((4203563, 4203585), 1),
((4203590, 4203609), 0), ((4203614, 4203636), 1), ((4203641, 4203651), 1), ((4203319, 4203341), 1),
((4203346, 4203366), 1), ((4203371, 4203398), 0), ((4203403, 4203428), 0), ((4203433, 4203457), 1),
((4203462, 4203490), 0), ((4203495, 4203514), 1), ((4203519, 4203558), 1), ((4203563, 4203585), 1),
((4203590, 4203609), 0), ((4203614, 4203636), 1), ((4203641, 4203651), 1), ((4203319, 4203341), 1),
((4203346, 4203366), 1), ((4203371, 4203398), 0), ((4203403, 4203428), 0), ((4203433, 4203457), 1),
((4203462, 4203490), 0), ((4203495, 4203514), 1), ((4203519, 4203558), 1), ((4203563, 4203585), 1),
((4203590, 4203609), 0), ((4203614, 4203636), 1), ((4203641, 4203651), 1), ((4203319, 4203341), 1),
((4203346, 4203366), 1), ((4203371, 4203398), 0), ((4203403, 4203428), 0), ((4203433, 4203457), 1),
((4203462, 4203490), 0), ((4203495, 4203514), 1), ((4203519, 4203558), 1), ((4203563, 4203585), 1),
((4203590, 4203609), 0), ((4203614, 4203636), 1), ((4203641, 4203651), 1), ((4203319, 4203341), 1),
((4203346, 4203366), 1), ((4203371, 4203398), 0), ((4203403, 4203428), 0), ((4203433, 4203457), 1),
((4203462, 4203490), 0), ((4203495, 4203514), 1), ((4203519, 4203558), 1), ((4203563, 4203585), 1),
((4203590, 4203609), 0), ((4203614, 4203636), 1), ((4203641, 4203651), 1), ((4203319, 4203341), 1),
((4203346, 4203366), 1), ((4203371, 4203398), 0), ((4203403, 4203428), 0), ((4203433, 4203457), 1),
((4203462, 4203490), 0), ((4203495, 4203514), 1), ((4203519, 4203558), 1), ((4203563, 4203585), 1),
((4203590, 4203609), 0), ((4203614, 4203636), 1), ((4203641, 4203651), 1), ((4203319, 4203341), 1),
((4203346, 4203366), 1), ((4203371, 4203398), 1), ((4203403, 4203428), 1), ((4203656, 4203689), 1),
((4203694, 4203737), 1), ((4203742, 4203776), 1), ((4203781, 4203804), 1), ((4203809, 4203831), 1),
((4203836, 4203856), 1), ((4203861, 4203888), 0), ((4203893, 4203918), 0), ((4203923, 4203957), 0),
((4203962, 4203981), 1), ((4203986, 4204025), 1), ((4204030, 4204040), 1), ((4204045, 4204067), 1),
((4204072, 4204091), 0), ((4204096, 4204118), 1), ((4204123, 4204133), 1), ((4203809, 4203831), 1),
((4203836, 4203856), 1), ((4203861, 4203888), 0), ((4203893, 4203918), 0), ((4203923, 4203957), 0),
((4203962, 4203981), 1), ((4203986, 4204025), 1), ((4204030, 4204040), 1), ((4204045, 4204067), 1),
((4204072, 4204091), 0), ((4204096, 4204118), 1), ((4204123, 4204133), 1), ((4203809, 4203831), 1),
((4203836, 4203856), 1), ((4203861, 4203888), 0), ((4203893, 4203918), 0), ((4203923, 4203957), 0),
((4203962, 4203981), 1), ((4203986, 4204025), 1), ((4204030, 4204040), 1), ((4204045, 4204067), 1),
((4204072, 4204091), 0), ((4204096, 4204118), 1), ((4204123, 4204133), 1), ((4203809, 4203831), 1),
((4203836, 4203856), 1), ((4203861, 4203888), 0), ((4203893, 4203918), 0), ((4203923, 4203957), 0),
((4203962, 4203981), 1), ((4203986, 4204025), 1), ((4204030, 4204040), 1), ((4204045, 4204067), 1),
((4204072, 4204091), 0), ((4204096, 4204118), 1), ((4204123, 4204133), 1), ((4203809, 4203831), 1),
((4203836, 4203856), 1), ((4203861, 4203888), 0), ((4203893, 4203918), 0), ((4203923, 4203957), 0),
((4203962, 4203981), 1), ((4203986, 4204025), 1), ((4204030, 4204040), 1), ((4204045, 4204067), 1),
((4204072, 4204091), 0), ((4204096, 4204118), 1), ((4204123, 4204133), 1), ((4203809, 4203831), 1),
((4203836, 4203856), 1), ((4203861, 4203888), 0), ((4203893, 4203918), 0), ((4203923, 4203957), 0),
((4203962, 4203981), 1), ((4203986, 4204025), 1), ((4204030, 4204040), 1), ((4204045, 4204067), 1),
((4204072, 4204091), 0), ((4204096, 4204118), 1), ((4204123, 4204133), 1), ((4203809, 4203831), 1),
((4203836, 4203856), 1), ((4203861, 4203888), 0), ((4203893, 4203918), 0), ((4203923, 4203957), 0),
((4203962, 4203981), 1), ((4203986, 4204025), 1), ((4204030, 4204040), 1), ((4204045, 4204067), 1),
((4204072, 4204091), 0), ((4204096, 4204118), 1), ((4204123, 4204133), 1), ((4203809, 4203831), 1),
((4203836, 4203856), 1), ((4203861, 4203888), 0), ((4203893, 4203918), 0), ((4203923, 4203957), 0),
((4203962, 4203981), 1), ((4203986, 4204025), 1), ((4204030, 4204040), 1), ((4204045, 4204067), 1),
((4204072, 4204091), 0), ((4204096, 4204118), 1), ((4204123, 4204133), 1), ((4203809, 4203831), 1),
((4203836, 4203856), 1), ((4203861, 4203888), 0), ((4203893, 4203918), 0), ((4203923, 4203957), 0),
((4203962, 4203981), 1), ((4203986, 4204025), 1), ((4204030, 4204040), 1), ((4204045, 4204067), 1),
((4204072, 4204091), 0), ((4204096, 4204118), 1), ((4204123, 4204133), 1), ((4203809, 4203831), 1),
((4203836, 4203856), 1), ((4203861, 4203888), 0), ((4203893, 4203918), 0), ((4203923, 4203957), 0),
((4203962, 4203981), 1), ((4203986, 4204025), 1), ((4204030, 4204040), 1), ((4204045, 4204067), 1),
((4204072, 4204091), 0), ((4204096, 4204118), 1), ((4204123, 4204133), 1), ((4203809, 4203831), 1),
((4203836, 4203856), 1), ((4203861, 4203888), 0), ((4203893, 4203918), 0), ((4203923, 4203957), 0),
((4203962, 4203981), 1), ((4203986, 4204025), 1), ((4204030, 4204040), 1), ((4204045, 4204067), 1),
((4204072, 4204091), 0), ((4204096, 4204118), 1), ((4204123, 4204133), 1), ((4203809, 4203831), 1),
((4203836, 4203856), 1), ((4203861, 4203888), 0), ((4203893, 4203918), 0), ((4203923, 4203957), 0),
((4203962, 4203981), 1), ((4203986, 4204025), 1), ((4204030, 4204040), 1), ((4204045, 4204067), 1),
((4204072, 4204091), 0), ((4204096, 4204118), 1), ((4204123, 4204133), 1), ((4203809, 4203831), 1),
((4203836, 4203856), 1), ((4203861, 4203888), 0), ((4203893, 4203918), 0), ((4203923, 4203957), 0),
((4203962, 4203981), 1), ((4203986, 4204025), 1), ((4204030, 4204040), 1), ((4204045, 4204067), 1),
((4204072, 4204091), 0), ((4204096, 4204118), 1), ((4204123, 4204133), 1), ((4203809, 4203831), 1),
((4203836, 4203856), 1), ((4203861, 4203888), 0), ((4203893, 4203918), 0), ((4203923, 4203957), 0),
((4203962, 4203981), 1), ((4203986, 4204025), 1), ((4204030, 4204040), 1), ((4204045, 4204067), 1),
((4204072, 4204091), 0), ((4204096, 4204118), 1), ((4204123, 4204133), 1), ((4203809, 4203831), 1),
((4203836, 4203856), 1), ((4203861, 4203888), 0), ((4203893, 4203918), 0), ((4203923, 4203957), 0),
((4203962, 4203981), 1), ((4203986, 4204025), 1), ((4204030, 4204040), 1), ((4204045, 4204067), 1),
((4204072, 4204091), 0), ((4204096, 4204118), 1), ((4204123, 4204133), 1), ((4203809, 4203831), 1),
((4203836, 4203856), 1), ((4203861, 4203888), 0), ((4203893, 4203918), 0), ((4203923, 4203957), 0),
((4203962, 4203981), 1), ((4203986, 4204025), 1), ((4204030, 4204040), 1), ((4204045, 4204067), 1),
((4204072, 4204091), 0), ((4204096, 4204118), 1), ((4204123, 4204133), 1), ((4203809, 4203831), 1),
((4203836, 4203856), 1), ((4203861, 4203888), 0), ((4203893, 4203918), 0), ((4203923, 4203957), 0),
((4203962, 4203981), 1), ((4203986, 4204025), 1), ((4204030, 4204040), 1), ((4204045, 4204067), 1),
((4204072, 4204091), 0), ((4204096, 4204118), 1), ((4204123, 4204133), 1), ((4203809, 4203831), 1),
((4203836, 4203856), 1), ((4203861, 4203888), 1), ((4203893, 4203918), 1), ((4204138, 4204171), 1)]
tbs = [(4203066, 4203066), (4203071, 4203098), (4203103, 4203157), (4203162, 4203314), (4203319, 4203341), (4203346, 4203366), (4203371, 4203398), (4203403, 4203428), (4203433, 4203457), (4203462, 4203490), (4203495, 4203514), (4203519, 4203558), (4203563, 4203585), (4203590, 4203609), (4203614, 4203636), (4203641, 4203651), (4203656, 4203689), (4203694, 4203737), (4203742, 4203776), (4203781, 4203804), (4203809, 4203831), (4203836, 4203856), (4203861, 4203888), (4203893, 4203918), (4203923, 4203957), (4203962, 4203981), (4203986, 4204025), (4204030, 4204040), (4204045, 4204067), (4204072, 4204091), (4204096, 4204118), (4204123, 4204133), (4204138, 4204171), (4204176, 4204182)]
fbs = [(4202133, 4202159), (4202165, 4202165), (4202170, 4202187), (4202193, 4202193), (4202198, 4202215), (4202221, 4202221), (4202226, 4202243), (4202249, 4202249), (4202254, 4202271), (4202277, 4202277), (4202282, 4202299), (4202305, 4202305), (4202310, 4202327), (4202333, 4202333), (4202338, 4202355), (4202361, 4202361), (4202366, 4202383), (4202389, 4202389), (4202394, 4202411), (4202417, 4202417), (4202422, 4202439), (4202445, 4202445), (4202450, 4202467), (4202473, 4202473), (4202478, 4202495), (4202501, 4202501), (4202506, 4202523), (4202529, 4202529), (4202534, 4202551), (4202557, 4202557), (4202562, 4202579), (4202585, 4202585), (4202590, 4202607), (4202613, 4202613), (4202618, 4202635), (4202641, 4202641), (4202646, 4202663), (4202669, 4202669), (4202674, 4202691), (4202697, 4202697), (4202702, 4202719), (4202725, 4202725), (4202730, 4202747), (4202753, 4202753), (4202758, 4202775), (4202781, 4202781), (4202786, 4202803), (4202809, 4202809), (4202814, 4202831), (4202837, 4202837), (4202842, 4202859), (4202865, 4202865), (4202870, 4202887), (4202893, 4202893), (4202898, 4202915), (4202921, 4202921), (4202926, 4202943), (4202949, 4202949), (4202954, 4202971), (4202977, 4202977), (4202982, 4202999), (4203005, 4203005), (4203010, 4203027), (4203033, 4203033), (4203038, 4203055), (4203061, 4203061), (4204183, 4204183)]

block_info = {}
for s, _ in tbs:
    block_info[s] = {"finish": False, "ret": False}

with bv.undoable_transaction():  # changes made inside the with block can be undone
    for _, end in fbs:
        nop(end)

    for start, end in tbs:
        bb = f.get_basic_block_at(start)
        if bb is None:
            continue

        dont_patch = False
        a = start
        while a < bb.end and a <= end:  # walk the instructions of the current block
            text = disasm(a)
            mnem = text.split(None, 1)[0] if text else ""  # mnemonic: "cmovz eax, ebx" -> "cmovz"

            if mnem.startswith("cmov"):  # cmov selects the next state, i.e. a conditional branch
                nop(a)
                dont_patch = True
            elif mnem == "ret":
                block_info[start]["ret"] = True
                dont_patch = True

            a += insn_len(a)

        if not dont_patch:
            nop(end)
            block_info[start]["finish"] = True

    write_branch(pre_block, f"jmp {hex(tb_path[0][0][0])}")  # jump from pre_block to the first block in tb_path

    for i in range(len(tb_path) - 1):
        (cur_start, cur_end), zf = tb_path[i]
        nxt_start = tb_path[i + 1][0][0]

        if block_info[cur_start]["finish"] or block_info[cur_start]["ret"]:
            continue

        fallthrough = cur_end + insn_len(cur_end)
        if fallthrough == nxt_start:
            continue

        op = ("jnz", "jz")[zf]
        # zf == 0 maps to jnz, zf == 1 maps to jz
        if write_branch(cur_end, f"{op} {hex(nxt_start)}"):
            block_info[cur_start]["finish"] = True
        else:
            print(f"[skip] branch too large @ {hex(cur_end)} -> {hex(nxt_start)}")
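Before patching, it can help to collapse the trace into a successor map and look at which (block, ZF) pairs lead where. The sketch below does that on a short made-up trace in the same ((start, end), zf) format; `edges_from_trace`, `branch_for`, and the addresses are illustrative and not part of the script above.

```python
def edges_from_trace(tb_path):
    """Map (block_start, zf) -> set of successor block starts seen in the trace."""
    edges = {}
    for (cur, zf), (nxt, _) in zip(tb_path, tb_path[1:]):
        edges.setdefault((cur[0], zf), set()).add(nxt[0])
    return edges

def branch_for(zf):
    """Mnemonic that reproduces the observed direction: jnz for ZF == 0, jz for ZF == 1."""
    return ("jnz", "jz")[zf]

# Hypothetical trace: block 0x20 is a loop head seen with both ZF values.
trace = [
    ((0x10, 0x1A), 0),
    ((0x20, 0x2A), 1),
    ((0x30, 0x3A), 1),
    ((0x20, 0x2A), 0),
    ((0x40, 0x4A), 1),
]

for (start, zf), targets in sorted(edges_from_trace(trace).items()):
    for target in sorted(targets):
        print(f"{hex(start)}: {branch_for(zf)} {hex(target)}")
```

A block that shows up with two different ZF values is a genuine conditional, and a (block, ZF) pair with more than one target would mean ZF alone does not determine the successor, so that block would need a closer look. A single emulation run also only covers the paths that run actually took.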

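Emulation

Unicorn is like hand-building a minimal CPU runtime: you prepare the memory, code, registers and stack yourself, start execution at some address, and observe the run through hooks.

Basic skeleton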

from unicorn import *
from unicorn.x86_const import *

# Basic address layout
BASE = 0x400000
CODE_ADDR = BASE
CODE_SIZE = 0x1000

STACK_ADDR = 0x70000000
STACK_SIZE = 0x10000

# A short piece of machine code
# mov eax, 1
# add eax, 2
# ret
CODE = b"\xB8\x01\x00\x00\x00" \
       b"\x83\xC0\x02" \
       b"\xC3"

# Create the emulator
uc = Uc(UC_ARCH_X86, UC_MODE_64)

# Map memory
uc.mem_map(CODE_ADDR, CODE_SIZE)
uc.mem_map(STACK_ADDR, STACK_SIZE)

# Write the code
uc.mem_write(CODE_ADDR, CODE)

# Initialize the stack
rsp = STACK_ADDR + STACK_SIZE - 0x8
uc.reg_write(UC_X86_REG_RSP, rsp)

# Give ret a return address; otherwise it pops whatever is on top of the stack and jumps
# there, which will most likely crash. A mapped address is used so it can double as the
# stop address for emu_start.
RETURN_ADDR = CODE_ADDR + 0x100
uc.mem_write(rsp, RETURN_ADDR.to_bytes(8, "little"))

# Code execution hook
def hook_code(uc, address, size, user_data):
    print(f"[exec] 0x{address:x}, size={size}")

uc.hook_add(UC_HOOK_CODE, hook_code)  # register the hook

# Start emulation; it stops once ret has jumped to RETURN_ADDR
try:
    uc.emu_start(CODE_ADDR, RETURN_ADDR)
except UcError as e:
    print("Emulation error:", e)

eax = uc.reg_read(UC_X86_REG_EAX)
print(f"EAX = {eax}")


http://example.com/2026/03/30/ollvm/
Author: Eleven
Published: March 30, 2026