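Preface
I had run into decompiling Python bytecode while reversing before, but never took it seriously and just looked up whatever I didn't know. Then it showed up at the provincial contest and caught me completely off guard, though luckily I still solved it.
I'm taking this opportunity to study the topic systematically so I don't get tripped up next time. This post uses Python 3.8.5; other versions generate slightly different bytecode, but the overall picture is much the same.
[Write-up] The 4th Zhejiang Provincial College Student Cybersecurity Skills Challenge (2021):
- CSDN
- Personal blog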
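What is Python bytecode?
Python source code is first compiled to bytecode, which the Python virtual machine then executes. Bytecode is an intermediate language that resembles assembly instructions: one Python statement corresponds to several bytecode instructions, and the virtual machine executes them one at a time to run the program.
Python's dis module can disassemble Python code and show the resulting bytecode instructions.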
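As a quick way to try this, here is a minimal sketch that disassembles a two-line snippet and captures the listing as text. The snippet and the variable names are made up for illustration, and the exact instructions printed depend on the interpreter version.

```python
import dis
import io

# A tiny, made-up snippet; dis.dis() compiles source strings before disassembling.
source = "x = 1\nprint(x + 2)\n"

buffer = io.StringIO()
dis.dis(source, file=buffer)   # write the listing into the buffer instead of stdout
listing = buffer.getvalue()
print(listing)
```

Passing file= is handy when you want to save or diff listings instead of reading them in the terminal.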
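Structure of each line of dis output:
source line number | offset of the instruction within the code object | instruction name | instruction argument | resolved argument value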
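The same five columns can be pulled out programmatically. The following is only a sketch: show_columns is a hypothetical helper name of mine, and it uses dis.findlinestarts for the line number column because that works the same way across versions.

```python
import dis

def show_columns(code):
    """Return (line, offset, opname, arg, argrepr) tuples for a code object."""
    line_starts = dict(dis.findlinestarts(code))   # offset -> source line number
    rows = []
    for ins in dis.get_instructions(code):
        # Like dis, only show the line number on the first instruction of a line.
        line = line_starts.get(ins.offset, "")
        rows.append((line, ins.offset, ins.opname, ins.arg, ins.argrepr))
    return rows

code = compile("total = 3\n", "<example>", "exec")
rows = show_columns(code)
for row in rows:
    print(row)
```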
Source (the listing below starts with the import of dis on line 1):
    import dis

    str = [88, 117, 124, 124, 127, 48, 71, 127, 98, 124, 116, 48, 61, 61, 121, 116, 33, 32, 100, 62]

    def test():
        for st in str:
            print(chr(st^16),end='')

    test()
Bytecode:
      1           0 LOAD_CONST               0 (0)
                  2 LOAD_CONST               1 (None)
                  4 IMPORT_NAME              0 (dis)
                  6 STORE_NAME               0 (dis)

      3           8 LOAD_CONST               2 (88)
                 10 LOAD_CONST               3 (117)
                 12 LOAD_CONST               4 (124)
                 14 LOAD_CONST               4 (124)
                 16 LOAD_CONST               5 (127)
                 18 LOAD_CONST               6 (48)
                 20 LOAD_CONST               7 (71)
                 22 LOAD_CONST               5 (127)
                 24 LOAD_CONST               8 (98)
                 26 LOAD_CONST               4 (124)
                 28 LOAD_CONST               9 (116)
                 30 LOAD_CONST               6 (48)
                 32 LOAD_CONST              10 (61)
                 34 LOAD_CONST              10 (61)
                 36 LOAD_CONST              11 (121)
                 38 LOAD_CONST               9 (116)
                 40 LOAD_CONST              12 (33)
                 42 LOAD_CONST              13 (32)
                 44 LOAD_CONST              14 (100)
                 46 LOAD_CONST              15 (62)
                 48 BUILD_LIST              20
                 50 STORE_NAME               1 (str)

      5          52 LOAD_CONST              16 (<code object test at 0x0170E2F8, file "1.py", line 5>)
                 54 LOAD_CONST              17 ('test')
                 56 MAKE_FUNCTION            0
                 58 STORE_NAME               2 (test)

      9          60 LOAD_NAME                2 (test)
                 62 CALL_FUNCTION            0
                 64 POP_TOP
                 66 LOAD_CONST               1 (None)
                 68 RETURN_VALUE

    Disassembly of <code object test at 0x0170E2F8, file "1.py", line 5>:
      6           0 LOAD_GLOBAL              0 (str)
                  2 GET_ITER
            >>    4 FOR_ITER                24 (to 30)
                  6 STORE_FAST               0 (st)

      7           8 LOAD_GLOBAL              1 (print)
                 10 LOAD_GLOBAL              2 (chr)
                 12 LOAD_FAST                0 (st)
                 14 LOAD_CONST               1 (16)
                 16 BINARY_XOR
                 18 CALL_FUNCTION            1
                 20 LOAD_CONST               2 ('')
                 22 LOAD_CONST               3 (('end',))
                 24 CALL_FUNCTION_KW         2
                 26 POP_TOP
                 28 JUMP_ABSOLUTE            4
            >>   30 LOAD_CONST               0 (None)
                 32 RETURN_VALUE
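In a challenge you usually get only the listing, so the job is to read it back into source. As a sketch of that step: the constants between offsets 8 and 46 give the list, and BINARY_XOR with the constant 16 gives the transform. The names data and decode are my own, not part of the original script.

```python
# Values read off the LOAD_CONST instructions that feed BUILD_LIST 20.
data = [88, 117, 124, 124, 127, 48, 71, 127, 98, 124,
        116, 48, 61, 61, 121, 116, 33, 32, 100, 62]

def decode(values, key=16):
    # Mirrors LOAD_FAST st / LOAD_CONST 16 / BINARY_XOR / chr(...) in the loop body.
    return "".join(chr(v ^ key) for v in values)

decoded = decode(data)
print(decoded)  # Hello World --id10t.
```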
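Variables
1. CONST
LOAD_CONST pushes a constant (a number, a string, and so on) onto the stack, typically as an argument for a function call:
     11          52 LOAD_NAME                2 (test)
                 54 LOAD_CONST              16 ('nice')
                 56 CALL_FUNCTION            1
                 58 POP_TOP
which corresponds to:
    test('nice')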
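2. Local variables
LOAD_FAST usually loads (reads) the value of a local variable, for a computation or as a call argument; STORE_FAST usually saves a value into a local variable. The excerpt below comes from Python 2, hence the 3-byte offsets and INPLACE_DIVIDE; Python 3 emits INPLACE_TRUE_DIVIDE for the same statement.
     61          77 LOAD_FAST                0 (n)
                 80 LOAD_FAST                3 (p)
                 83 INPLACE_DIVIDE
                 84 STORE_FAST               0 (n)
which corresponds to:
    n /= p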
That raises a question: function parameters are local variables too, so how do we tell a parameter apart from any other local variable?
We can write a bit of code and work it out:
    import dis

    str = ''

    def test(arg):
        str = 'idi10t'
        print(arg,str)

    dis.dis(test)
Output:
      6           0 LOAD_CONST               1 ('idi10t')
                  2 STORE_FAST               1 (str)

      7           4 LOAD_GLOBAL              0 (print)
                  6 LOAD_FAST                0 (arg)
                  8 LOAD_FAST                1 (str)
                 10 CALL_FUNCTION            2
                 12 POP_TOP
                 14 LOAD_CONST               0 (None)
                 16 RETURN_VALUE
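The conclusion: a parameter is never initialized inside the function. If there is no STORE_FAST for a variable between the start of the function and its first LOAD_FAST, the variable is a parameter; any other local variable has to be initialized with STORE_FAST before it is used. The slot index is a second hint, because parameters occupy the lowest local slots, which is why arg is slot 0 and str is slot 1 in the listing.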
3. Global variables
LOAD_GLOBAL loads a global, which includes global symbols such as function names, class names, and module names; STORE_GLOBAL assigns to a global variable.
    import dis

    def test():
        global str
        str = 'idi10t'
        print(str)

    dis.dis(test)
Output:
      5           0 LOAD_CONST               1 ('idi10t')
                  2 STORE_GLOBAL             0 (str)

      6           4 LOAD_GLOBAL              1 (print)
                  6 LOAD_GLOBAL              0 (str)
                  8 CALL_FUNCTION            1
                 10 POP_TOP
                 12 LOAD_CONST               0 (None)
                 14 RETURN_VALUE
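The names that LOAD_GLOBAL and STORE_GLOBAL refer to live in co_names. Here is a small sketch that lists the global accesses of a function; set_flag and flag are made-up names for the example.

```python
import dis

def set_flag():
    global flag
    flag = 'idi10t'
    return flag

# Pair each instruction name with its resolved argument value.
ops = [(ins.opname, ins.argval) for ins in dis.get_instructions(set_flag)]
global_ops = [op for op in ops if op[0] in ("LOAD_GLOBAL", "STORE_GLOBAL")]
print(global_ops)
print(set_flag.__code__.co_names)
```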
Common data types
list
BUILD_LIST creates a list; its argument is the number of items to pop off the stack and put into it:
    str = [88, 117, 124, 124, 127, 48, 71, 127, 98, 124, 116, 48, 61, 61, 121, 116, 33, 32, 100, 62]
Bytecode:
      3           8 LOAD_CONST               2 (88)
                 10 LOAD_CONST               3 (117)
                 12 LOAD_CONST               4 (124)
                 14 LOAD_CONST               4 (124)
                 16 LOAD_CONST               5 (127)
                 18 LOAD_CONST               6 (48)
                 20 LOAD_CONST               7 (71)
                 22 LOAD_CONST               5 (127)
                 24 LOAD_CONST               8 (98)
                 26 LOAD_CONST               4 (124)
                 28 LOAD_CONST               9 (116)
                 30 LOAD_CONST               6 (48)
                 32 LOAD_CONST              10 (61)
                 34 LOAD_CONST              10 (61)
                 36 LOAD_CONST              11 (121)
                 38 LOAD_CONST               9 (116)
                 40 LOAD_CONST              12 (33)
                 42 LOAD_CONST              13 (32)
                 44 LOAD_CONST              14 (100)
                 46 LOAD_CONST              15 (62)
                 48 BUILD_LIST              20
                 50 STORE_NAME               1 (str)
Now let's look at another way of creating a list:
    str = [88, 117, 124, 124, 127, 48, 71, 127, 98, 124, 116, 48, 61, 61, 121, 116, 33, 32, 100, 62]

    [x for x in str if x != 48]
Bytecode:
      1           0 LOAD_CONST               0 (88)
                  2 LOAD_CONST               1 (117)
                  4 LOAD_CONST               2 (124)
                  6 LOAD_CONST               2 (124)
                  8 LOAD_CONST               3 (127)
                 10 LOAD_CONST               4 (48)
                 12 LOAD_CONST               5 (71)
                 14 LOAD_CONST               3 (127)
                 16 LOAD_CONST               6 (98)
                 18 LOAD_CONST               2 (124)
                 20 LOAD_CONST               7 (116)
                 22 LOAD_CONST               4 (48)
                 24 LOAD_CONST               8 (61)
                 26 LOAD_CONST               8 (61)
                 28 LOAD_CONST               9 (121)
                 30 LOAD_CONST               7 (116)
                 32 LOAD_CONST              10 (33)
                 34 LOAD_CONST              11 (32)
                 36 LOAD_CONST              12 (100)
                 38 LOAD_CONST              13 (62)
                 40 BUILD_LIST              20
                 42 STORE_NAME               0 (str)

      3          44 LOAD_CONST              14 (<code object <listcomp> at 0x016FE2F8, file "1.py", line 3>)
                 46 LOAD_CONST              15 ('<listcomp>')
                 48 MAKE_FUNCTION            0
                 50 LOAD_NAME                0 (str)
                 52 GET_ITER
                 54 CALL_FUNCTION            1
                 56 POP_TOP
                 58 LOAD_CONST              16 (None)
                 60 RETURN_VALUE

    Disassembly of <code object <listcomp> at 0x016FE2F8, file "1.py", line 3>:
      3           0 BUILD_LIST               0    # create the empty result list
                  2 LOAD_FAST                0 (.0)
            >>    4 FOR_ITER                16 (to 22)
                  6 STORE_FAST               1 (x)
                  8 LOAD_FAST                1 (x)
                 10 LOAD_CONST               0 (48)
                 12 COMPARE_OP               3 (!=)
                 14 POP_JUMP_IF_FALSE        4    # condition false: skip to the next iteration
                 16 LOAD_FAST                1 (x)    # read the x that passed the condition
                 18 LIST_APPEND              2    # append each x that passed to the list
                 20 JUMP_ABSOLUTE            4
            >>   22 RETURN_VALUE
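To make the <listcomp> code object easier to read, here is roughly what it does written as an ordinary function. This is my own hand expansion, not decompiler output; the parameter plays the role of the hidden .0 argument, which receives the iterator produced by GET_ITER.

```python
values = [88, 117, 124, 124, 127, 48, 71, 127, 98, 124,
          116, 48, 61, 61, 121, 116, 33, 32, 100, 62]

def listcomp_equivalent(iterator):
    result = []                  # BUILD_LIST 0
    for x in iterator:           # FOR_ITER, STORE_FAST x
        if x != 48:              # COMPARE_OP !=, POP_JUMP_IF_FALSE back to FOR_ITER
            result.append(x)     # LOAD_FAST x, LIST_APPEND
    return result                # RETURN_VALUE once the iterator is exhausted

expanded = listcomp_equivalent(iter(values))
print(expanded)
```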
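dict
BUILD_MAP creates a dict. Here BUILD_MAP 1 pops one key/value pair off the stack and builds a dict holding that entry, STORE_NAME binds the dict to the name str, and STORE_SUBSCR performs the later item assignment.
    str = {'name' : 'id10t'}
    str['age'] = 3
Bytecode:
      1           0 LOAD_CONST               0 ('name')
                  2 LOAD_CONST               1 ('id10t')
                  4 BUILD_MAP                1
                  6 STORE_NAME               0 (str)

      2           8 LOAD_CONST               2 (3)
                 10 LOAD_NAME                0 (str)
                 12 LOAD_CONST               3 ('age')
                 14 STORE_SUBSCR
                 16 LOAD_CONST               4 (None)
                 18 RETURN_VALUE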
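A quick way to confirm which instructions carry the dict logic is to collect the opcode names and run the same two lines. This is a sketch using info as a made-up variable name; other literals, such as dicts with several constant keys, may compile to different instructions.

```python
import dis

source = "info = {'name': 'id10t'}\ninfo['age'] = 3\n"
code = compile(source, "<example>", "exec")

# Instruction names only, in order of appearance.
opnames = [ins.opname for ins in dis.get_instructions(code)]
print(opnames)

# Run the snippet to see the resulting dict.
namespace = {}
exec(code, namespace)
print(namespace["info"])
```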
slice
The data here is borrowed directly from another author's post. It is Python 2 bytecode, as the 3-byte instruction offsets show.
BUILD_SLICE creates a slice; lists, tuples, and strings can all be accessed by slice.
Note that in this listing BUILD_SLICE is used for slices of the form [x:y:z], combined with BINARY_SUBSCR to read the slice and with STORE_SUBSCR to modify it.
SLICE+n is used for [a:b]-style reads and STORE_SLICE+n for [a:b]-style writes. These opcodes exist only in Python 2; Python 3.8 uses BUILD_SLICE for both forms. The meaning of n is:
    SLICE+0    Implements TOS = TOS[:]
    SLICE+1    Implements TOS = TOS1[TOS:]
    SLICE+2    Implements TOS = TOS1[:TOS]
    SLICE+3    Implements TOS = TOS2[TOS1:TOS]
Bytecode:
     13           0 LOAD_CONST               1 (1)
                  3 LOAD_CONST               2 (2)
                  6 LOAD_CONST               3 (3)
                  9 BUILD_LIST               3
                 12 STORE_FAST               0 (k1)    # k1 = [1, 2, 3]

     14          15 LOAD_CONST               4 (10)
                 18 BUILD_LIST               1
                 21 LOAD_FAST                0 (k1)
                 24 LOAD_CONST               5 (0)
                 27 LOAD_CONST               1 (1)
                 30 LOAD_CONST               1 (1)
                 33 BUILD_SLICE              3
                 36 STORE_SUBSCR                       # k1[0:1:1] = [10]

     15          37 LOAD_CONST               6 (11)
                 40 BUILD_LIST               1
                 43 LOAD_FAST                0 (k1)
                 46 LOAD_CONST               1 (1)
                 49 LOAD_CONST               2 (2)
                 52 STORE_SLICE+3                      # k1[1:2] = [11]

     16          53 LOAD_FAST                0 (k1)
                 56 LOAD_CONST               1 (1)
                 59 LOAD_CONST               2 (2)
                 62 SLICE+3
                 63 STORE_FAST               1 (a)     # a = k1[1:2]

     17          66 LOAD_FAST                0 (k1)
                 69 LOAD_CONST               5 (0)
                 72 LOAD_CONST               1 (1)
                 75 LOAD_CONST               1 (1)
                 78 BUILD_SLICE              3
                 81 BINARY_SUBSCR
                 82 STORE_FAST               2 (b)     # b = k1[0:1:1]
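Putting the annotations from the listing back together gives the following source, which runs unchanged on Python 3. The last line is my own addition, showing that the bracket syntax is shorthand for an explicit slice object, which is what BUILD_SLICE constructs.

```python
k1 = [1, 2, 3]
k1[0:1:1] = [10]        # BUILD_SLICE 3 + STORE_SUBSCR
k1[1:2] = [11]          # STORE_SLICE+3 in the Python 2 listing
a = k1[1:2]             # SLICE+3 in the Python 2 listing
b = k1[0:1:1]           # BUILD_SLICE 3 + BINARY_SUBSCR

same_as_a = k1[slice(1, 2)]   # explicit slice object, equivalent to k1[1:2]
print(k1, a, b, same_as_a)
```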
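Loops
while
Python 3.8 and later no longer have SETUP_LOOP. In earlier versions it roughly meant pushing a block for the loop onto the block stack. In 3.8 a while loop is simply a conditional jump out of the loop plus a JUMP_ABSOLUTE back to the test:
    i = 0
    while i < 10:
        i += 1
Bytecode:
      1           0 LOAD_CONST               0 (0)
                  2 STORE_NAME               0 (i)

      2     >>    4 LOAD_NAME                0 (i)
                  6 LOAD_CONST               1 (10)
                  8 COMPARE_OP               0 (<)
                 10 POP_JUMP_IF_FALSE       22

      3          12 LOAD_NAME                0 (i)
                 14 LOAD_CONST               2 (1)
                 16 INPLACE_ADD
                 18 STORE_NAME               0 (i)
                 20 JUMP_ABSOLUTE            4
            >>   22 LOAD_CONST               3 (None)
                 24 RETURN_VALUE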
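Without SETUP_LOOP, a practical way to spot a loop in a listing is to look for a jump whose target lies before the jump itself (JUMP_ABSOLUTE 4 at offset 20 above). Below is a rough sketch of that heuristic; backward_jumps is a hypothetical helper of mine, and the opcode that performs the backward jump differs between Python versions.

```python
import dis

def backward_jumps(code):
    """Return (offset, opname, target) for every jump that goes backwards."""
    jump_opcodes = set(dis.hasjabs) | set(dis.hasjrel)
    found = []
    for ins in dis.get_instructions(code):
        # For jump instructions, argval is the resolved target offset.
        if ins.opcode in jump_opcodes and ins.argval < ins.offset:
            found.append((ins.offset, ins.opname, ins.argval))
    return found

loop_code = compile("i = 0\nwhile i < 10:\n    i += 1\n", "<example>", "exec")
straight_code = compile("i = 0\ni += 1\n", "<example>", "exec")

print(backward_jumps(loop_code))
print(backward_jumps(straight_code))
```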
for
The typical for ... in structure in Python:
    for i in range(8):
        ...
Bytecode:
      2           4 LOAD_NAME                1 (range)
                  6 LOAD_CONST               1 (8)
                  8 CALL_FUNCTION            1
                 10 GET_ITER
            >>   12 FOR_ITER                38 (to 52)
                 14 STORE_NAME               2 (i)
                 ...
                 50 JUMP_ABSOLUTE           12
            >>   52 LOAD_CONST               6 (None)
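GET_ITER and FOR_ITER map onto the iterator protocol, so the loop can be spelled out by hand. This is an illustrative expansion of my own, not what the compiler literally emits; the seen list stands in for the elided loop body.

```python
def for_loop_expanded(iterable):
    seen = []
    iterator = iter(iterable)        # GET_ITER
    while True:
        try:
            i = next(iterator)       # FOR_ITER pushes the next value ...
        except StopIteration:
            break                    # ... or jumps past the loop when exhausted
        seen.append(i)               # STORE_NAME i, then the loop body
    return seen

print(for_loop_expanded(range(8)))
```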
if
POP_JUMP_IF_FALSE and JUMP_FORWARD are the usual instructions for branching:
POP_JUMP_IF_FALSE pops the condition result and, if it is false, jumps to the target offset given as its argument; JUMP_FORWARD jumps unconditionally, and its argument is a distance relative to the next instruction (dis prints the resolved target as "to 48").
    i = 0
    if i < 5:
        print('i < 5')
    elif i > 5:
        print('i > 5')
    else:
        print('i = 5')
Bytecode:
      1           0 LOAD_CONST               0 (0)
                  2 STORE_NAME               0 (i)

      2           4 LOAD_NAME                0 (i)
                  6 LOAD_CONST               1 (5)
                  8 COMPARE_OP               0 (<)
                 10 POP_JUMP_IF_FALSE       22

      3          12 LOAD_NAME                1 (print)
                 14 LOAD_CONST               2 ('i < 5')
                 16 CALL_FUNCTION            1
                 18 POP_TOP
                 20 JUMP_FORWARD            26 (to 48)

      4     >>   22 LOAD_NAME                0 (i)
                 24 LOAD_CONST               1 (5)
                 26 COMPARE_OP               4 (>)
                 28 POP_JUMP_IF_FALSE       40

      5          30 LOAD_NAME                1 (print)
                 32 LOAD_CONST               3 ('i > 5')
                 34 CALL_FUNCTION            1
                 36 POP_TOP
                 38 JUMP_FORWARD             8 (to 48)

      7     >>   40 LOAD_NAME                1 (print)
                 42 LOAD_CONST               4 ('i = 5')
                 44 CALL_FUNCTION            1
                 46 POP_TOP
            >>   48 LOAD_CONST               5 (None)
                 50 RETURN_VALUE
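The two kinds of jump argument are easy to mix up, so here is the arithmetic for the relative ones, checked against the listings in this post. It is a sketch that assumes the Python 3.8 encoding used here, where every instruction is 2 bytes and arguments are byte offsets; newer versions encode jump arguments differently. relative_jump_target is a helper name I made up.

```python
INSTRUCTION_SIZE = 2  # bytes per instruction in the Python 3.8 listings above

def relative_jump_target(offset, arg):
    # Relative jumps (JUMP_FORWARD, FOR_ITER) count from the NEXT instruction.
    return offset + INSTRUCTION_SIZE + arg

print(relative_jump_target(20, 26))  # JUMP_FORWARD 26 at offset 20 -> 48
print(relative_jump_target(38, 8))   # JUMP_FORWARD 8 at offset 38  -> 48
print(relative_jump_target(12, 38))  # FOR_ITER 38 at offset 12     -> 52
```

POP_JUMP_IF_FALSE and JUMP_ABSOLUTE need no such calculation in 3.8, since their argument is already the target offset.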
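Other instructions
Those are the more commonly used instructions. There are many more that I won't go through one by one; for details see the official documentation of the dis module for Python 3.8.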
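To see which instructions your own interpreter actually has, the dis module exposes the opcode table. A small sketch; the numeric values and the full set of names vary between Python versions.

```python
import dis

# dis.opmap maps instruction names to opcode numbers; dis.opname is the reverse table.
names = sorted(dis.opmap)
print(len(names), "opcodes in this interpreter")

for name in ("LOAD_FAST", "STORE_FAST", "LOAD_GLOBAL"):
    number = dis.opmap[name]
    print(name, number, dis.opname[number])
```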
Afterword
Reading always pays off, and the more the better.
