凯哥stack

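Computer Architecture (2): Code As Data? Data As Code?

Modified Harvard Architecture: Clarifying Confusion (original figure)

Pure von Neumann

In a pure von Neumann architecture, data and instructions share one address space and one memory device, so at any given moment the memory bus is fetching either data or an instruction, never both. As a result, data can be stored inside the program, and instructions can be written into data storage and then executed, with no conflict between the two.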

  • read data from code: YES
  • execute data as code: YES
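A minimal sketch of these two properties, using a toy machine invented purely for illustration (the opcodes, the 16-byte memory, and the names vn_run and vn_demo are all assumptions, not any real ISA). Code and data live in one array, so a STORE can overwrite an opcode and the patched byte is then fetched as an instruction:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical toy ISA: each instruction is two bytes, opcode then operand. */
enum { VN_HALT = 0, VN_LOADI = 1, VN_LOAD = 2, VN_STORE = 3 };
enum { VN_MEM_SIZE = 16 };

/* Run from address 0 in the single shared memory; return the accumulator. */
uint8_t vn_run(uint8_t mem[VN_MEM_SIZE])
{
    uint8_t acc = 0;
    for (unsigned pc = 0; pc + 1 < VN_MEM_SIZE; pc += 2) {
        uint8_t op = mem[pc], arg = mem[pc + 1];
        switch (op) {
        case VN_LOADI: acc = arg; break;                 /* immediate operand */
        case VN_LOAD:  assert(arg < VN_MEM_SIZE);
                       acc = mem[arg]; break;            /* may read code bytes */
        case VN_STORE: assert(arg < VN_MEM_SIZE);
                       mem[arg] = acc; break;            /* may overwrite code */
        default:       return acc;                       /* VN_HALT or unknown */
        }
    }
    return acc;
}

/* With patch != 0 the program rewrites its own HALT at address 6 into LOADI. */
uint8_t vn_demo(int patch)
{
    uint8_t mem[VN_MEM_SIZE] = {
        VN_LOADI, VN_LOADI,  /* 0: acc = the LOADI opcode, an ordinary number */
        VN_STORE, 15,        /* 2: store acc into a spare cell by default */
        VN_LOADI, 0,         /* 4: acc = 0 */
        VN_HALT,  99,        /* 6: HALT, unless patched into "LOADI 99" */
        VN_HALT,  0,         /* 8: final HALT */
    };
    if (patch)
        mem[3] = 6;          /* aim the STORE at the opcode byte at address 6 */
    return vn_run(mem);
}
```

In this sketch vn_demo(0) stops at the first HALT, while vn_demo(1) executes the byte it has just written as an instruction and continues to the second HALT.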

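Pure Harvard

In a pure Harvard architecture, program and data address spaces are separate, so global variables and constant strings cannot be stored in program memory. Immediate operands can still be used in instructions, however: the value assigned to a register can come from the value encoded at the instruction's own address, for example:

Assembly example

MOV A, #10H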

C example

#define MAX_LEN 64
void str_test(char *str, int len) {
    /* MAX_LEN is typically encoded as an immediate operand of an instruction */
    int max = MAX_LEN;
    ...
}

  • read data from code: NO
  • execute data as code: NO
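The same idea can be sketched with a toy Harvard machine, again invented for illustration (hv_insn, hv_run, hv_demo, and the opcodes are hypothetical). Instructions sit in a read-only code array, LOAD and STORE can only address the separate data array, and the only data that comes from program memory is an immediate operand:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy ISA; code and data are two unrelated address spaces. */
enum { HV_HALT = 0, HV_LOADI = 1, HV_LOAD = 2, HV_STORE = 3 };
enum { HV_DATA_SIZE = 16 };

typedef struct { uint8_t op, arg; } hv_insn;

/* Execute code[0..ncode); LOAD/STORE addresses always index data[], never code[]. */
uint8_t hv_run(const hv_insn *code, size_t ncode, uint8_t data[HV_DATA_SIZE])
{
    uint8_t acc = 0;
    for (size_t pc = 0; pc < ncode; pc++) {
        uint8_t arg = code[pc].arg;
        switch (code[pc].op) {
        case HV_LOADI: acc = arg; break;                  /* like MOV A, #10H */
        case HV_LOAD:  assert(arg < HV_DATA_SIZE);
                       acc = data[arg]; break;            /* data space only */
        case HV_STORE: assert(arg < HV_DATA_SIZE);
                       data[arg] = acc; break;            /* data space only */
        default:       return acc;                        /* HV_HALT or unknown */
        }
    }
    return acc;
}

/* Tries the same "patch slot 3" trick; the store lands in data[3] instead. */
uint8_t hv_demo(uint8_t data[HV_DATA_SIZE])
{
    static const hv_insn code[] = {
        { HV_LOADI, HV_LOADI },  /* 0: acc = the LOADI opcode value */
        { HV_STORE, 3 },         /* 1: writes data[3]; code[3] is unreachable */
        { HV_LOADI, 0x10 },      /* 2: acc = 0x10, an immediate from the code */
        { HV_HALT,  99 },        /* 3: still HALT, so execution stops here */
        { HV_HALT,  0 },
    };
    return hv_run(code, sizeof code / sizeof code[0], data);
}
```

Because no instruction can name an address in the code array, the program can neither read its own instructions as data nor turn data into instructions.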

Modified Harvard

instruction memory as data

The first modification: reading data from program memory.

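The following routine performs a BCD conversion after an MCU has read the temperature from a DS18B20 temperature sensor. A lookup table mapping numbers to BCD codes is placed in program memory, so indexing it with a number yields the corresponding BCD code:

BIN_BCD:
MOV DPTR,#TEMP_TAB    ; DPTR = base address of the table in program memory
MOV A,TEMPER_NUM      ; A = binary temperature value, used as the index
MOVC A,@A+DPTR        ; read the byte at DPTR+A from program memory
MOV TEMPER_NUM,A      ; store the BCD result back
RET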

TEMP_TAB:
DB 00H,01H,02H,03H,04H,05H,06H,07H
DB 08H,09H,10H,11H,12H,13H,14H,15H
DB 16H,17H,18H,19H,20H,21H,22H,23H
DB 24H,25H,26H,27H,28H,29H,30H,31H
DB 32H,33H,34H,35H,36H,37H,38H,39H
DB 40H,41H,42H,43H,44H,45H,46H,47H
DB 48H,49H,50H,51H,52H,53H,54H,55H
DB 56H,57H,58H,59H,60H,61H,62H,63H
DB 64H,65H,66H,67H,68H,69H,70H,71H
DB 72H,73H,74H,75H,76H,77H,78H,79H
DB 80H,81H,82H,83H,84H,85H,86H,87H
DB 88H,89H,90H,91H,92H,93H,94H,95H
DB 96H,97H,98H,99H

  • read data from code: YES
  • execute data as code: NO
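For comparison, here is a C sketch of the same table lookup. The names temp_tab, bin_to_bcd_lookup, and bin_to_bcd_calc are made up for this example, and whether a const array actually ends up in program memory depends on the toolchain and target, so treat the placement as an assumption. The arithmetic version shows what each table entry encodes: the tens digit in the high nibble and the ones digit in the low nibble.

```c
#include <assert.h>
#include <stdint.h>

/* Packed BCD for a constant 0..99: tens digit in the high nibble. */
#define BCD(n)  ((uint8_t)((((n) / 10) << 4) | ((n) % 10)))
#define ROW(t)  BCD(t), BCD((t) + 1), BCD((t) + 2), BCD((t) + 3), BCD((t) + 4), \
                BCD((t) + 5), BCD((t) + 6), BCD((t) + 7), BCD((t) + 8), BCD((t) + 9)

/* Same contents as TEMP_TAB; const data is often placed in ROM/flash,
 * but that is a toolchain decision, not something C guarantees. */
static const uint8_t temp_tab[100] = {
    ROW(0),  ROW(10), ROW(20), ROW(30), ROW(40),
    ROW(50), ROW(60), ROW(70), ROW(80), ROW(90),
};

/* Table lookup, the C counterpart of MOVC A,@A+DPTR. */
uint8_t bin_to_bcd_lookup(uint8_t n)
{
    assert(n < 100);
    return temp_tab[n];
}

/* The same conversion computed at run time, without a table. */
uint8_t bin_to_bcd_calc(uint8_t n)
{
    assert(n < 100);
    return (uint8_t)(((n / 10) << 4) | (n % 10));
}
```

For example, bin_to_bcd_lookup(25) yields 0x25, matching the entry 25H in the assembly table.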

DI cache coherence

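The second modification: separate instruction and data caches.

The PowerPC 604, a classic RISC processor, places an explicit requirement on self-modifying code: software must manually keep the data cache and the instruction cache coherent.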

Unless specifically noted, the discussion of coherency in this section applies to the L1 data cache and the L2 and L3 caches. The instruction cache is not snooped. Instruction cache coherency must be maintained by software.
For self-modifying code, the following sequence should be used to synchronize the instruction stream:

  1. dcbst or dcbf (push new code from L1 data cache, L2, and L3 cache out to memory)
  2. sync (wait for the dcbst or dcbf to complete)
  3. icbi (invalidate the old instruction cache entry in this processor and, by broadcasting the icbi to the bus, invalidate the entry in all snooping processors)
  4. sync (wait for the icbi to complete its bus operation)
  5. isync (re-sync this processor’s instruction fetch)
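The sequence above can be wrapped in a helper that covers a whole address range. This is only a sketch: the names cache_lines_spanned and sync_icache are hypothetical, and the 32-byte cache line is an assumption that must be checked against the actual processor. On non-PowerPC targets it falls back to the GCC/Clang builtin __builtin___clear_cache.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE 32u  /* assumption: 32-byte cache lines */

/* Number of cache lines touched by the byte range [addr, addr + len). */
size_t cache_lines_spanned(uintptr_t addr, size_t len)
{
    if (len == 0)
        return 0;
    uintptr_t first = addr & ~(uintptr_t)(CACHE_LINE - 1);
    uintptr_t last  = (addr + len - 1) & ~(uintptr_t)(CACHE_LINE - 1);
    return (size_t)((last - first) / CACHE_LINE) + 1;
}

/* Make freshly written code in [start, start + len) visible to instruction
 * fetch; returns the number of cache lines covered. */
size_t sync_icache(void *start, size_t len)
{
    if (len == 0)
        return 0;
#if defined(__powerpc__) || defined(__PPC__)
    uintptr_t first = (uintptr_t)start & ~(uintptr_t)(CACHE_LINE - 1);
    uintptr_t end   = (uintptr_t)start + len;
    for (uintptr_t p = first; p < end; p += CACHE_LINE)   /* step 1 */
        __asm__ volatile("dcbst 0,%0" : : "r"(p) : "memory");
    __asm__ volatile("sync" : : : "memory");              /* step 2 */
    for (uintptr_t p = first; p < end; p += CACHE_LINE)   /* step 3 */
        __asm__ volatile("icbi 0,%0" : : "r"(p) : "memory");
    __asm__ volatile("sync\n\tisync" : : : "memory");     /* steps 4 and 5 */
#elif defined(__GNUC__)
    /* Compiler builtin; may compile to nothing where hardware keeps
     * the instruction cache coherent. */
    __builtin___clear_cache((char *)start, (char *)start + len);
#endif
    return cache_lines_spanned((uintptr_t)start, len);
}
```

A caller would write the new instructions first and then call sync_icache on the modified range before jumping to it.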
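
for (i = 0; ; i++) {
    if (i > MDF_LOOP && c == 0) {
        //modify_code();
        /* "+r": c keeps its old value (0) if 'li' has been replaced by nop */
        asm("bl next\n"            /* LR = address of the label below */
            "next:\n"
            "mflr 3\n"             /* r3 = address of this mflr */
            "addi 3,3,0x10\n"      /* r3 = address of 'li %0, 1', 4 instructions on */
            "lis 4, 0x6000\n"      /* r4 = 0x60000000, the encoding of nop */
            "stw 4, 0(3)\n"        /* overwrite 'li %0, 1' with nop */
            "li %0, 1"             /* runs only if the stale instruction is fetched */
            : "+r"(c)
            :
            : "3", "4", "lr", "memory");
    }

    if (c)
        printf("code modify failed!\n");
}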

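This code overwrites 'li %0, 1', the instruction that assigns to variable c, with a nop, and then immediately checks whether the modification took effect. As written, the modification can fail, because on the 604 coherence between the instruction and data caches has to be maintained by software. With the following instructions added, it succeeds (code).

/* "+r": c keeps its old value (0) once 'li' has been replaced by nop */
asm("bl next\n"            /* LR = address of the label below */
    "next:\n"
    "mflr 3\n"             /* r3 = address of this mflr */
    "addi 3,3,0x24\n"      /* r3 = address of 'li %0, 1', 9 instructions on */
    "lis 4, 0x6000\n"      /* r4 = 0x60000000, the encoding of nop */
    "stw 4, 0(3)\n"        /* overwrite 'li %0, 1' with nop */
    "dcbst 0,3\n"          /* push the modified line out of the data cache */
    "sync \n"              /* wait for dcbst to complete */
    "icbi 0,3\n"           /* invalidate the stale instruction cache line */
    "sync \n"              /* wait for icbi to complete */
    "isync \n"             /* discard already-fetched instructions */
    "li %0, 1"
    : "+r"(c)
    :
    : "3", "4", "lr", "memory");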

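On x86_64, by contrast, self-modifying code is handled automatically: the hardware keeps the caches coherent with no software intervention, which is a great convenience for software. Testing confirmed that this works: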

A write to a memory location in a code segment that is currently cached in the processor causes the associated cache line (or lines) to be invalidated. This check is based on the physical address of the instruction.
In practice, the check on linear addresses should not create compatibility problems among IA-32 processors. Applications that include self-modifying code use the same linear address for modifying and fetching the instruction.

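for (i = 0; ; i++) {
    if (i > MDF_LOOP && c == 0) {
        //modify_code();
        /* "=r" keeps %0 in a register, so 'movl $1, %0' has the 5-byte form
           (opcode + imm32) that the offset 6 assumes; a legacy register
           without a REX prefix is also assumed */
        asm("lea 0(%%rip), %%rax\n"   /* rax = address of the next instruction */
            "mov $0x0, %%bl\n"        /* 2 bytes: the replacement byte */
            "mov %%bl, 6(%%rax)\n"    /* 3 bytes: patch the byte at rax+6 */
            "movl $1, %0"             /* starts at rax+5; rax+6 is the immediate's low byte */
            : "=r"(c)
            :
            : "memory", "rax", "rbx");
    }

    if (c)
        printf("code modify failed!\n");
}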

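This is the same example: it modifies the instruction that assigns to variable c, but needs no synchronization instructions, and every modification takes effect, which amounts to strong coherence between the instruction and data caches. Seen from outside, the behavior matches that of a von Neumann architecture (code).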

  • read data from code: YES
  • execute data as code: YES (some architectures require manual synchronization instructions)

References:
[1] http://ithare.com/modified-harvard-architecture-clarifying-confusion
[2] https://github.com/kaige86/simplecode

Copyright belongs to the author; reproduction is prohibited.


Published 2020-06-05; last modified 2020-06-07.
