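This post covers a lot of ground.
First, two functions need attention: inside-out/aux and inside-out. Until now, inside-out/aux has never handled progn, so that has to be added first. inside-out, meanwhile, can be optimized so that it no longer wraps the result in progn when there is only a single expression. The revised inside-out/aux and inside-out are as follows: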
    (defun inside-out/aux (expr result)
      "Turn the nested expression EXPR inside out."
      (check-type expr list)
      (cond ((member (first expr) '(+ - * / _exit > exit))
             (let ((operands '()))
               ;; Recursively turn every expression in the argument list inside out.
               (dolist (arg (rest expr))
                 (if (listp arg)
                     (let ((var (gensym)))
                       (setf result (inside-out/aux arg result))
                       (let ((val (pop result)))
                         (push `(setq ,var ,val) result)
                         (push var operands)))
                     (push arg operands)))
               (push (cons (first expr) (nreverse operands)) result)
               result))
            ((eq (first expr) 'if)
             (push `(if ,(inside-out (second expr))
                        ,(inside-out (third expr))
                        ,(inside-out (fourth expr)))
                   result)
             result)
            ((eq (first expr) 'progn)
             (dolist (e (rest expr))
               (push (inside-out e) result))
             result)
            (t
             (push expr result)
             result)))

    (defun inside-out (expr)
      (let ((forms (nreverse (inside-out/aux expr '()))))
        (if (> (length forms) 1)
            (cons 'progn forms)
            (car forms))))
In fact, this can go one step further: inside-out/aux and inside-out can simply be merged into one function, with the following result:
    (defun inside-out (expr)
      "Turn the nested expression EXPR inside out."
      (check-type expr list)
      (cond ((eq (first expr) 'if)
             `(if ,(inside-out (second expr))
                  ,(inside-out (third expr))
                  ,(inside-out (fourth expr))))
            ((eq (first expr) 'progn)
             (cons 'progn (mapcar #'inside-out (rest expr))))
            (t
             (let ((assignments '())
                   (operands '()))
               (dolist (arg (rest expr))
                 (if (listp arg)
                     (let ((val (inside-out arg))
                           (var (gensym)))
                       (push `(setq ,var ,val) assignments)
                       (push var operands))
                     (push arg operands)))
               (if (null assignments)
                   expr
                   `(progn
                      ,@(nreverse assignments)
                      (,(first expr) ,@(nreverse operands))))))))
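To make the shape of the output easier to see, here is a small standalone sketch of the same flattening idea for plain function calls. It is not the function above: flatten-call, fresh-temp, and *temp-counter* are hypothetical names, and a counter replaces gensym only so that the result is predictable.

```lisp
;; Hypothetical helper: a counter-based temp namer, used instead of GENSYM
;; purely so the output below is reproducible.
(defvar *temp-counter* 0)

(defun fresh-temp ()
  "Return a new symbol T1, T2, ... interned in the current package."
  (intern (format nil "T~D" (incf *temp-counter*))))

(defun flatten-call (expr)
  "Hoist every nested call in EXPR into a SETQ on a temporary variable."
  (let ((assignments '())
        (operands '()))
    (dolist (arg (rest expr))
      (if (consp arg)
          (let ((var (fresh-temp)))
            ;; The nested call is flattened first, then assigned to VAR.
            (push `(setq ,var ,(flatten-call arg)) assignments)
            (push var operands))
          (push arg operands)))
    (if (null assignments)
        expr
        `(progn ,@(nreverse assignments)
                (,(first expr) ,@(nreverse operands))))))

;; Rebinding the counter keeps the temp names stable for this call.
(let ((*temp-counter* 0))
  (flatten-call '(+ 1 (* 2 3))))
;; => (PROGN (SETQ T1 (* 2 3)) (+ 1 T1))
```

An expression with no nested calls, such as (+ 1 2), comes back unchanged, which is the same "no needless progn" behavior the merged inside-out aims for.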
With that out of the way, here comes the real focus of this post: how to compile every function-call expression.
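Although I boasted above about compiling "every" function-call expression, I cannot actually do that yet. All I can do is map every function-call expression to a call to a function in the C standard library. So, to call putchar from the C standard library, you have to write the following code: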
    (|_putchar| #.(char-code #\A))
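The vertical-bar syntax ensures that the symbol's symbol-name is the all-lowercase _putchar. The leading underscore is there because on macOS the symbol for a C function carries this underscore prefix. #. is a Common Lisp reader macro that evaluates the following expression at read time, so I do not have to write out the code point of the letter A by hand. OK, I admit it, I am showing off.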
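Both reader tricks can be checked at the REPL. The sketch below assumes the standard readtable (which upcases unescaped symbol names); *call-form* is a name made up for this illustration.

```lisp
;; The form is read once; #. has already replaced (char-code #\A) with a number.
(defparameter *call-form* '(|_putchar| #.(char-code #\A)))

;; With bars, the case of the name is preserved.
(symbol-name (first *call-form*))   ; "_putchar"

;; Without bars, the standard readtable upcases the name.
(symbol-name '_putchar)             ; "_PUTCHAR"

;; The second element is a plain integer, not a CHAR-CODE call
;; (65 on implementations with ASCII-compatible character codes).
(integerp (second *call-form*))     ; T
```

The lowercase name matters because the symbol's name is what ends up as the call target in the generated assembly.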
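Compiling this kind of function-call expression only takes imitating the earlier handling of _exit. First, evaluate each argument of the call expression, then put the values in the right places: some go into registers, others are pushed onto the stack. As an amateur compiler hobbyist, I have of course never properly read the ABI manuals published by Intel or AMD. What I read is this: https://www3.nd.edu/~dthain/courses/cse40243/fall2015/intel-intro.html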
So here is what I learned:
- The first six arguments go, from left to right, into the registers RDI, RSI, RDX, RCX, R8, and R9;
- all remaining arguments are pushed onto the stack.
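The two rules above can be sketched as a small function. This is only an illustration under my own assumptions: place-arguments and *argument-registers* are hypothetical names, operands are passed through as-is, and it covers integer arguments only. One detail worth noting is that, as far as I understand the System V AMD64 convention, the stack arguments are pushed from right to left, so the sketch emits them in reverse order.

```lisp
;; Registers for the first six integer arguments, in left-to-right order.
(defparameter *argument-registers* '(%rdi %rsi %rdx %rcx %r8 %r9))

(defun place-arguments (operands)
  "Return instructions that put OPERANDS into registers or onto the stack."
  (let ((register-moves '())
        (stack-pushes '()))
    (loop for operand in operands
          for i from 0
          for register = (nth i *argument-registers*)
          do (if register
                 (push `(movq ,operand ,register) register-moves)
                 ;; PUSH prepends, so STACK-PUSHES ends up with the
                 ;; last argument first, i.e. right-to-left push order.
                 (push `(pushq ,operand) stack-pushes)))
    (append stack-pushes (nreverse register-moves))))

(place-arguments '(1 2 3 4 5 6 7 8))
;; => ((PUSHQ 8) (PUSHQ 7) (MOVQ 1 %RDI) (MOVQ 2 %RSI) (MOVQ 3 %RDX)
;;     (MOVQ 4 %RCX) (MOVQ 5 %R8) (MOVQ 6 %R9))
```

With six or fewer arguments, which is all the examples in this post need, no pushq is emitted and the order question never comes up.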
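Then, because of a macOS requirement, the RSP register must be aligned to a 16-byte boundary before the call. I struggled with this for a long time before finally realizing that I have to restore the modified RSP register after the function call returns _(:з」∠)_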
So the code for this part looks like this (trimmed down a bit):
    (defun jjcc2 (expr globals)
      "A compiler supporting the four arithmetic operations on two numbers."
      (check-type globals hash-table)
      (cond (t
             (let ((instructions '())
                   (registers '(%rdi %rsi %rdx %rcx %r8 %r9)))
               (dotimes (i (length (rest expr)))
                 (if (nth i registers)
                     (push `(movq ,(get-operand expr i) ,(nth i registers)) instructions)
                     (push `(pushq ,(get-operand expr i)) instructions)))
               `(,@(nreverse instructions)
                 (pushq %rsp)
                 (and ,(format nil "$0x~X" #XFFFFFFFFFFFFFFF0) %rsp)
                 (call ,(first expr))
                 (popq %rsp))))))
pushq first saves RSP, and once the call instruction returns, popq takes it back out to restore it XD
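One caveat about this save/restore sequence is easier to see with a tiny simulation. The sketch below is my own model, not part of the compiler: rsp-after-sequence and align-down-16 are hypothetical names, memory is a hash table, and a slot that was never written reads back as :garbage.

```lisp
(defun align-down-16 (address)
  "Clear the low four bits, like AND $0xFFFFFFFFFFFFFFF0, %RSP."
  (logand address #xFFFFFFFFFFFFFFF0))

(defun rsp-after-sequence (rsp)
  "Simulate pushq %rsp / and / call / popq %rsp and return the final RSP."
  (let ((memory (make-hash-table))
        (original rsp))
    ;; pushq %rsp: store the old RSP value 8 bytes below it.
    (decf rsp 8)
    (setf (gethash rsp memory) original)
    ;; and $-16, %rsp: may or may not move RSP down by another 8 bytes.
    (setf rsp (align-down-16 rsp))
    ;; call pushes a return address and ret pops it, so RSP is unchanged here.
    ;; popq %rsp: load RSP from whatever sits at the current top of the stack.
    (gethash rsp memory :garbage)))

(rsp-after-sequence #x7FF8)   ; => 32760 (#x7FF8), restored correctly
(rsp-after-sequence #x7FF0)   ; => :GARBAGE, the saved value is 8 bytes higher
```

At the entry of _main, RSP is 8 bytes off a 16-byte boundary, so after the pushq it is already aligned, the and changes nothing, and every call in a straight-line program like Hello World behaves as in the first case. If RSP ever starts out aligned, the second case applies. A variant that does not depend on the starting alignment saves the value twice and reloads it by offset: pushq %rsp; pushq (%rsp); and $-16, %rsp; call; movq 8(%rsp), %rsp.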
At this point, the classic Hello World can be written. The code is as follows:
    (fb `(progn
           ,@(mapcar #'(lambda (c)
                         `(|_putchar| ,(char-code c)))
                     (coerce "Hello, world!" 'list))
           (_exit 0)))
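The backquoted form simply builds one _putchar call per character before anything is compiled. A minimal sketch of just that step, using a shorter string (putchar-forms is a hypothetical helper name, and the numbers assume ASCII-compatible character codes):

```lisp
(defun putchar-forms (string)
  "Return one (|_putchar| code) form for each character of STRING."
  (mapcar #'(lambda (c)
              `(|_putchar| ,(char-code c)))
          (coerce string 'list)))

(putchar-forms "Hi")
;; => ((|_putchar| 72) (|_putchar| 105))
```

Splicing such a list into progn with ,@ and appending (_exit 0) gives the whole program as a single expression.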
The generated assembly is as follows:
    .data
    .section __TEXT,__text,regular,pure_instructions
    .globl _main
    _main:
        MOVQ $72, %RDI
        PUSHQ %RSP
        AND $0xFFFFFFFFFFFFFFF0, %RSP
        CALL _putchar
        POPQ %RSP
        MOVQ $101, %RDI
        PUSHQ %RSP
        AND $0xFFFFFFFFFFFFFFF0, %RSP
        CALL _putchar
        POPQ %RSP
        MOVQ $108, %RDI
        PUSHQ %RSP
        AND $0xFFFFFFFFFFFFFFF0, %RSP
        CALL _putchar
        POPQ %RSP
        MOVQ $108, %RDI
        PUSHQ %RSP
        AND $0xFFFFFFFFFFFFFFF0, %RSP
        CALL _putchar
        POPQ %RSP
        MOVQ $111, %RDI
        PUSHQ %RSP
        AND $0xFFFFFFFFFFFFFFF0, %RSP
        CALL _putchar
        POPQ %RSP
        MOVQ $44, %RDI
        PUSHQ %RSP
        AND $0xFFFFFFFFFFFFFFF0, %RSP
        CALL _putchar
        POPQ %RSP
        MOVQ $32, %RDI
        PUSHQ %RSP
        AND $0xFFFFFFFFFFFFFFF0, %RSP
        CALL _putchar
        POPQ %RSP
        MOVQ $119, %RDI
        PUSHQ %RSP
        AND $0xFFFFFFFFFFFFFFF0, %RSP
        CALL _putchar
        POPQ %RSP
        MOVQ $111, %RDI
        PUSHQ %RSP
        AND $0xFFFFFFFFFFFFFFF0, %RSP
        CALL _putchar
        POPQ %RSP
        MOVQ $114, %RDI
        PUSHQ %RSP
        AND $0xFFFFFFFFFFFFFFF0, %RSP
        CALL _putchar
        POPQ %RSP
        MOVQ $108, %RDI
        PUSHQ %RSP
        AND $0xFFFFFFFFFFFFFFF0, %RSP
        CALL _putchar
        POPQ %RSP
        MOVQ $100, %RDI
        PUSHQ %RSP
        AND $0xFFFFFFFFFFFFFFF0, %RSP
        CALL _putchar
        POPQ %RSP
        MOVQ $33, %RDI
        PUSHQ %RSP
        AND $0xFFFFFFFFFFFFFFF0, %RSP
        CALL _putchar
        POPQ %RSP
        MOVL $0, %EDI
        AND $0xFFFFFFFFFFFFFFF0, %RSP
        CALL _exit
Assemble the code above with GAS, link it with gcc, and run the result to see Hello, world!
The end.