Adding unary operations
Based on these notes by Ben Lerner. Shortened/simplified a bit, changed the concrete syntax, and cut in two separate notes.
1 Recap and refactoring
Last time, we considered the following miniscule language:
Our abstract syntax was essentially simply
type expr = int64
and our compiler simply placed that integer in the appropriate place in the assembly. But let’s clean up that code somewhat: for a given number (let’s say 4410), we generated the following assembly:
section .text
global our_code_starts_here
our_code_starts_here:
mov RAX, 4410
ret
Of all of that code, only one line corresponds to our input program – the rest is scaffolding. Let’s refactor our compiler into two pieces, as follows:
type reg =
| RAX (* the register where we place answers *)
type arg =
| Const of int64 (* explicit numeric constants *)
| Reg of reg (* any named register *)
type instruction =
| IMov of arg * arg (* Move the value of the right-side arg into the left-arg *)
let asm_to_string (asm : instruction list) : string =
(* do something to get a string of assembly *)
(* compile_expr is responsible for compiling just a single expression,
and does not care about the surrounding scaffolding *)
let compile_expr (e : expr) : instruction list =
[ IMov(Reg(RAX), Const(e)) ]
;;
(* compile_prog surrounds a compiled program by whatever scaffolding is needed *)
let compile_prog (e : expr) : string =
(* compile the program *)
let instrs = compile_expr e in
(* convert it to a textual form *)
let asm_string = asm_to_string instrs in
(* surround it with the necessary scaffolding *)
let prelude = "
section .text
global our_code_starts_here
our_code_starts_here:" in
let suffix = "ret" in
prelude ^ "\n" ^ asm_string ^ "\n" ^ suffix
;;
This is a bit more code than we previously had, but it’s much more usefully
organized: compile_prog
isn’t going to change 1For a little while!
The details of this function will get more elaborate, and we’ll actually wrap
this function in a larger pipeline, but the overall signature and purpose of
the function will remain unchanged., and compile_expr
will simply grow to
accomodate more elaborate expression forms.
2 Growing the language: adding (and subtracting) 1
Every time we enhance our source language, we need to consider several things:
Its impact on the concrete syntax of the language
Examples using the new enhancements, so we build intuition of them
Its impact on the abstract syntax and semantics of the language
Any new or changed transformations needed to process the new forms
Executable tests to confirm the enhancement works as intended
Let’s start by adding increment and decrement operations to our language.
2.1 The new concrete syntax
2.2 Examples
These are not just example programs in the new language, but pairs of example programs and their intended behavior:
Concrete Syntax |
| Answer |
|
|
|
|
|
|
|
|
|
|
|
|
2.3 Enhancing the abstract syntax
Based on the examples above, the semantics for add1
and sub1
should be fairly obvious: they evaluate their argument to a number, and add or
subtract one from it.
The syntax is likewise trivially extended:
type expr =
| Num of int64
| Add1 of expr
| Sub1 of expr
2.4 Enhancing the transformations
To compile addition and subtraction, we need to enhance our knowledge of
assembly. We’ll introduce one new instruction: add <dest>, <val>
will
increment the destination by the right-side value. (This mutates the
destination, so if we still need the old value, we’ll need to have saved it
somewhere else, first.) We’ll correspondingly enhance our definition of
instruction
to represent this new form:
type instruction = ...
| IAdd of arg * arg (* Increment the left-hand arg by the value of the right-hand arg *)
Do Now!
Given this new instruction, work out the desired assembly for the examples above.
Let’s consider the second example: (add1 42)
. To compile this, we
should load 42 into RAX
, and then add 1 to it. Or in symbols,
mov RAX, 42
add RAX, 1
The last example is similar: given
~hl:4:s~(sub1 (~hl:3:s~add1 (~hl:2:s~add1 ~hl:1:s~42~hl:1:e~)~hl:2:e~)~hl:3:e~)~hl:4:e~
,
we want to
load 42, then add 1 to it, then add 1 to that, then subtract 1 from that
result. We currently only have add
, though, so we’ll add -1 instead of subtracting:
~hl:1:s~mov RAX, 42~hl:1:e~
~hl:2:s~add RAX, 1~hl:2:e~
~hl:3:s~add RAX, 1~hl:3:e~
~hl:4:s~add RAX, -1~hl:4:e~
Notice how each piece of the input program corresponds to a related piece of the output assembly.
Our compile_expr
function now looks like this:
let rec compile_expr (e : expr) : instruction list =
match e with
| Num n -> [ IMov(Reg(RAX), Const(n)) ]
| Add1 e -> (* ??? *)
| Sub1 e -> (* ??? *)
Do Now!
Try to complete this scaffolding yourself.
The key observation in the hand-written assembly above is that our translations
are compositional, that is, they recur on their subpieces, and a
translation of a composite expression is simply a function of the translations
of its pieces. Moreover, we know that constants always wind up in RAX
,
and add1
mutates in place, which means that our answers will always be in
RAX
as desired. So our compiler for this language is
let rec compile_expr (e : expr) : instruction list =
match e with
| Num n -> [ IMov(Reg(RAX), Const(n)) ]
| Add1 e -> (compile_expr e) @ [ IAdd(Reg(RAX), Const(1L)) ]
| Sub1 e -> (compile_expr e) @ [ IAdd(Reg(RAX), Const(-1L)) ]
2.5 Testing
Do Now!
Run the given source programs through our compiler pipeline. It should give us exactly the handwritten assembly we intend. If not, debug the compiler until it does.
Exercise
Extend this language with a new operation:
(double expr)
should produce twice the value of the inner expression. Go through the five stages above: concrete syntax, examples, abstract syntax, transformation, and tests. Do we need any new features of the compiler pipeline, or of assembly, in order to achieve this? What if the operation were(halve expr)
instead?
Exercise
There are other assembly instructions we could have added to our output language. The
sub
instruction is the counterpart toadd
, but performs subtraction instead. Theinc
anddec
instructions specifically add or subtract1
. Enhance our definition ofinstruction
to include one or more of these new instructions, and modifycompile_expr
(and any other functions necessary) to take advantage of them.
1For a little while! The details of this function will get more elaborate, and we’ll actually wrap this function in a larger pipeline, but the overall signature and purpose of the function will remain unchanged.