A compiler for arithmetic expressions #
THIS FILE IS SYNCHRONIZED WITH MATHLIB4. Any changes to this file require a corresponding PR to mathlib4.
A formalization of the correctness of a compiler from arithmetic expressions to machine language described by McCarthy and Painter, which is considered the first proof of compiler correctness.
Main definitions #
expr
: the syntax of the source language.value
: the semantics of the source language.instruction
: the syntax of the target language.step
: the semantics of the target language.compile
: the compiler.
Main results #
compiler_correctness
: the compiler correctness theorem.
Notation #
≃[t]/ac
: partial equality of two machine states excluding registers x ≥ t and the accumulator.≃[t]
: partial equality of two machine states excluding registers x ≥ t.
References #
- John McCarthy and James Painter. Correctness of a compiler for arithmetic expressions. In Mathematical Aspects of Computer Science, volume 19 of Proceedings of Symposia in Applied Mathematics. American Mathematical Society, 1967. http://jmc.stanford.edu/articles/mcpain/mcpain.pdf
Tags #
compiler
Types #
Value type shared by both source and target languages.
Equations
Variable identifier type in the source language.
Equations
Register name type in the target language.
Equations
Source language #
- const : arithcc.word → arithcc.expr
- var : arithcc.identifier → arithcc.expr
- sum : arithcc.expr → arithcc.expr → arithcc.expr
An expression in the source language is formed by constants, variables, and sums.
Instances for arithcc.expr
- arithcc.expr.has_sizeof_inst
- arithcc.expr.inhabited
The semantics of the source language (2.1).
Equations
- arithcc.value (s₁.sum s₂) ξ = arithcc.value s₁ ξ + arithcc.value s₂ ξ
- arithcc.value (arithcc.expr.var x) ξ = ξ x
- arithcc.value (arithcc.expr.const v) _x = v
Target language #
- li : arithcc.word → arithcc.instruction
- load : arithcc.register → arithcc.instruction
- sto : arithcc.register → arithcc.instruction
- add : arithcc.register → arithcc.instruction
Instructions of the target machine language (3.1--3.7).
Instances for arithcc.instruction
- arithcc.instruction.has_sizeof_inst
- arithcc.instruction.inhabited
- ac : arithcc.word
- rs : arithcc.register → arithcc.word
Machine state consists of the accumulator and a vector of registers.
The paper uses two functions c
and a
for accessing both the accumulator and registers.
For clarity, we make accessing the accumulator explicit and use read
/write
for registers.
Instances for arithcc.state
- arithcc.state.has_sizeof_inst
- arithcc.state.inhabited
Equations
- arithcc.state.inhabited = {default := {ac := 0, rs := λ (x : arithcc.register), 0}}
This is similar to the c
function (3.8), but for registers only.
Equations
- arithcc.read r η = η.rs r
This is similar to the a
function (3.9), but for registers only.
Equations
- arithcc.write r v η = {ac := η.ac, rs := λ (x : arithcc.register), ite (x = r) v (η.rs x)}
The semantics of the target language (3.11).
Equations
- arithcc.step (arithcc.instruction.add r) η = {ac := arithcc.read r η + η.ac, rs := η.rs}
- arithcc.step (arithcc.instruction.sto r) η = arithcc.write r η.ac η
- arithcc.step (arithcc.instruction.load r) η = {ac := arithcc.read r η, rs := η.rs}
- arithcc.step (arithcc.instruction.li v) η = {ac := v, rs := η.rs}
The resulting machine state of running a target program from a given machine state (3.12).
Equations
- arithcc.outcome (i :: is) η = arithcc.outcome is (arithcc.step i η)
- arithcc.outcome list.nil η = η
A lemma on the concatenation of two programs (3.13).
Compiler #
Map a variable in the source expression to a machine register.
Equations
- arithcc.loc ν map = map ν
The implementation of the compiler (4.2).
This definition explicitly takes a map from variables to registers.
Equations
- arithcc.compile map (s₁.sum s₂) t = arithcc.compile map s₁ t ++ [arithcc.instruction.sto t] ++ arithcc.compile map s₂ (t + 1) ++ [arithcc.instruction.add t]
- arithcc.compile map (arithcc.expr.var x) _x = [arithcc.instruction.load (arithcc.loc x map)]
- arithcc.compile map (arithcc.expr.const v) _x = [arithcc.instruction.li v]
Correctness #
Machine states ζ₁ and ζ₂ are equal except for the accumulator and registers {x | x ≥ t}.
Machine states ζ₁ and ζ₂ are equal except for registers {x | x ≥ t}.
Transitivity of chaining ≃[t]
and ≃[t]/ac
.
Writing the same value to register t
gives ≃[t + 1]
from ≃[t]
.
Writing the same value to any register preserves ≃[t]/ac
.
≃[t + 1]
with writing to register t
implies ≃[t]
.
The main compiler correctness theorem.
Unlike Theorem 1 in the paper, both map
and the assumption on t
are explicit.