Archive.Arithcc

A compiler for arithmetic expressions

A formalization of the correctness proof for McCarthy and Painter's compiler from arithmetic expressions to machine language. Their proof is considered the first proof of compiler correctness.

Main definitions

Expr: the syntax of the source language.
value: the semantics of the source language.
Instruction: the syntax of the target language.
step: the semantics of the target language.
compile: the compiler.

Main results

compiler_correctness: the compiler correctness theorem.

Notation

≃[t]/ac: partial equality of two machine states, ignoring the accumulator and all registers x ≥ t.
≃[t]: partial equality of two machine states, ignoring all registers x ≥ t.

References

John McCarthy and James Painter. Correctness of a compiler for arithmetic expressions. In Mathematical Aspects of Computer Science, volume 19 of Proceedings of Symposia in Applied Mathematics. American Mathematical Society, 1967. http://jmc.stanford.edu/articles/mcpain/mcpain.pdf

Tags

compiler

Types

@[reducible, inline]

abbrev Arithcc.Word :

Type

Value type shared by both source and target languages.

Equations

Arithcc.Word = ℕ


@[reducible, inline]

abbrev Arithcc.Identifier :

Type

Variable identifier type in the source language.

Equations

Arithcc.Identifier = String


@[reducible, inline]

abbrev Arithcc.Register :

Type

Register name type in the target language.

Equations

Arithcc.Register = ℕ


theorem Arithcc.Register.lt_succ_self (r : Register) :

r < r + 1

theorem Arithcc.Register.le_of_lt_succ {r₁ r₂ : Register} :

r₁ < r₂ + 1 → r₁ ≤ r₂

Source language

inductive Arithcc.Expr :

Type

An expression in the source language is formed by constants, variables, and sums.

const (v : Word) : Expr
var (x : Identifier) : Expr
sum (s₁ s₂ : Expr) : Expr


instance Arithcc.instInhabitedExpr :

Inhabited Expr

Equations

Arithcc.instInhabitedExpr = { default := Arithcc.Expr.const default }

def Arithcc.value :

Expr → (Identifier → Word) → Word

The semantics of the source language (2.1).

Equations

Arithcc.value (Arithcc.Expr.const v) x✝ = v
Arithcc.value (Arithcc.Expr.var x_2) x✝ = x✝ x_2
Arithcc.value (s₁.sum s₂) x✝ = Arithcc.value s₁ x✝ + Arithcc.value s₂ x✝

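The equations for value can be checked on a small expression. The following is a minimal sketch, not the module's own code: the definitions are restated locally (with Nat in place of ℕ) so that the snippet stands alone, and the namespace ValueSketch and the constant environment are assumptions made for illustration.

```lean
namespace ValueSketch

abbrev Word := Nat
abbrev Identifier := String

inductive Expr
  | const (v : Word) : Expr
  | var (x : Identifier) : Expr
  | sum (s₁ s₂ : Expr) : Expr

-- Local restatement of the source-language semantics.
def value : Expr → (Identifier → Word) → Word
  | Expr.const v, _ => v
  | Expr.var x, ξ => ξ x
  | Expr.sum s₁ s₂, ξ => value s₁ ξ + value s₂ ξ

-- Hypothetical environment: every identifier is bound to 5.
-- Under it, (x + 2) + 3 evaluates to (5 + 2) + 3 = 10.
example :
    value (Expr.sum (Expr.sum (Expr.var "x") (Expr.const 2)) (Expr.const 3))
      (fun _ => 5) = 10 := rfl

end ValueSketch
```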

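Target language

inductive Arithcc.Instruction :

Type

Instructions of the target machine language (3.1--3.7).

li : Word → Instruction
load : Register → Instruction
sto : Register → Instruction
add : Register → Instruction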

instance Arithcc.instInhabitedInstruction :

Inhabited Instruction

Equations

Arithcc.instInhabitedInstruction = { default := Arithcc.Instruction.li default }

structure Arithcc.State :

Type

A machine state consists of an accumulator and a vector of registers.

The paper uses two functions c and a for accessing both the accumulator and registers. For clarity, we make accessing the accumulator explicit and use read/write for registers.

ac : Word
rs : Register → Word


instance Arithcc.instInhabitedState :

Inhabited State

Equations

Arithcc.instInhabitedState = { default := { ac := 0, rs := fun (x : Arithcc.Register) => 0 } }

def Arithcc.read (r : Register) (η : State) :

Word

This is similar to the c function (3.8), but for registers only.

Equations

Arithcc.read r η = η.rs r


def Arithcc.write (r : Register) (v : Word) (η : State) :

State

This is similar to the a function (3.9), but for registers only.

Equations

Arithcc.write r v η = { ac := η.ac, rs := fun (x : Arithcc.Register) => if x = r then v else η.rs x }

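The interaction of read and write can be spelled out as three small facts. This is a sketch under the assumption that the definitions are restated locally as shown (namespace WriteSketch is hypothetical); the statements are illustrative and are not lemmas of the module.

```lean
namespace WriteSketch

abbrev Word := Nat
abbrev Register := Nat

structure State where
  ac : Word
  rs : Register → Word

def read (r : Register) (η : State) : Word := η.rs r

def write (r : Register) (v : Word) (η : State) : State :=
  { η with rs := fun x => if x = r then v else η.rs x }

-- Reading back the register that was just written returns the new value.
example (r : Register) (v : Word) (η : State) : read r (write r v η) = v := by
  simp [read, write]

-- A write to register r is not observed when reading a different register r'.
example (r r' : Register) (v : Word) (η : State) (h : r' ≠ r) :
    read r' (write r v η) = read r' η := by
  simp [read, write, h]

-- A register write leaves the accumulator untouched.
example (r : Register) (v : Word) (η : State) : (write r v η).ac = η.ac := rfl

end WriteSketch
```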

def Arithcc.step :

Instruction → State → State

The semantics of the target language (3.11).

Equations

Arithcc.step (Arithcc.Instruction.li v) x✝ = { ac := v, rs := x✝.rs }
Arithcc.step (Arithcc.Instruction.load r) x✝ = { ac := Arithcc.read r x✝, rs := x✝.rs }
Arithcc.step (Arithcc.Instruction.sto r) x✝ = Arithcc.write r x✝.ac x✝
Arithcc.step (Arithcc.Instruction.add r) x✝ = { ac := Arithcc.read r x✝ + x✝.ac, rs := x✝.rs }


def Arithcc.outcome :

List Instruction → State → State

The resulting machine state of running a target program from a given machine state (3.12).

Equations

Arithcc.outcome [] x✝ = x✝
Arithcc.outcome (i :: is) x✝ = Arithcc.outcome is (Arithcc.step i x✝)

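To see step and outcome at work, one can run a four-instruction program by hand. The sketch below restates the definitions locally, with read and write inlined into step; the names OutcomeSketch, prog, and init are assumptions introduced for this example.

```lean
namespace OutcomeSketch

abbrev Word := Nat
abbrev Register := Nat

inductive Instruction
  | li : Word → Instruction
  | load : Register → Instruction
  | sto : Register → Instruction
  | add : Register → Instruction

structure State where
  ac : Word
  rs : Register → Word

def step : Instruction → State → State
  | Instruction.li v, η => { η with ac := v }
  | Instruction.load r, η => { η with ac := η.rs r }
  | Instruction.sto r, η => { η with rs := fun x => if x = r then η.ac else η.rs x }
  | Instruction.add r, η => { η with ac := η.rs r + η.ac }

def outcome : List Instruction → State → State
  | [], η => η
  | i :: is, η => outcome is (step i η)

-- Hypothetical program: load 2, store it in register 0, load 3, add register 0.
def prog : List Instruction :=
  [Instruction.li 2, Instruction.sto 0, Instruction.li 3, Instruction.add 0]

-- Hypothetical initial state: accumulator and all registers are 0.
def init : State := { ac := 0, rs := fun _ => 0 }

-- The accumulator ends up holding 2 + 3.
example : (outcome prog init).ac = 5 := rfl

-- Register 0 still holds the stored 2.
example : (outcome prog init).rs 0 = 2 := rfl

end OutcomeSketch
```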

@[simp]

theorem Arithcc.outcome_append (p₁ p₂ : List Instruction) (η : State) :

outcome (p₁ ++ p₂) η = outcome p₂ (outcome p₁ η)

A lemma on the concatenation of two programs (3.13).
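One way to prove this lemma is by induction on the first program, generalizing over the state. The sketch below states it over an arbitrary step function (the type variables I and S and the namespace AppendSketch are assumptions made to keep the snippet short); it is not necessarily the proof used in the module.

```lean
namespace AppendSketch

variable {I S : Type} (step : I → S → S)

-- Run a list of instructions from a given state.
def outcome : List I → S → S
  | [], η => η
  | i :: is, η => outcome is (step i η)

-- Running p₁ ++ p₂ is the same as running p₁ and then p₂.
theorem outcome_append (p₁ p₂ : List I) (η : S) :
    outcome step (p₁ ++ p₂) η = outcome step p₂ (outcome step p₁ η) := by
  induction p₁ generalizing η with
  | nil => simp [outcome]
  | cons i is ih => simp [outcome, ih]

end AppendSketch
```

Generalizing over η matters here: in the cons case the induction hypothesis is applied to the state after one step, not to the original state.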

Compiler

def Arithcc.loc (ν : Identifier) (map : Identifier → Register) :

Register

Map a variable in the source expression to a machine register.

Equations

Arithcc.loc ν map = map ν


def Arithcc.compile (map : Identifier → Register) :

Expr → Register → List Instruction

The implementation of the compiler (4.2).

This definition explicitly takes a map from variables to registers.

Equations

Arithcc.compile map (Arithcc.Expr.const v) x✝ = [Arithcc.Instruction.li v]
Arithcc.compile map (Arithcc.Expr.var x_2) x✝ = [Arithcc.Instruction.load (Arithcc.loc x_2 map)]
Arithcc.compile map (s₁.sum s₂) x✝ = Arithcc.compile map s₁ x✝ ++ [Arithcc.Instruction.sto x✝] ++ Arithcc.compile map s₂ (x✝ + 1) ++ [Arithcc.Instruction.add x✝]

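The role of the register argument is easiest to see on a nested sum, where the inner sum must use the next free temporary. The following sketch restates compile locally, with loc inlined as a direct application of map; the namespace CompileSketch and the choice of map are assumptions.

```lean
namespace CompileSketch

abbrev Word := Nat
abbrev Identifier := String
abbrev Register := Nat

inductive Expr
  | const (v : Word) : Expr
  | var (x : Identifier) : Expr
  | sum (s₁ s₂ : Expr) : Expr

inductive Instruction
  | li : Word → Instruction
  | load : Register → Instruction
  | sto : Register → Instruction
  | add : Register → Instruction

def compile (map : Identifier → Register) : Expr → Register → List Instruction
  | Expr.const v, _ => [Instruction.li v]
  | Expr.var x, _ => [Instruction.load (map x)]
  | Expr.sum s₁ s₂, t =>
    compile map s₁ t ++ [Instruction.sto t] ++ compile map s₂ (t + 1) ++ [Instruction.add t]

-- 1 + (2 + 3), compiled with temporaries starting at register 0.
-- The outer sum uses register 0, the inner sum uses register 1.
-- The map is irrelevant here because the expression has no variables.
example :
    compile (fun _ => 0)
      (Expr.sum (Expr.const 1) (Expr.sum (Expr.const 2) (Expr.const 3))) 0 =
      [Instruction.li 1, Instruction.sto 0,
       Instruction.li 2, Instruction.sto 1, Instruction.li 3, Instruction.add 1,
       Instruction.add 0] := rfl

end CompileSketch
```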

Correctness

def Arithcc.StateEqRs (t : Register) (ζ₁ ζ₂ : State) :

Prop

Machine states ζ₁ and ζ₂ are equal except for the accumulator and registers {x | x ≥ t}.

Equations

(ζ₁ ≃[t]/ac ζ₂) = ∀ (r : Arithcc.Register), r < t → ζ₁.rs r = ζ₂.rs r


def Arithcc.«term_≃[_]/ac_» :

Lean.TrailingParserDescr


theorem Arithcc.StateEqRs.refl (t : Register) (ζ : State) :

ζ ≃[t]/ac ζ

theorem Arithcc.StateEqRs.symm {t : Register} (ζ₁ ζ₂ : State) :

ζ₁ ≃[t]/ac ζ₂ → ζ₂ ≃[t]/ac ζ₁

theorem Arithcc.StateEqRs.trans {t : Register} (ζ₁ ζ₂ ζ₃ : State) :

ζ₁ ≃[t]/ac ζ₂ → ζ₂ ≃[t]/ac ζ₃ → ζ₁ ≃[t]/ac ζ₃

def Arithcc.StateEq (t : Register) (ζ₁ ζ₂ : State) :

Prop

Machine states ζ₁ and ζ₂ are equal except for registers {x | x ≥ t}.

Equations

(ζ₁ ≃[t] ζ₂) = (ζ₁.ac = ζ₂.ac ∧ ζ₁ ≃[t]/ac ζ₂)

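As an illustration of what these relations ignore, a write to register t itself cannot be observed by ≃[t]. The sketch below restates the definitions locally and uses the plain names StateEqRs and StateEq instead of the notation; the namespace StateEqSketch and the example statement are assumptions and not lemmas of the module.

```lean
namespace StateEqSketch

abbrev Word := Nat
abbrev Register := Nat

structure State where
  ac : Word
  rs : Register → Word

def write (r : Register) (v : Word) (η : State) : State :=
  { η with rs := fun x => if x = r then v else η.rs x }

-- Agreement on all registers strictly below t.
def StateEqRs (t : Register) (ζ₁ ζ₂ : State) : Prop :=
  ∀ r : Register, r < t → ζ₁.rs r = ζ₂.rs r

-- Agreement on the accumulator and on all registers strictly below t.
def StateEq (t : Register) (ζ₁ ζ₂ : State) : Prop :=
  ζ₁.ac = ζ₂.ac ∧ StateEqRs t ζ₁ ζ₂

-- Writing to register t changes neither the accumulator nor any register below t.
example (t : Register) (v : Word) (ζ : State) : StateEq t ζ (write t v ζ) := by
  unfold StateEq StateEqRs
  refine ⟨rfl, fun r hr => ?_⟩
  have hne : r ≠ t := Nat.ne_of_lt hr
  simp [write, hne]

end StateEqSketch
```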

def Arithcc.«term_≃[_]_» :

Lean.TrailingParserDescr


theorem Arithcc.StateEq.refl (t : Register) (ζ : State) :

ζ ≃[t] ζ

theorem Arithcc.StateEq.symm {t : Register} (ζ₁ ζ₂ : State) :

ζ₁ ≃[t] ζ₂ → ζ₂ ≃[t] ζ₁

theorem Arithcc.StateEq.trans {t : Register} (ζ₁ ζ₂ ζ₃ : State) :

ζ₁ ≃[t] ζ₂ → ζ₂ ≃[t] ζ₃ → ζ₁ ≃[t] ζ₃

instance Arithcc.instTransStateStateEqHAddRegisterOfNat (t : Register) :

Trans (StateEq (t + 1)) (StateEq (t + 1)) (StateEq (t + 1))

Equations

Arithcc.instTransStateStateEqHAddRegisterOfNat t = { trans := ⋯ }

theorem Arithcc.StateEqStateEqRs.trans (t : Register) (ζ₁ ζ₂ ζ₃ : State) :

ζ₁ ≃[t] ζ₂ → ζ₂ ≃[t]/ac ζ₃ → ζ₁ ≃[t]/ac ζ₃

Transitivity of chaining ≃[t] and ≃[t]/ac.

instance Arithcc.instTransStateStateEqHAddRegisterOfNatStateEqRs (t : Register) :

Trans (StateEq (t + 1)) (StateEqRs (t + 1)) (StateEqRs (t + 1))

Equations

Arithcc.instTransStateStateEqHAddRegisterOfNatStateEqRs t = { trans := ⋯ }

theorem Arithcc.stateEq_implies_write_eq {t : Register} {ζ₁ ζ₂ : State} (h : ζ₁ ≃[t] ζ₂) (v : Word) :

write t v ζ₁ ≃[t + 1] write t v ζ₂

If two states are related by ≃[t], then writing the same value to register t in both yields states related by ≃[t + 1].

theorem Arithcc.stateEqRs_implies_write_eq_rs {t : Register} {ζ₁ ζ₂ : State} (h : ζ₁ ≃[t]/ac ζ₂) (r : Register) (v : Word) :

write r v ζ₁ ≃[t]/ac write r v ζ₂

Writing the same value to any register preserves ≃[t]/ac.

theorem Arithcc.write_eq_implies_stateEq {t : Register} {v : Word} {ζ₁ ζ₂ : State} (h : ζ₁ ≃[t + 1] write t v ζ₂) :

ζ₁ ≃[t] ζ₂

If ζ₁ ≃[t + 1] write t v ζ₂, then ζ₁ ≃[t] ζ₂: the write to register t is not visible to ≃[t].

theorem Arithcc.compiler_correctness (map : Identifier → Register) (e : Expr) (ξ : Identifier → Word) (η : State) (t : Register) (hmap : ∀ (x : Identifier), read (loc x map) η = ξ x) (ht : ∀ (x : Identifier), loc x map < t) :

outcome (compile map e t) η ≃[t] { ac := value e ξ, rs := η.rs }

The main compiler correctness theorem.

Unlike Theorem 1 in the paper, both map and the assumption on t are explicit.
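A typical use of the theorem is to read off the accumulator: under the two hypotheses, the compiled program leaves the value of the expression in the accumulator. The sketch below assumes the module can be imported as Archive.Arithcc (for example, from inside a Mathlib checkout) and that ≃[t] unfolds to the conjunction given in the definition of StateEq above.

```lean
import Archive.Arithcc

open Arithcc

-- First component of the ≃[t] conjunction: the accumulators agree.
example (map : Identifier → Register) (e : Expr) (ξ : Identifier → Word)
    (η : State) (t : Register)
    (hmap : ∀ x : Identifier, read (loc x map) η = ξ x)
    (ht : ∀ x : Identifier, loc x map < t) :
    (outcome (compile map e t) η).ac = value e ξ :=
  (compiler_correctness map e ξ η t hmap ht).1
```

The second component states that every register below t is unchanged, so the variables stored there survive the run.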