Zulip Chat Archive
Stream: lean4
Topic: Faster Nat-indexed data structure for kernel reduction
Joachim Breitner (Sep 18 2024 at 14:35):
In proofs by reflection (omega
, bv_decide
) one typically has a closed “problem” with Nat indices, and a separate map from these indices to the actual expression atoms appearing in the proof goal. This requires an “denotation” function that relates the problem to the goal, which needs to be evaluated by the kernel.
In bv_decide
the map is a plain list, and @Henrik Böving mentioned that in (maybe artificial) test cases with many variable this becomes the bottleneck. So we briefly investigated this.
The base line is this code, which generates a large manifest list and measures indexing into it:
import Lean
open Lean
def exp : Nat := 8
def list : List Nat := by
run_tac
let goal ← Elab.Tactic.getMainGoal
let l := List.range (2^exp)
goal.assign (toExpr l)
set_option maxRecDepth 2000
set_option trace.profiler true in
def test1 : ∀ n < 2^exp, list.get? n = Option.some n := by decide
This shows 1.835058
for kernel type-checking.
Clearly, a linear list is not a great data structure to index into. So here is a binary tree that should at least theoretically gives us a logarithmic lookup:
That's 0.325068
for kernel checking in
set_option trace.profiler true in
def test2 : ∀ n < 2^exp, tree.lookup? n = Option.some n := by decide
We can shave off some constants by realizing that structural recursion compiles to a complicated construction using brecOn
etc, so if we define a “kernel-reduction friendly lookup”, like so
noncomputable def BinTree.lookup2? (idx : Nat) : BinTree α → Option α :=
BinTree.rec
.none
(fun pivot x t1 t2 ih1 ih2 =>
if idx < pivot then ih1 else
if idx = pivot then x else
ih2)
Then
set_option trace.profiler true in
def test3 : ∀ n < 2^exp, tree.lookup2? n = Option.some n := by decide
gives 0.282465
.
More data points:
2^n | List.get? | Tree.lookup? | Tree.lookup2? |
---|---|---|---|
4 | 0.010372 | 0.010587 | 0.007030 |
5 | 0.031792 | 0.023548 | 0.018365 |
6 | 0.120239 | 0.054061 | 0.038838 |
7 | 0.482878 | 0.134126 | 0.103754 |
8 | 1.827731 | 0.327895 | 0.264725 |
Take-aways:
- Code that we want the kernel to evaluate quickly should still be “good” functional code, and use suitable data structures.
- To get another 20% of speedup, avoid recursive functions and use combinators directly.
So my question into the room is: Do people in practice have omega
or bv_decide
proofs with a larger number of variables where this optimization would be worth the effort? Or is this all realistic cases dwarfed by other costs (e.g. that of producing and checking the certificate)?
I was told that in particular @Alex Keizer @Siddharth Bhat or @Shilpi Goel might be interested :-)
Notification Bot (Sep 18 2024 at 14:56):
18 messages were moved from this topic to #lean4 > Reflection tactics and unwanted reduction by Joachim Breitner.
Mario Carneiro (Sep 18 2024 at 15:42):
Here's a test using a quickly hacked together proof producing version of your test, and it is indeed faster:
import Lean
import Qq
open Lean
def exp : Nat := 8
def list : List Nat := by
run_tac
let goal ← Elab.Tactic.getMainGoal
let l := List.range (2^exp)
goal.assign (toExpr l)
set_option maxRecDepth 2000
theorem list_head {l : List Nat} {a} : (a::l).get? 0 = some a := rfl
theorem list_tail {l : List Nat} {n a b} (h : l.get? n = some a) :
(b::l).get? n.succ = some a := h ▸ rfl
open Qq
def proveList (n : Nat) (l : Q(List Nat)) : MetaM ((a : Q(Nat)) × Q(($l).get? $n = some $a)) := do
let .app (.app _ (a : Q(Nat))) (l' : Q(List Nat)) := l | failure
show MetaM ((b : Q(Nat)) × Q(($a :: $l').get? $n = some $b)) from do
match n with
| 0 => pure ⟨a, q(list_head)⟩
| n+1 =>
let ⟨b, p⟩ ← proveList n l'
pure ⟨b, q(list_tail $p)⟩
theorem forall_zero {P : Nat → Prop} : ∀ a < 0, P a := by simp
theorem forall_succ {P : Nat → Prop} {n} (H1 : ∀ a < n, P a) (H2 : P n) :
∀ a < n.succ, P a := by
intro a h
cases Nat.lt_succ_iff_lt_or_eq.1 h with
| inl h => exact H1 _ h
| inr h => exact h ▸ H2
def proveForall (P : Q(Nat → Prop))
(H : ∀ a : Nat, MetaM Q($P $a)) (n : Nat) : MetaM (Q(∀ n < $n, $P n)) := do
match n with
| 0 => pure q(forall_zero)
| n+1 =>
let H1 ← proveForall P H n
let H2 ← H n
pure q(forall_succ $H1 $H2)
elab "prove_list_tac" : tactic =>
Elab.Tactic.liftMetaTactic1 fun g => do
let x : Q(Prop) ← g.getType'
let ~q(∀ x < $n, List.get? (α := Nat) $l x = some x) := x | failure
let n ← unsafe Meta.evalExpr Nat q(Nat) n
let p ← proveForall q(fun x => List.get? (α := Nat) $l x = some x) (fun x => (·.2) <$> proveList x l) n
g.assign p
pure none
set_option maxRecDepth 2000
set_option trace.profiler true in
def test1 : ∀ n < 2^exp, list.get? n = Option.some n := by decide
-- 3.301 s
set_option trace.profiler true in
def test2 : ∀ n < 2^exp, list.get? n = Option.some n := by unfold list; prove_list_tac
-- 0.928 s
Mario Carneiro (Sep 18 2024 at 15:43):
you can of course do much better if you exploit structural properties of List.range
here, but this is presumably not the point of the test. This is mimicking the quadratic behavior of the original test
Daniel Weber (Sep 18 2024 at 15:44):
You can also try something like List.forall_ind from here, although I guess that's also not the point of the test
Joachim Breitner (Sep 18 2024 at 15:50):
Hmm, still not quite my point, but interesting indeed!
And this is really creating a proof terms with approx 2^exp * 2^exp/2 constructor applications, one for each step in the function reduction?
Not sure if I should be sad about the reduction performance or happy about the type checking performance :-)
Last updated: May 02 2025 at 03:31 UTC