Zulip Chat Archive

Stream: new members

Topic: vm check failed

Bhavik Mehta (Jun 28 2020 at 00:14):

import data.list.range

def burgerfy (n : ℕ) : string :=
begin
  apply nat.strong_rec_on n,
  intros k h,
  apply "{" ++ string.intercalate "," (list.map _ (list.fin_range k)) ++ "}",
  intro m,
  apply h m.1 m.2,
end

#eval (burgerfy 1)
#eval (burgerfy 2)

The first works and the second fails with the error message "vm check failed: is_closure(o) (possibly due to incorrect axioms, or sorry)" but I'm not using sorry and only using mathlib/core functions, why is this?

Bhavik Mehta (Jun 28 2020 at 00:16):

Also, this is a def which isn't noncomputable, so I expected it to compute fine

Mario Carneiro (Jun 28 2020 at 03:12):

There appears to be a bug in the VM implementation of nat.strong_rec_on:

#eval @nat.strong_rec_on (λ _, ℕ) 0 (λ n ih, if h : n - 1 < n then ih _ h + 1 else 0) -- 0
#eval @nat.strong_rec_on (λ _, ℕ) 1 (λ n ih, if h : n - 1 < n then ih _ h + 1 else 0) -- 1
#eval @nat.strong_rec_on (λ _, ℕ) 2 (λ n ih, if h : n - 1 < n then ih _ h + 1 else 0) -- 2
#eval @nat.strong_rec_on (λ _, ℕ) 3 (λ n ih, if h : n - 1 < n then ih _ h + 1 else 0) -- vm check failed: cidx(closure) == 0 (possibly due to incorrect axioms, or sorry

Bhavik Mehta (Jun 28 2020 at 03:13):

yay I found a vm bug

Mario Carneiro (Jun 28 2020 at 03:16):

A workaround is to not use strong_rec_on to write wf definitions for the VM

Mario Carneiro (Jun 28 2020 at 03:16):

def burgerfy : ℕ → string
| n := "{" ++ string.intercalate "," ((list.fin_range n).map (λ ⟨m, h⟩, burgerfy m)) ++ "}"

This generally performs better anyway

Mario Carneiro (Jun 28 2020 at 03:17):

But this is the first time I have ever seen a VM bug so congrats

Scott Olson (Jun 28 2020 at 03:18):

Do you know if the VM has a specialized version of nat.strong_rec_on, or does it compile its Lean definition like usual?

Mario Carneiro (Jun 28 2020 at 03:18):

I think it just compiles the regular definition

Scott Olson (Jun 28 2020 at 03:18):

I might try shrinking it to find the cause

Bhavik Mehta (Jun 28 2020 at 03:18):

Oh that's the version I was looking for! I tried using the equation compiler but couldn't satisfy it, so went straight to strong_rec_on in tactic mode

Mario Carneiro (Jun 28 2020 at 03:20):

universe u
open nat
def strong_rec_on₂ {p : nat → Sort u} (n : nat) (h : ∀ n, (∀ m, m < n → p m) → p n) : p n :=
suffices ∀ n m, m < n → p m, from this (succ n) n (lt_succ_self _),
begin
  intros n, induction n with n ih,
    {intros m h₁, exact absurd h₁ (not_lt_zero _)},
    {intros m h₁,
      apply or.by_cases (lt_or_eq_of_le (le_of_lt_succ h₁)),
        {intros, apply ih, assumption},
        {intros, subst m, apply h _ ih}}
end

#eval @strong_rec_on₂ (λ _, ℕ) 3 (λ n ih, if h : n - 1 < n then ih _ h + 1 else 0) -- vm check failed: cidx(closure) == 0 (possibly due to incorrect axioms, or sorry

still fails

Bhavik Mehta (Jun 28 2020 at 03:25):

Huh, I'm surprised there's a difference between

import data.list.range

def burgerfy : ℕ → string
| 0 := "nothing burger"
| n := string.intercalate " and " ((list.fin_range n).map (λ ⟨m, h⟩, burgerfy m)) ++ " burger"

and

import data.list.range

def burgerfy : ℕ → string
| 0 := "nothing burger"
| n := string.intercalate " and " ((list.fin_range n).map (λ ⟨m, _⟩, burgerfy m)) ++ " burger"

Mario Carneiro (Jun 28 2020 at 03:36):

That's because default_dec_tac automatically clears hypotheses with "internal" names (meaning initial underscore)

Scott Olson (Jun 28 2020 at 03:39):

After picking apart nat.strong_rec_on a bit, I reached a step where I replaced a nat.rec with a pattern match and it seemingly fixed it. I'll try to investigate the difference

Scott Olson (Jun 28 2020 at 03:40):

Hmm, the pattern matching elaborator generates much more complicated code than a simple nat.rec.

Mario Carneiro (Jun 28 2020 at 03:45):

def bar (n : ℕ) (ih : ∀ m, m < n → ℕ) : ℕ := if h : n - 1 < n then ih _ h + 1 else 0
def foo : ∀ n m : ℕ, m < n → nat
| 0 := λ m h₁, absurd h₁ (nat.not_lt_zero _)
| (n+1) := λ m h₁, if h : m < n then foo _ _ h else bar n (foo _)

def foo' (n) : ∀ m : ℕ, m < n → nat :=
nat.rec_on n (λ m h₁, absurd h₁ (nat.not_lt_zero _))
  (λ n IH m h₁, if h : m < n then IH _ h else bar n IH)

#eval @foo 3 2 dec_trivial -- 2
#eval @foo' 3 2 dec_trivial -- vm check failed: cidx(closure) == 0 (possibly due to incorrect axioms, or sorry

Scott Olson (Jun 28 2020 at 03:46):

Nice

Mario Carneiro (Jun 28 2020 at 03:46):

The VM has special support for definitions using the equation compiler

Mario Carneiro (Jun 28 2020 at 03:47):

But in any case it's different code so it's not particularly surprising that it doesn't behave the same; there probably isn't too much point in examining the pattern match code to find out why it works

Scott Olson (Jun 28 2020 at 03:47):

Yeah, I had switched to simplifying the nat.rec code instead, but you came up with something simpler anyway

Mario Carneiro (Jun 28 2020 at 03:48):

I would be surprised if there was a difference between nat.rec and nat.rec_on here; they both get special support

Mario Carneiro (Jun 28 2020 at 03:52):

def bar (n : ℕ) (IH : ℕ → ℕ) : ℕ := IH n + 1
def foo (n) : ℕ → ℕ := nat.rec_on n (λ m, 0) (λ n IH m, bar n IH)
def foo' (n) : ℕ → ℕ := nat.rec_on n (λ m, 0) (λ n IH m, IH n + 1)

#eval @foo 3 2 -- vm check failed: cidx(closure) == 0 (possibly due to incorrect axioms, or sorry
#eval @foo' 3 2 -- 3

Mario Carneiro (Jun 28 2020 at 04:03):

def bar (n : ℕ) (IH : ℕ → ℕ) : ℕ := IH n + 1
set_option trace.compiler.optimize_bytecode true
def foo (n) : ℕ → ℕ := nat.rec_on n (λ m, 0) (λ n IH m, bar n IH)

[compiler.optimize_bytecode]  foo._rec_1 1
0: move 0
1: nat_cases 4
2: scnstr #0
3: ret
4: push 1
5: ginvoke foo._rec_1
6: move 1
7: ginvoke bar
8: drop 1
9: ret
[compiler.optimize_bytecode]  foo 2
0: move 1
1: ginvoke foo._rec_1
2: ret

Mario Carneiro (Jun 28 2020 at 04:15):

This compilation is a bit odd. As a primer for those who don't know, the main function foo 2 here indicates that foo is a function with two arguments. It calls move 1, pushing argument 1 (I think this is n) on the stack, and then calls foo._rec_1, which is the recursive function that is constructed as part of nat.rec_on. This could normally be a closure, depending on arguments outside the nat.rec_on, but in this case we don't reference any external variables so foo._rec_1 has only one argument, the index. You can think of it as being defined like this:

def func := ℕ → ℕ
meta def foo.rec_1 : ℕ → func | n :=
nat.cases_on n
-- 0: move 0
-- 1: nat_cases 4
( -- 2: scnstr #0
  -- 3: ret
  λ _, 0)
(λ m,
  -- 4: push 1
  -- 5: ginvoke foo._rec_1
  let IH := foo.rec_1 m in
  -- 6: move 1
  -- 7: ginvoke bar
  let x := bar n IH in
  -- 8: drop 1
  -- 9: ret
  λ _, x)

Mario Carneiro (Jun 28 2020 at 04:16):

The thing that is weird is that there doesn't seem to be any indication of the second lambda; foo.rec_1 is supposed to be producing a function, not a natural number, and I've fudged it in the gloss by adding λ _, in two places

Mario Carneiro (Jun 28 2020 at 04:18):

an even simpler version:

def bar (n : ℕ) (IH : unit → ℕ) : ℕ := IH () + 1
set_option trace.compiler.optimize_bytecode true
def foo (n) : unit → ℕ := nat.rec_on n (λ _, 0) (λ n IH _, bar n IH)
#eval @foo 3 () -- vm check failed: cidx(closure) == 0 (possibly due to incorrect axioms, or sorry

The +1 only matters, by the way, in order to cause the crash by doing something that requires the result to be a nat

Scott Olson (Jun 28 2020 at 04:20):

It also fails at @foo 2 () now for what it's worth

Mario Carneiro (Jun 28 2020 at 04:27):

def bar (IH : unit → ℕ) : ℕ := IH () + 0
set_option trace.compiler.optimize_bytecode true
def foo (n) : unit → ℕ := nat.rec_on n (λ _, 1) (λ n IH _, bar IH)
meta def foo' : ℕ → unit → ℕ | n _ := nat.cases_on n 1 (λ n, bar (foo' n))
#eval @foo 1 () -- vm check failed: cidx(closure) == 0 (possibly due to incorrect axioms, or sorry
#eval @foo' 1 () -- 1

I think this gets to the heart of it. The meta def is very close to the final compiled form, but you will notice that the recursive function after compilation has 2 arguments instead of 1 and is called using closure instead of ginvoke

Mario Carneiro (Jun 28 2020 at 04:28):

The compiler has some logic in it to eliminate curried functions and remove erased arguments, and I think the bug is there

Scott Olson (Jun 28 2020 at 04:43):

Do you think it comes from this function? https://github.com/leanprover-community/lean/blob/7b2f3fa452e4bdb38c5d96e64d29205c5986a1a8/src/library/compiler/elim_recursors.cpp#L97-L99

Scott Olson (Jun 28 2020 at 04:44):

Line 119 makes it look like this deals with the ginvoke/closure distinction you mentioned

Mario Carneiro (Jun 28 2020 at 05:02):

I think it's happening in an optimization pass. The original expression nat.rec_on n (λ _, 1) (λ n IH _, bar IH) is translated to nat.cases_on n (λ _, 1) (λ n _, bar (self n)) and then it notices that we have a cases followed by a lambda and lifts one over the other to get λ _, nat.cases_on n (1) (λ n, bar (self n)), whereupon we can deduce that the ignored argument can be eliminated. But eliminating this argument changes the type of self n, so bar ends up getting passed a value of type nat instead of unit -> nat

Mario Carneiro (Jun 28 2020 at 05:03):

The reason it works with 0 in place of 1 in the base case is that 0 plays many roles in the VM; it is the neutral value meaning that all proofs become 0, and in particular application 0 x is defined and equals 0

Scott Olson (Jun 28 2020 at 05:03):

def bar (n : ℕ) (IH : unit → ℕ) : ℕ := IH () + 1
set_option trace.compiler.elim_recursors true
set_option trace.compiler.reduce_arity true
def foo (n) : unit → ℕ := nat.rec_on n (λ _, 0) (λ n IH _, bar n IH)

[compiler.elim_recursors]
>> foo._rec_1
λ (n : ℕ) (a : unit), n.cases_on (λ (_x : unit), 0) (λ (n : ℕ) (_x : unit), bar n ([foo._rec_1] n)) a
>> foo
λ (n : ℕ) (a : unit), foo._rec_1 n a
[compiler.reduce_arity]
>> foo._rec_1
λ (n : ℕ), nat.cases_on n 0 (λ (n : ℕ), bar n (foo._rec_1 n))
>> foo
λ (n : ℕ) (a : unit), foo._rec_1 n

Mario Carneiro (Jun 28 2020 at 05:04):

aha, as I suspected

Scott Olson (Jun 28 2020 at 05:04):

Like you said, after the reduce_arity pass it's not type-correct on bar's argument

Scott Olson (Jun 28 2020 at 05:07):

My mistake, it's the erase_irrelevant pass

Scott Olson (Jun 28 2020 at 05:07):

Those traces are actually next to each other so there's no more doubt:

        lean_trace(name({"compiler", "elim_recursors"}), tout() << "\n"; display(procs););
        erase_irrelevant(procs);
        lean_trace(name({"compiler", "erase_irrelevant"}), tout() << "\n"; display(procs););

Mario Carneiro (Jun 28 2020 at 05:10):

The erase_irrelevant pass still looks okay though

Mario Carneiro (Jun 28 2020 at 05:10):

[compiler.erase_irrelevant]
>> foo._rec_1
λ (n : ℕ) (a : unit), nat.cases_on n 0 (λ (n : ℕ), bar (foo._rec_1 n))
>> foo
λ (n : ℕ) (a : unit), foo._rec_1 n a

Mario Carneiro (Jun 28 2020 at 05:11):

It has done the lambda-over-cases transformation, which is not itself harmful

Mario Carneiro (Jun 28 2020 at 05:11):

notice that foo._rec_1 is still accepting two arguments and passes a partially applied version of itself to bar

Scott Olson (Jun 28 2020 at 05:12):

Ah! I didn't read carefully enough

Mario Carneiro (Jun 28 2020 at 05:15):

https://github.com/leanprover-community/lean/blob/7b2f3fa452e4bdb38c5d96e64d29205c5986a1a8/src/library/compiler/reduce_arity.cpp#L48-L53

Mario Carneiro (Jun 28 2020 at 05:15):

I think this is where the action is

Mario Carneiro (Jun 28 2020 at 05:16):

In particular it says "we try to eta expand applications" which is the right idea but isn't being implemented right here?

Last updated: May 02 2025 at 03:31 UTC