Zulip Chat Archive
Stream: general
Topic: Normalization fails in lean
Mario Carneiro (Nov 18 2019 at 21:01):
Andreas Abel and Thierry Coquand have brought my attention to the following counterexample to normalization in impredicative type theories with proof irrelevance, such as Lean.
def false' := ∀ p : Prop, p
def true' := false' → false'
def omega : (∀ (A B : Prop), A = B) → false' :=
λ h A, @cast true' _ (h true' A) $ λ z: false', z true' z
def Omega : (∀ (A B : Prop), A = B) → false' :=
λ h, omega h true' (omega h)
#reduce Omega -- timeout
Patrick Massot (Nov 18 2019 at 21:02):
Should we care?
Mario Carneiro (Nov 18 2019 at 21:05):
It is news; this was the last remaining unresolved question in my MS thesis, and I had expected it to be true, so this counterexample is a surprise. Normalization is sometimes used for proofs of consistency, but in lean's case we have a model construction that achieves this goal by alternate means, so that at least is not at risk
Mario Carneiro (Nov 18 2019 at 21:09):
It does, however, put us more firmly in the "classical" camp, because our type theory can't be used to "just compute" anything that doesn't use axioms
Mario Carneiro (Nov 18 2019 at 21:10):
at this point I would prefer to just have a reflection axiom to deduce A is defeq to B from A = B and finish the job
Mario Carneiro (Nov 18 2019 at 21:14):
It does possibly open the door to performing arbitrary TM computations in the lean kernel. We already knew that this was possible in theory, but lean would not actually perform the required sequence of reductions, it would get stuck early on. The difference with this example is that lean actually goes ahead and does the infinite reduction.
Mario Carneiro (Nov 18 2019 at 21:18):
(It's not clear if this can be used to do useful computation, though, as, like the example in my paper, it requires an inconsistent context, so it's tricky to get the computation out into an inhabited context.)
Simon Hudon (Nov 18 2019 at 21:42):
In #reduce Omega -- timeout, Omega should be a lambda abstraction. Why does the reduction not stop there?
Bryan Gin-ge Chen (Nov 18 2019 at 21:43):
Are there options to control the number of steps #reduce takes? Is there any option to trace the reduction steps?
Mario Carneiro (Nov 18 2019 at 21:46):
#reduce will reduce the whole term to a normal form, it doesn't stop at whnf
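A minimal sketch of the difference (not part of the original exchange): whnf would stop at the outer λ below, while #reduce also normalizes the redex in the body, under the binder.

#reduce (λ x : ℕ, (λ y : ℕ, y) x) -- λ (x : ℕ), x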
Mario Carneiro (Nov 18 2019 at 21:47):
sadly there is no way to trace the reduction
Bryan Gin-ge Chen (Nov 18 2019 at 21:48):
How hard would it be to imitate the reduction in meta code?
Mario Carneiro (Nov 18 2019 at 21:50):
example (h) : Omega h = Omega h := calc
  Omega h = omega h true' (omega h) : rfl
  ... = @cast true' true' (h true' true') (λ z: false', z true' z) (omega h) : rfl
  ... = (λ z: false', z true' z) (omega h) : rfl
  ... = (omega h) true' (omega h) : rfl
  ... = Omega h : rfl
Jeremy Avigad (Nov 18 2019 at 21:54):
Their result is very pretty, even though it is unfortunate for Lean. But it is not clear whether this result has any bearing on data, i.e. whether we can get a term of type nat that doesn't normalize. And even if we can, there is still hope for eval, which will erase nonsense like Omega.
I don't know what to make of it. Definitional proof irrelevance is nice in practice, and it is hard to imagine someone formalizing mathematics stumbling across a non-normalizing term. I'd rather have a type theory that is theoretically bad but nice in practice than a type theory that is theoretically nice but bad in practice, but of course it would be better to have a theory with both the theoretical and the practical virtues.
Mario Carneiro (Nov 18 2019 at 21:55):
So the question of whether VM evaluation on a computable well typed term can run forever is still open
Mario Carneiro (Nov 19 2019 at 09:46):
Here's a modification of the example to act like a Y combinator:
def false' := ∀ p : Prop, p
def true' := false' → false'
def y (f : false' → false') (h : ∀ (A B : Prop), A = B) : false' :=
λ A, @cast true' _ (h true' A) $ λ z: false', f (z true' z)
def Y (f : false' → false') (h : ∀ (A B : Prop), A = B) : false' :=
y f h true' (y f h)
example (f h) : Y f h = f (f (f (f (f (f (Y f h)))))) := rfl
Now we can compute with arbitrary functions on false' :P
Kevin Buzzard (Nov 19 2019 at 09:48):
It does, however, put us more firmly in the "classical" camp, because our type theory can't be used to "just compute" anything that doesn't use axioms
So in fact this is something to celebrate, as far as mathematicians are concerned!
Mario Carneiro (Nov 19 2019 at 10:08):
Embedded lambda calculus:
section
parameter (H : ∀ (A B : Prop), A = B)

def D := ∀ p : Prop, p
def F := D → D

def app (a b : D) : D := a F b
local infix ` $$ `:400 := app

def lam (f : D → D) : D := λ A, @cast F _ (H F A) f
local notation `Λ` binders `, ` r:(scoped P, lam P) := r

example (f x) : lam f $$ x = f x := rfl

def y (f : F) : D := Λ z, f (z $$ z)
def Y (f : F) : D := y f $$ y f

#reduce lam id

def zero : D := Λ f x, x
def succ (n : D) : D := Λ f x, f $$ (n $$ f $$ x)

def of_nat : ℕ → D
| 0 := zero
| (n+1) := succ (of_nat n)

def plus (m n : D) : D := Λ f x, m $$ f $$ (n $$ f $$ x)

example : plus (of_nat 2) (of_nat 2) = of_nat 4 := rfl

end
Gaëtan Gilbert (Nov 20 2019 at 10:40):
(It's not clear if this can be used to do useful computation, though, as, like the example in my paper, it requires an inconsistent context, so it's tricky to get the computation out into an inhabited context.)
It doesn't require an inconsistent context (but does require an axiom AFAICT); you can do the same manipulation using a hypothesis ∀ (A B : Prop), A → B → A = B (propext for inhabited propositions), and instead of inhabiting ∀ p : Prop, p, inhabit ∀ p : Prop, p → p, i.e.

def top := ∀ p : Prop, p → p
def ext := ∀ (A B : Prop), A → B → A = B
def supercast (h : ext) (A B : Prop) (a : A) (b : B) : B := @cast A B (h A B a b) a
def omega : ext → top :=
λ h A a, supercast h (top → top) A (λ z: top, z (top → top) (λ x, x) z) a
def Omega : ext → top :=
λ h, omega h (top → top) (λ x, x) (omega h)
#reduce Omega -- timeout
See also discussion at https://github.com/coq/coq/pull/10390
Mario Carneiro (Nov 20 2019 at 11:56):
Right, here's another example shared by Andreas Abel:
-- axiom propext {a b : Prop} : (a ↔ b) → a = b
def True := ∀ p : Prop, p → p
def om : True :=
λ A a, @cast (True → True) A (propext ⟨λ _, a, λ _, id⟩) $ λ z: True, z (True → True) id z
def Om : True := om (True → True) id om
-- #reduce Om -- timeout

example : Om = Om := calc
  Om = om (True → True) id om : rfl
  ... = @cast (True → True) (True → True) (propext ⟨λ _, id, λ _, id⟩) (λ z: True, z (True → True) id z) om : rfl
  ... = (λ z: True, z (True → True) id z) om : rfl
  ... = om (True → True) id om : rfl
  ... = Om : rfl
Mario Carneiro (Nov 20 2019 at 11:59):
More lambda calculus:
def D := ∀ p : Prop, p → p
def F := D → D
def app (a b : D) : D := a F id b
local infix ` $$ `:400 := app
def lam (f : D → D) : D := λ A a, @cast F _ (propext ⟨λ _, a, λ _, id⟩) f
local notation `Λ` binders `, ` r:(scoped P, lam P) := r
example (f x) : lam f $$ x = f x := rfl
Mario Carneiro (Nov 20 2019 at 12:05):
However, it is still difficult to use this for actual computations, because they take place inside proof reduction, which is not something that lean will normally even do anyway. For example, the final example lam f $$ x = f x is a bit misleading, because even if I wrote lam f $$ x = f (f x) it would still be proved by rfl, because both sides are proofs. The real demonstration that this encoding works is the observation that the LHS #reduces to the RHS:

variables (f : F) (x : D)
#reduce lam f $$ x -- f x
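Spelling out the claim above as a sketch (reusing D, F, lam and $$ from the previous snippet): both sides of the "wrong" equation are proofs of D, which is a Prop, so it is accepted too.

example (f : F) (x : D) : lam f $$ x = f (f x) := rfl -- accepted by proof irrelevance, not by an actual reduction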
Wojciech Nawrocki (Nov 20 2019 at 18:34):
Am I misunderstanding something, or does this mean that Lean's algorithmic equivalence is also undecidable?
Wojciech Nawrocki (Nov 21 2019 at 00:23):
Oh well, it shouldn't be, because these are Props, right? But then

def Omega' : pext → top := λ h, (λ p x, x)
theorem blah : Omega = Omega' := rfl

goes into a loop in the kernel (deep recursion was detected at 'replace' (potential solution: increase stack space in your system)). What's going on here - why does Lean seemingly reduce them instead of immediately coming back and saying "these are two proofs of pext → top, hence the same"?
Mario Carneiro (Nov 21 2019 at 01:35):
Am I misunderstanding something, or does this mean that Lean's algorithmic equivalence is also undecidable?
I think it is decidable, but lean's algorithm doesn't decide it. If you never reduce proofs then I think you might be able to dodge this bullet
Mario Carneiro (Nov 21 2019 at 01:38):
I'm not sure why your example loops. Perhaps lean doesn't notice that they are props? That seems unlikely...
Mario Carneiro (Nov 21 2019 at 14:23):
Abel and Coquand's paper with the original example is on arXiv: https://arxiv.org/abs/1911.08174
Wojciech Nawrocki (Dec 03 2019 at 21:35):
Just came across this: Cedille Cast #9: Impredicativity, proof-irrelevance, and normalization
Chris Hughes (Jan 28 2020 at 23:21):
No idea if this is related or interesting, but I stumbled upon this failing to reduce as well. It reduces if I replace false with empty.

def not' (α : Sort*) := α → false
prefix ` ¬' `:40 := not'
def X {p : Type} : ¬' ¬' (¬' ¬' p → p) :=
assume h : ¬' (¬' ¬' p → p),
h (λ hnnp, (hnnp (λ hp : p, h (λ hnnp, hp))).elim)
#reduce X
ice1000 (Jun 29 2021 at 20:50):
What's the Lean developers' attitude towards this counterexample to normalization?
Mario Carneiro (Jun 29 2021 at 21:01):
This issue occurs in proof normalization, which is not something that matters in practice (we have made efforts to limit/deprecate things requiring proof normalization over time). It is still an open question whether well typed lean terms can cause nonterminating VM computations, which is what we really care about. In any case, it doesn't impact soundness which is what use in mathematics requires of the system.
Mario Carneiro (Jun 29 2021 at 21:06):
Even if VM computations could be made to loop, this is not a significant issue since you can already just turn off the termination checker and write meta code, and in practice most lean code written for execution is meta because the tactic monad is meta
Mario Carneiro (Jun 29 2021 at 21:11):
Also, this example is specifically invoking the normalizer (aka #reduce). It is not an issue in definitional equality checking, which is what is needed by the kernel and implemented by external checkers. I don't know how to turn this example into a nonterminating defeq check, because defeq of propositions is trivial. Combined with the fact that "ideal" defeq in lean is already known to be undecidable for other reasons, it does not represent a significant change of the status quo.
Sebastian Ullrich (Jun 29 2021 at 21:12):
Mario Carneiro said:
Even if VM computations could be made to loop, this is not a significant issue since you can already just turn off the termination checker and write meta code, and in practice most lean code written for execution is meta because the tactic monad is meta
And also in practice there is not much difference between non-termination and very slow termination. Totality really is mostly interesting for soundness of reduction.
Mario Carneiro (Jun 29 2021 at 21:13):
The conventional wisdom on #reduce is already "it's stupidly slow and times out on all non-toy examples". A fun example is #reduce 'a'
Mario Carneiro (Dec 20 2023 at 07:23):
Revisiting this old topic to add an important addendum: Above I claimed that this issue is limited to normalization of proofs, which only occurs in #reduce, which is not used in practice. However, because subsingleton eliminators reduce proofs, we can actually use it to lift the counterexample to failure of normalization outside of proofs as well:
/-! Andreas-Abel construction of nontermination in proofs -/
def True' := ∀ p : Prop, p → p
def om : True' := fun A a =>
@cast (True' → True') A (propext ⟨fun _ => a, fun _ => id⟩) <|
fun z => z (True' → True') id z
def Om : True' := om (True' → True') id om
-- #reduce Om -- loops
/-! nontermination outside proofs: -/
inductive Foo : Prop where
| mk : True' → Foo
def foo : Foo := Om _ (Foo.mk fun _ => id)
-- example : foo.recOn (fun _ => 1) = 1 := by rfl -- loops
So this is definitively a counterexample to strong normalization of lean, even with the restricted "algorithmic equality" judgment decided by the lean kernel.
Mario Carneiro (Dec 20 2023 at 07:26):
We can use any subsingleton eliminator in place of Foo; not just Acc but even And is affected:
def foo : And True True := Om _ ⟨trivial, trivial⟩
example : foo.recOn (fun _ _ => 1) = 1 := by rfl -- loops
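For contrast, a minimal sketch (with a hypothetical bar in place of foo): when the major premise visibly reduces to a constructor, the same elimination goes through immediately.

def bar : And True True := ⟨trivial, trivial⟩
example : And.recOn (motive := fun _ => Nat) bar (fun _ _ => 1) = 1 := by rfl -- succeeds: bar whnf-reduces to ⟨trivial, trivial⟩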
Mario Carneiro (Dec 20 2023 at 07:27):
(although one has to be careful with these examples because some of them also loop in the elaborator, which is not supposed to be under test here)
Last updated: Dec 20 2023 at 11:08 UTC