Zulip Chat Archive
Stream: lean4
Topic: mathport:synth
Daniel Selsam (Mar 20 2021 at 03:18):
FYI I (temporarily) added support for irreducible, and tried re-synthesizing every class in every term+value auto-ported by mathport
. The results are pretty encouraging:
Total: (499227 / 548358) = 0.91
- success: 499227 (0.91)
- failed: 18509 (0.03)
- timeout: 26098 (0.05)
- instExpected: 4524 (0.01)
Other details:
- 50,000 synthInstance.maxHeartbeat
- skipping any terms that took too long to infer the type of (with 1,000 maxHeartbeat)
instExpected
means a subgoal encountered during search is not known to be a class (since currently lean4 only unfolds reducibles in this context)
I have much more detailed data about this experiment, which will hopefully help us track down the issues efficiently. In general, I think this kind of batch experiment is a pretty good way to proceed. We can do similar for all simp
calls, or for trying to delaborate types and (non-proof) values.
Mario Carneiro (Mar 20 2021 at 03:20):
If the lean 3 term contains something like @foo _ _ (non_canonical_instance) ...
will this count as a failure?
Daniel Selsam (Mar 20 2021 at 03:20):
no, it will succeed with the canonical instance, since the experiment script does not check definitional equality of the instances found
Daniel Selsam (Mar 20 2021 at 03:22):
Was I wrong to assume that this happened rarely enough not to have a big effect on the numbers?
Mario Carneiro (Mar 20 2021 at 03:22):
Well there might not be a canonical instance, for example if a local instance is being used
Mario Carneiro (Mar 20 2021 at 03:23):
or haveI
Mario Carneiro (Mar 20 2021 at 03:23):
I think it happens enough that the 0.03% figure is not unreasonable
Mario Carneiro (Mar 20 2021 at 03:23):
I would start from the assumption that all failures are due to something like this
Daniel Selsam (Mar 20 2021 at 03:23):
3%, not 0.03%
Daniel Selsam (Mar 20 2021 at 03:24):
I am confused though -- as long as the binder type is instImplicit
, lean4 should add it to the local synth context
Mario Carneiro (Mar 20 2021 at 03:24):
haveI
is a tactic for when that's not enough
Daniel Selsam (Mar 20 2021 at 03:25):
Because lean3 froze the instances at the colon, right? Lean4 doesn't do that.
Mario Carneiro (Mar 20 2021 at 03:26):
Actually I'm not sure what binder is used by haveI
. Maybe it doesn't use a binder at all, and directly inserts the instance into the term
Mario Carneiro (Mar 20 2021 at 03:27):
you might use haveI
to introduce an instance coming from a def
, which is not an instance because it makes the general instance search go crazy for one reason or another
Daniel Selsam (Mar 20 2021 at 03:28):
Actually, lean4 doesn't even look at the binder-infos, it just checks if the type is a class: https://github.com/leanprover/lean4/blob/master/src/Lean/Meta/Basic.lean#L838-L841
Mario Carneiro (Mar 20 2021 at 03:28):
lean 3 does the same
Mario Carneiro (Mar 20 2021 at 03:29):
binder info only tells it when to trigger a typeclass search when using the theorem
Daniel Selsam (Mar 20 2021 at 03:29):
hold on, I will read the haveI
documentation
Mario Carneiro (Mar 20 2021 at 03:30):
A very crude search found 5500 uses of @foo
Mario Carneiro (Mar 20 2021 at 03:30):
plus 637 haveI
and 755 letI
Mario Carneiro (Mar 20 2021 at 03:31):
also 253 classical,
Daniel Selsam (Mar 20 2021 at 03:32):
Are you sure haveI
isn't only necessary because the local-instances weren't getting updated whenever you went inside a binder? in Lean4, whenever you descend into a pi/lambda, if the binding-domain is a class, you add to the local-instances
Mario Carneiro (Mar 20 2021 at 03:33):
You would actually use resetI
or exactI
if that's the only reason typeclass search is failing
Mario Carneiro (Mar 20 2021 at 03:33):
haveI
is for adding typeclass instances to the context, it's equivalent to have := not_an_instance, resetI
Mario Carneiro (Mar 20 2021 at 03:35):
but since have
doesn't add a binder it's likely that when replaying terms mathport will only see not_an_instance
being used in places where instance search will either fail or find something else
Daniel Selsam (Mar 20 2021 at 03:36):
So we are on the same page, this works in Lean4:
class Foo (α : Type) : Type := (foo : α)
def foo (α : Type) [Foo α] : α := Foo.foo
theorem noHaveI : Unit :=
have h : Foo Unit := ⟨()⟩
foo Unit
Mario Carneiro (Mar 20 2021 at 03:38):
Right, but I think that
theorem noHaveI : unit :=
begin
haveI h : Foo unit := ⟨()⟩,
exact foo ()
end
will produce the term @foo unit ⟨()⟩
which would count as a failure in mathport because the typeclass search ⟨()⟩ : Foo unit
will fail since there are no Foo instances
Daniel Selsam (Mar 20 2021 at 03:40):
#print noHaveI
-- theorem noHaveI : Unit := (fun (h : Foo Unit) => foo Unit) { foo := Unit.unit }
Mario Carneiro (Mar 20 2021 at 03:40):
If haveI
produced the term (\lam [h : Foo unit], @foo () h) ⟨()⟩
then lean 4 would probably pick it up and it would be counted as a success, but I'm pretty sure that's not how it gets compiled
Daniel Selsam (Mar 20 2021 at 03:41):
I will test this, by creating a simple lean3 file, porting it, and running the experiment on it.
Mario Carneiro (Mar 20 2021 at 03:41):
import tactic
class Foo (α : Type) : Type := (foo : α)
def foo (α : Type) [Foo α] : α := Foo.foo
theorem noHaveI : unit :=
begin
haveI h : Foo unit := ⟨()⟩,
exact foo unit
end
set_option pp.all true
#print noHaveI
-- @foo unit (@Foo.mk unit unit.star)
Daniel Selsam (Mar 20 2021 at 03:42):
wow...ok now I understand your concern
Mario Carneiro (Mar 20 2021 at 03:43):
have
in term mode uses the lam-app trick, but have
in tactic mode just creates a delayed substitution
Mario Carneiro (Mar 20 2021 at 03:44):
It's interesting to ask which strategy is better - if terms are internally deduplicated then the delayed substitution approach is probably better, but it's liable to create very large tree terms so the system better be prepared to handle it
Daniel Selsam (Mar 20 2021 at 03:45):
Incidentally, I would have filtered that one anyway. An experimental detail I hadn't thought to post: since inferType
was taking a crazy-long time on some terms, I was only calling it after checking that the app-function was a global instance. But there could still be a non-instance-instance inside the term that would case a failure.
Daniel Selsam (Mar 20 2021 at 03:47):
Also, this one decl was taking an extremely long time to inferType
on one of its subterms, so I blacklisted it: equiv.equiv_congr_refl_left
. I will investigate later but if there is something funky about this term I should be aware of, please let me know
Mario Carneiro (Mar 20 2021 at 03:48):
As in, you can't handle the proof of this theorem, or you can't handle when this theorem appears in other proofs?
Daniel Selsam (Mar 20 2021 at 03:49):
Mario Carneiro said:
have
in term mode uses the lam-app trick, buthave
in tactic mode just creates a delayed substitution
FYI in Lean4, we get the same term in the noHaveI
example above when using tactic mode:
class Foo (α : Type) : Type := (foo : α)
def foo (α : Type) [Foo α] : α := Foo.foo
theorem noHaveI : Unit := by
have h : Foo Unit := ⟨()⟩
exact foo Unit
#print noHaveI
-- theorem noHaveI : Unit := (fun (h : Foo Unit) => foo Unit) { foo := Unit.unit }
Daniel Selsam (Mar 20 2021 at 03:50):
Mario Carneiro said:
As in, you can't handle the proof of this theorem, or you can't handle when this theorem appears in other proofs?
I can't handle a subterm that appears either in the type or the proof of this theorem. Normally it would just heartbeat-timeout, but it must be triggering some rare, expensive computation that isn't checking heartbeats.
Daniel Selsam (Mar 20 2021 at 03:55):
Another good milestone will be for MetaM
to be able to type-check all of mathlib. It is much slower than the kernel, and much less performance-battle-tested.
Mario Carneiro (Mar 20 2021 at 03:55):
Can you delaborate all of mathlib into lean 4 source files?
Daniel Selsam (Mar 20 2021 at 03:59):
Naive delaboration of proofs is a non-starter. I think an ambitious goal for now would be to delaborate (and re-elaborate) definitions and theorem statements.
Mario Carneiro (Mar 20 2021 at 03:59):
I don't see anything suspicious in equiv.equiv_congr_refl_left
except perhaps
universes u_1 u_2 u_3
variables {α : Sort u_1} {β : Sort u_2} {γ : Sort u_3} (bg : equiv β γ) (e : equiv α β)
#check @coe_fn (equiv (equiv α β) (equiv α γ))
(@equiv.has_coe_to_fun (equiv α β) (equiv α γ))
(@equiv.equiv_congr α α β γ (equiv.refl α) bg)
e
which involves applying coe_fn
, which is not obviously a function, to one too many arguments, so that typechecking it requires unfolding equiv.has_coe_to_fun
to ensure that the first component is a function.
Mario Carneiro (Mar 20 2021 at 04:00):
But this should come up whenever you use coe_fn
at all
Daniel Selsam (Mar 20 2021 at 04:01):
I haven't looked at the delaborator though, and Sebastian has (medium-term) plans to improve it. This is why I was thinking of instead experimenting with a heuristic proof-delaborator (trying to recover lets, rewrites, simps, ring, etc)
Mario Carneiro (Mar 20 2021 at 04:01):
Daniel Selsam said:
Naive delaboration of proofs is a non-starter.
Why is this? If the term is valid according to the lean 4 kernel, then it should be possible to delaborate, otherwise we will have the same issue as incompleteness of pp.all in lean 3
Mario Carneiro (Mar 20 2021 at 04:02):
Heuristic delaboration should be built on a foundation of 100% reliable but ugly delaboration
Daniel Selsam (Mar 20 2021 at 04:03):
Yes, we can do the experiment in-process, we just can't write them to files because the tree sizes will be too big.
Daniel Selsam (Mar 20 2021 at 04:05):
Mario Carneiro said:
Heuristic delaboration should be built on a foundation of 100% reliable but ugly delaboration
There are a few modular sub-problems though -- if lean3 annotates with e.g. idSimp
and idRing
, then we can make sure we can delaborate these, ideally into a tiny bit of syntax that the eventually-reliable delaborator can make use of
Daniel Selsam (Mar 20 2021 at 04:07):
Thanks for your help tonight. I am going to sleep now.
Mario Carneiro (Mar 20 2021 at 04:11):
Maybe a good next step would be to backport Expr.mdata
to lean 3, so that we can have general annotations on things without affecting tactics (for the most part). Right now all the annotation support is wrapped up in MacroDef
, which is opaque to lean
Mario Carneiro (Mar 20 2021 at 04:14):
What are you doing with macros anyway? If you are unfolding them then perhaps we can do better by aligning the macros and not unfolding unless necessary.
Daniel Selsam (Mar 20 2021 at 15:10):
I found the source of the infinite loop in equiv.equiv_congr_refl_left
-- the term has wild levels which exposed the following typo/bug: https://github.com/leanprover/lean4/pull/359
Daniel Selsam (Mar 20 2021 at 15:14):
I am unfolding all macros during the tlean
export. I would prefer adding custom id<x>
annotations, that may if necessary include additional arguments with special meta-data that I decode during mathport. Note: I already decode lean3 names during mathport for the autoParam
translation.
Daniel Selsam (Mar 20 2021 at 15:17):
I think that before I even analyze the typeclass experiment, I will try to typecheck all of mathlib in MetaM, because any issues there could manifest in the synth experiment in more subtle ways.
Mario Carneiro (Mar 20 2021 at 15:42):
Interesting. Is that a soundness bug?
Daniel Selsam (Mar 20 2021 at 15:44):
No this is only in MetaM, nothing to do with the kernel
Mario Carneiro (Mar 20 2021 at 15:45):
Regarding mdata
: using id
only works if your metadata is an expr
. Lean 4's mdata
lets you attach strings and numbers and things like that as well
Daniel Selsam (Mar 20 2021 at 15:46):
Yes but we can decode structured data from expressions when we know the type, as I do to decode names for auto-param: https://github.com/dselsam/mathport/blob/master/MathPort/Util.lean#L165-L173
Mario Carneiro (Mar 20 2021 at 15:47):
I mean you can use evalExpr
but that seems inefficient. I know typed user attributes have had issues along these lines
Last updated: Dec 20 2023 at 11:08 UTC