Zulip Chat Archive

Stream: general

Topic: Coq as a lean type checker


view this post on Zulip Bas Spitters (Jan 28 2020 at 17:08):

Gaetan Gilbert made a Coq type checker for lean in order the help understand the differences. I think it's very useful in that way.
https://coq.discourse.group/t/alpha-announcement-coq-is-a-lean-typechecker/581

view this post on Zulip Johan Commelin (Jan 28 2020 at 17:24):

That seems like a really thorough comparison. Chapeau!

view this post on Zulip Rob Lewis (Jan 28 2020 at 17:48):

This is cool! I'm not really clear what's meant by this line:

The whole stdlib cannot be checked as some conversion problems are
pathological. two_pos seems a typical example (0 < 2 in an ordered
field).

I'm probably missing something, does anyone know what the issue is with 0 < 2?

view this post on Zulip Johan Commelin (Jan 28 2020 at 17:50):

I just saw that @Gaëtan Gilbert also has an account here. So we can ask him directly :wink:

view this post on Zulip Floris van Doorn (Jan 28 2020 at 17:54):

Yes, this looks really cool indeed. Also interesting to see some small kernel differences between Coq and Lean I have never seen before.

view this post on Zulip Gaëtan Gilbert (Jan 28 2020 at 17:54):

It means that some conversion problem involved in checked the proof of 0<2 is difficult for Coq. I haven't investigated yet, I guess it starts unfolding the wrong thing then gets lost.

view this post on Zulip Floris van Doorn (Jan 28 2020 at 17:56):

I also don't know what the problem is with two_pos.
For the lazy:

lemma two_pos : (2:α) > 0 :=
begin unfold bit0, exact add_pos zero_lt_one zero_lt_one end

and #print two_pos gives

theorem two_pos :  {α : Type u} [_inst_1 : linear_ordered_field α], 2 > 0 :=
λ {α : Type u} [_inst_1 : linear_ordered_field α],
  eq.mpr
    (id
       ((λ [_inst_1 : has_lt α] (a a_1 : α) (e_2 : a = a_1) (b b_1 : α) (e_3 : b = b_1),
           congr (congr_arg gt e_2) e_3)
          2
          (1 + 1)
          (bit0.equations._eqn_1 1)
          0
          0
          (eq.refl 0)))
    (add_pos zero_lt_one zero_lt_one)

Maybe it's the weird identity function generated by unfold that causes problems.

view this post on Zulip Rob Lewis (Jan 28 2020 at 18:00):

Hmm, maybe. I would have guessed that the proof of two_pos doesn't have that much that's even possible to unfold. But Lean isn't happy with #reduce two_pos either.

view this post on Zulip Rob Lewis (Jan 28 2020 at 18:03):

@Kevin Buzzard remember when you used to complain about proving things about numerals? Turns out the proofs are actually hard :p

view this post on Zulip Floris van Doorn (Jan 28 2020 at 18:04):

For me, #reduce @two_pos takes a couple of seconds. There is still a bunch to unfold, with all the substitutions involved in the following definitions (used in two_pos):

#print add_pos
#print add_lt_add_right

Maybe Coq is busy unfolding all these, which doesn't seem to be necessary.

view this post on Zulip Kevin Buzzard (Jan 28 2020 at 18:04):

yeah my ideas about what maths is "hard" have changed rather radically in the last 2.5 years...

view this post on Zulip Gaëtan Gilbert (Jan 28 2020 at 18:14):

OK I figured out why two_pos is slow
In Coq we try reducing terms before testing if they're proof irrelevant
If I change that order two_pos checks basically instantly

The reason we picked that order is basically that all our benches use only relevant terms (since sprop is still new), and for those it's faster to reduce first

view this post on Zulip Kevin Buzzard (Jan 28 2020 at 18:14):

What is Coq's sprop?

view this post on Zulip Floris van Doorn (Jan 28 2020 at 18:15):

It's Lean's Prop: a universe that is definitionally proof irrelevant.

view this post on Zulip Floris van Doorn (Jan 28 2020 at 18:15):

Coq's Prop is not definitionally proof irrelevant, but you can add proof irrelevance as an axiom.

view this post on Zulip Kevin Buzzard (Jan 28 2020 at 18:15):

I see. So the issue is rfl working or not?

view this post on Zulip Kevin Buzzard (Jan 28 2020 at 18:16):

Hey this is a really cool way to check sprop :-)

view this post on Zulip Kevin Buzzard (Jan 28 2020 at 18:19):

There will be issues with quotients as well, presumably? I never quite understood the Lean reference manual here: it seems to give a definition of rfl which doesn't seem (to me) to imply that quot.lift f h (quot.mk r a) = f a is defeq.

view this post on Zulip Floris van Doorn (Jan 28 2020 at 18:21):

Yes, quotients are also an issue, but there is a trick in Coq using so-called Private Inductive types. This has also been used to implement higher inductive types in the Coq-HoTT library (a quotient is an example of a higher inductive type).

view this post on Zulip Floris van Doorn (Jan 28 2020 at 18:25):

It's quite a hack, the idea is the following. You privately define the quotient A / ~ to be just A (or a inductive type that is isomorphic to A): then it's very easy to define functions quot.lift and quot.mk such that the rule quot.lift f h (quot.mk r a) = f a holds definitionally. Of course, you don't get the rule that if x ~ x' then [x] = [x'], but you can add that as as an (inconsistent) axiom. Then you declare everything that you don't want to use as "private" (for example the fact that A / ~ was implemented as A), so that you cannot use these things anymore.
The final result is that from outside the private module, everything you can see is exactly the behavior of a quotient.

view this post on Zulip Floris van Doorn (Jan 28 2020 at 18:28):

Here is another explanation: https://dl.acm.org/doi/pdf/10.1145/3018610.3018615?download=true
(Section "Higher Inductive Types")

view this post on Zulip Kevin Buzzard (Jan 28 2020 at 18:30):

You CS guys are so imaginative ;-)

view this post on Zulip Kevin Buzzard (Jan 28 2020 at 18:30):

I'll try explaining quotients using this technique to my undergraduates when I teach them next year.

view this post on Zulip Kevin Buzzard (Jan 28 2020 at 18:31):

if it's good enough for the computer, it's good enough for them.

view this post on Zulip Mario Carneiro (Jan 28 2020 at 20:40):

So basically it's controlled inconsistency? :/

view this post on Zulip Mario Carneiro (Jan 28 2020 at 20:42):

@Gaëtan Gilbert Have you looked at all at the reverse process, Coq -> Lean? I've been bugging all coq people I know to tell me how the kernel typechecking algorithm works, because it's not completely documented anywhere outside the ML

view this post on Zulip Mario Carneiro (Jan 28 2020 at 20:43):

Is there an analogue of the Lean output format for Coq?

view this post on Zulip Gaëtan Gilbert (Jan 28 2020 at 21:33):

Coq -> Lean is way harder due to universe cumulativity (I wouldn't be surprised if it's impossible)
Other universe issues:

  • non recursively uniform parameters (AFAICT translating them to regular indices only changes the universe of the inductive)
  • template universe polymorphism (a predecessor of universe polymorphism, kinda underspecified, it has caused repeated inconsistencies but hasn't been removed because it's still in use everywhere through sigma types and list and so on)

If you get past universe issues you also have

  • your pattern matching to eliminator elaboration must be as strong as the termination condition implemented in Coq (not sure how hard this is, I'm not an expert on the termination checker)
  • mutual and nested inductives (I think Lean already has some system to deal with those?)
  • modules. You could just reduce functor applications and treat the resulting flat modules as namespaces (AFAIK the checker does something like that). If module aliases or Include are used you will not have a fun time.

Less used:

  • records with definitional eta (not used in Coq's stdlib)
  • coinductives (not used much afaik)
  • primitive integers and floats (also not used much afaik)

No output format, parsing vo files while ignoring non-kernel stuff may be feasible if you tie yourself to a specific Coq version.

view this post on Zulip Gaëtan Gilbert (Jan 28 2020 at 21:35):

PS maybe using serapi or a custom plugin one could get a reasonable output format

view this post on Zulip Kevin Buzzard (Jan 28 2020 at 22:04):

it has caused repeated inconsistencies but hasn't been removed because it's still in use everywhere through sigma types and list and so on

So to implement this part you have to find an inconsistency in Lean?

view this post on Zulip Gaëtan Gilbert (Jan 28 2020 at 22:11):

There are no known inconsistencies left in it, it's just a historically buggy part of Coq

view this post on Zulip Kevin Buzzard (Jan 28 2020 at 22:57):

On the other hand I have seen people getting into universe messes here -- they are often sorted out in the end but I have often wondered whether cumulative universes would make life easier. I'm a believer in ZFC so I would be happy if everything sensible just lived in Type (other than the things which are "obviously not sets")

view this post on Zulip Mario Carneiro (Jan 29 2020 at 01:16):

I just got to the part of Coq Coq Correct! that deals with the most subtle part of the theory, the guard condition on recursion inside a fixpoint, but they axiomatized it, so there is no definition :(

view this post on Zulip Johan Commelin (Jan 29 2020 at 07:17):

This hit HN front page: https://news.ycombinator.com/item?id=22171305

view this post on Zulip Johan Commelin (Jan 29 2020 at 07:29):

@Kevin Buzzard You're getting some indirect credit. See the first comment.

view this post on Zulip Gaëtan Gilbert (Jan 29 2020 at 13:05):

Updated stdlib stats after fixing the issue examplified by two_pos (we need to check for proof irrelevance before comparing structurally):

Lean Import with 10s timeout, set upfront instances and set conv check:
320s, 1GB RAM
vo size 111MB
11 skips (3 timeout, 8 not instantiated)

The timeouts are bla.equations._eqn_1, where bla in

  • tactic.delta_config.max_steps._default
  • char.of_nat
  • simp.default_max_steps
    These all involve moderately sized integers (of_nat through char.is_valid)

I actually already did a hack for std.prec.max.equations._eqn_1, setting prec.max to the Expand reduction strategy (when comparing 2 applied constants we need to decide which one we try to reduce first, Expand is the most eager strategy). That may be correct for the max_steps but doesn't seem right for of_nat.
What does Lean do here?

view this post on Zulip Mario Carneiro (Jan 29 2020 at 13:12):

Lean unfolds the definition with the largest definitional height, which I believe is the max of the definitional heights of all referenced definitions, plus one

view this post on Zulip Mario Carneiro (Jan 29 2020 at 13:14):

Looking at char.of_nat, I see no expensive unfolding at all

view this post on Zulip Gaëtan Gilbert (Jan 29 2020 at 14:17):

Thanks, that looks like it works. The Lean stdlib now fully passes:

  • 284s runtime, 1.3GB RAM usage
  • resulting vo takes 116MB
  • coqchk takes 540s and 1.1GB RAM
    If we skip structure validation coqchk takes 51s and 720MB RAM
    If we use OCaml's Marshal instead of our custom Analyze coqchk takes 45s and 380MB RAM

view this post on Zulip Bas Spitters (Jan 30 2020 at 14:20):

So basically it's controlled inconsistency? :/

The technique was explained here in agda:
https://homotopytypetheory.org/2011/04/23/running-circles-around-in-your-proof-assistant/
It's a hack to add HITs to a type theory that does not have them. Really, we'd like to have a cubical type theory in the future.

view this post on Zulip Tim Carstens (Jan 31 2020 at 02:43):

:heart:️ to that!


Last updated: May 08 2021 at 03:17 UTC