Zulip Chat Archive

Stream: Program verification

Topic: Need a way to unfold a definition


Daniel Fabian (Jul 04 2020 at 15:23):

Sorry for bringing up the stupid example again. I'm just annoyingly stuck and it seems to be really lean-specific. The moral equivalent goes through in Coq no problem because we can unfold a fixpoint definition. Here goes:

import tactic

variables α : Type*

inductive color
| red
| black

inductive node (α)
| leaf {} : node
| tree {} (color : color) (left : node) (val : α) (right : node) : node

def balance {α} : color  node α  α  node α  node α
| color.black (node.tree color.red (node.tree color.red a x b) y c) z d :=
    node.tree color.red (node.tree color.black a x b) y (node.tree color.black c z d)
| color.black (node.tree color.red a x (node.tree color.red b y c)) z d :=
    node.tree color.red (node.tree color.black a x b) y (node.tree color.black c z d)
| color.black a x (node.tree color.red (node.tree color.red b y c) z d) :=
    node.tree color.red (node.tree color.black a x b) y (node.tree color.black c z d)
| color.black a x (node.tree color.red b y (node.tree color.red c z d)) :=
     node.tree color.red (node.tree color.black a x b) y (node.tree color.black c z d)
| color a x b := node.tree color a x b

#print prefix balance.equations

example :  a (x : α) b y c z d, balance color.black (node.tree color.red (node.tree color.red a x b) y c) z d =
    node.tree color.red (node.tree color.black a x b) y (node.tree color.black c z d) :=
begin
    -- intros, refl, -- should work
    rintros a x b y (c|⟨cc|cc, cl, cv, cr) z (d|⟨dc|dc, dl|⟨dlc|dlc, dll, dlv, dlr, dv, dr|⟨drc|drc, drl, drv, drr⟩⟩),
    all_goals {refl}
end

I've found a very concise proof, but it's deeply unsatisfying. First notice, that the statement of the lemma is literally the pattern match, so it really should just go through with refl. In fact nothing in the context or anywhere else gives us a hint at all what to do next.

I haven't found any tactic to make any progress, unfold doesn't say anything. And the fact, that I have to destruct dl and dr is because of lines 3 and 4 of the pattern match. If we erase them, we don't have to do that much distinction by cases. In total the equation compiler creates 122 equations.

The root cause seems to be that the equation compiler is introducing cases which are meaningless in the context.

What it conceptually should do, is go over the first case in the pattern match, see, that the first pattern is color.black, so we do have to destruct the first argument. Keep that fact around, go to the next argument, see that it needs a pattern on node because we encountered node.tree. Next we see that color needs to be split again, etc.

If the compiler did a backtracking search during the generation of cases, we'd end up with quite a bit fewer cases and the lemmas would be more useful, because they'd quantify over all c and d in this particular case, fixing refl.

Now if the compiler can't do that / be told to do that / can't be fixed, the next best thing would be a tactic, that lets us destruct based on the concrete lambda.

If unfold worked in this case, we should get something like [a lot of the body is gone, because we're in a deeply nested case...] (fun x, x.cases_on ...). At that point, we could make a tactic (I'm going to call it unfold_cases), that takes the type of x, and destructs it. As a consequence we'd get a few new goals.

With this tactic, we could write the proof as this, I think: intros, repeat { refl <|> unfold_cases }.

If we want to avoid exploding the whole body, we could always do the distinction by cases and then elide the body.

I'd be happy to have a go at writing such a tactic, but I don't even know how to unfold the definition. I can't find what to pass as the config to unfold; delta, dsimp and simp all do nothing.

Any help would be very, very welcome. I've been struggling with this example for a year now on and off and I'm determined to make it work is some nice way.

And I'm pretty sure this is not a fluke, but rather something that's quite deeply linked to the distinction between Coq and lean so it'll come in handy in pretty much all cases of non-trivial pattern match expressions. It may not come up in maths that much but comes up tons in reasoning about programs.

Simon Hudon (Jul 04 2020 at 15:39):

Yeah, I see it's pretty hairy. Writing a good equation compiler is quite a difficult task. I think a tactic like you describe that can use details of the specification of the function would be very useful indeed

Simon Hudon (Jul 04 2020 at 15:41):

It's now possible to roll our own syntax for functions and call the built-in equation compiler. I could imagine a way of defining equations so that, we can at least prove the branches used in the specification of the function as lemmas / additional equations, even if it doesn't match the definition exactly

Daniel Fabian (Jul 04 2020 at 15:43):

ya, adding the lemmas for the cases would be fine. They should be easy to construct proofs for too, as they are really just distinction by cases, refl (by definition).

Daniel Fabian (Jul 04 2020 at 15:46):

is there any kind of docu / examples where I could learn something about the API? I struggle to find docu.

Simon Hudon (Jul 04 2020 at 16:04):

Here is a good tutorial on the writing of automation https://github.com/leanprover-community/mathlib/blob/master/docs/extras/tactic_writing.md. Looking at the code of tactic.core in mathlib should provide you with a lot of good examples

Simon Hudon (Jul 04 2020 at 16:09):

Oh and here is a guide for the available tactics https://leanprover-community.github.io/mathlib_docs/tactics.html

Daniel Fabian (Jul 04 2020 at 16:13):

Ya, I've studied both before writing here. At this point, I think my next step would be getting the AST of the goal that I need. So I could inspect it and figure out what to cases on.

Simon Hudon (Jul 04 2020 at 16:17):

That sounds like a good exercise. Please keep me posted, maybe I can give you a couple of nudges

Daniel Fabian (Jul 04 2020 at 16:18):

k will do

Daniel Fabian (Jul 04 2020 at 18:30):

alright, this is not very useful. When I unfold the application a few times, I get to a "constant" named "balance".

Daniel Fabian (Jul 04 2020 at 18:30):

I don't get its body.

Mario Carneiro (Jul 04 2020 at 19:12):

Don't try to do too much in one definition. For one thing, it's easier to have separate balance_l and balance_r functions rather than doing both at once. Plus, it's impossible for all of these case splits to be true by refl, even if the equation compiler didn't do some superfluous case splits, because they overlap

Mario Carneiro (Jul 04 2020 at 19:13):

Additionally, there will be far fewer cases if you consider only well formed red black trees

Daniel Fabian (Jul 04 2020 at 19:15):

With valid RB trees, you mean having a type family that does things like enforcing the balance or red-black invariant by construction, right? Then it's becomes a GADT and we can prove much more about it during construction.

Daniel Fabian (Jul 04 2020 at 19:16):

doing that ends up with a more painful insertion algorithm, though. In particular it becomes a mutually recursive insertion.

Mario Carneiro (Jul 04 2020 at 19:18):

I wrote this once a long long time ago. Here's how I did it:

inductive rb_node (D : Type) : bool  nat  Type
| nil : rb_node ff 0
| red : D  Π{n} (l r : rb_node ff n), rb_node tt n
| black : D  Π{n a}, rb_node a n  rb_node ff n  rb_node ff (nat.succ n)

namespace rb_node
  def blackR {D} (d : D) {n a} (l : rb_node D ff n) (r : rb_node D a n) : rb_node D ff (n+1) :=
  match n, a, l, r with
  | _, ff, l, r := black d l r
  | _, tt, x, red d' y z := black d' (red d x y) z
  end

  section
  -- An rb_node whose invariant has been broken. The bool,nat type parameters refer to the
  -- original type of the node before being broken.
  inductive ins_broken (D : Type) (n : nat) : bool  Type
  | fixed : Π{b}, rb_node D b n  ins_broken b
  | redden : rb_node D tt n  ins_broken ff
  | redL : D  Πb, rb_node D tt n  rb_node D b n  ins_broken tt
  | redR : D  Πb, rb_node D b n  rb_node D tt n  ins_broken tt

  open ins_broken
  def insert2 {D} [has_lt D] [decidable_rel ((<) : D  D  Prop)] (dat : D) (replace : bool) :
    Π {a n} (t : rb_node D a n), ins_broken D n a
  | ff 0 (nil ._) := redden (red dat (nil D) (nil D))
  | tt n (red d l r) := match cmp dat d with
    | ordering.lt := match insert2 l with
      | fixed c := fixed (red d c r)
      | redden c := redL d _ c r
      end
    | ordering.eq := fixed (red (cond replace dat d) l r)
    | ordering.gt := match insert2 r with
      | fixed c := fixed (red d l c)
      | redden c := redR d _ l c
      end
    end
  | ff (n+1) (@black ._ d ._ a l r) := match cmp dat d with
    | ordering.lt := match a, n, insert2 l, r with
      | a, _, fixed l, r := fixed (black d l r)
      | ff, _, redden l, r := fixed (black d l r)
      | tt, _, redL d₂ b (red d₁ x y) z, w := redden (red d₂ (black d₁ x y) (black d z w))
      | tt, _, redR d₁ b x (red d₂ y z), w := redden (red d₂ (black d₁ x y) (black d z w))
      end
    | ordering.eq := fixed (black (cond replace dat d) l r)
    | ordering.gt := match a, n, l, insert2 r with
      | _, _, l, fixed r := fixed (black d l r)
      | ff, _, l, redden r := fixed (blackR d l r)
      | tt, _, red d₁ x y, redden (red d₂ z w) := redden (red d (black d₁ x y) (black d₂ z w))
      end
    end
  end

end rb_node

Daniel Fabian (Jul 04 2020 at 19:19):

And what do you mean by overlapping, btw? Pattern matches are disjoint by definition, no? The second case only applies, if the first one didn't, I.e. we could even have a new information in the local context, saying that it is not compatible with the any of the previous cases.

Mario Carneiro (Jul 04 2020 at 19:20):

Maybe that's the way you write the branches of the definition, but in the equations they all have to stand alone. That's why there are so many case splits

Mario Carneiro (Jul 04 2020 at 19:21):

def foo :     
| 0 n := 0
| m 0 := 1
| m n := 2

Obviously we can't have the equation foo.equations_1 : foo m 0 = 1 here

Mario Carneiro (Jul 04 2020 at 19:21):

so lean has to case split it and you get foo (m+1) 0 = 1 instead

Mario Carneiro (Jul 04 2020 at 19:23):

Unlike Coq's Program Definition, lean does not have definitions where the later branches come with preconditions saying that the earlier branches don't match

Daniel Fabian (Jul 04 2020 at 19:25):

Ah... that's different from my mental model coming from F#. I always think of them as the next branch having a proof of the previous not having matched.

Mario Carneiro (Jul 04 2020 at 19:25):

the only way lean can express that property is by case splitting until it's true by disjointness of constructors

Daniel Fabian (Jul 04 2020 at 19:26):

something like forall m, (forall n, (0, n) <> (m, 0)) -> rest wouldn't work?

Mario Carneiro (Jul 04 2020 at 19:27):

Which means that when you have a definition like

def foo : node -> A
| (node.tree color.red (node.tree color.red a x b) y c) := ...
| _ := ...

it generates a ton of case splits. If you do it twice it generates a square ton

Mario Carneiro (Jul 04 2020 at 19:27):

I'm just saying that this isn't how the equation compiler works

Daniel Fabian (Jul 04 2020 at 19:27):

ok fair enough.

Mario Carneiro (Jul 04 2020 at 19:28):

I can imagine a variation on the equation compiler that generates those, although it might be difficult to work with

Daniel Fabian (Jul 04 2020 at 19:28):

does that mean I'm just simply restricted how I have to write the definition, or could that in general be solved using tactics?

Mario Carneiro (Jul 04 2020 at 19:28):

One thing you can do is to split the definition into smaller parts

Daniel Fabian (Jul 04 2020 at 19:28):

it's a bit like ite vs. dite, isn't it?

Mario Carneiro (Jul 04 2020 at 19:29):

Done properly, it can eliminate the superfluous case splits (so you can get the proofs by refl back when possible)

Mario Carneiro (Jul 04 2020 at 19:30):

You can also very precisely specify the case split tree using by using induction and cases to write the definition

Daniel Fabian (Jul 04 2020 at 19:30):

right, sure, but then I do it by hand, right?

Daniel Fabian (Jul 04 2020 at 19:31):

as in during the proof, right?

Mario Carneiro (Jul 04 2020 at 19:31):

you do the same case splits in the proof

Mario Carneiro (Jul 04 2020 at 19:31):

you have to do that anyway

Daniel Fabian (Jul 04 2020 at 19:33):

I wonder if it could be written almost like a relation, with 5 separate constructors.

Daniel Fabian (Jul 04 2020 at 19:33):

then you'd only have 5 cases

Mario Carneiro (Jul 04 2020 at 19:33):

Indeed, I recommend it

Mario Carneiro (Jul 04 2020 at 19:33):

That's what ins_broken is accomplishing in my code snippet

Daniel Fabian (Jul 04 2020 at 19:35):

Do we have any control about how the equation compiler does its work? perhaps some attributes or something?

Daniel Fabian (Jul 04 2020 at 19:35):

maybe help it out a little?

Mario Carneiro (Jul 04 2020 at 19:35):

not really

Mario Carneiro (Jul 04 2020 at 19:35):

You can help it out by writing the equations in a particular way

Daniel Fabian (Jul 04 2020 at 19:36):

is there a white paper that describes the algo, so I could know what is and what isn't good?

Mario Carneiro (Jul 04 2020 at 19:37):

I don't think so, but equation compilation has a literature from systems like Agda already

Daniel Fabian (Jul 04 2020 at 19:38):

ah, ok, so I'll have to look it up there.

Mario Carneiro (Jul 04 2020 at 19:39):

Generally speaking, superfluous case splits happen in examples like foo : ℕ → ℕ → ℕ above, which will probably have 4 cases instead of 3 because it decides which things need splitting at the top level and then splits all of them

Daniel Fabian (Jul 04 2020 at 19:39):

I do have to say, I was really hoping to write down the definition as naturally as possible and use tactics to prove the cases. Having to change the definition so the compiler is happier, is a bit disappointing.

Mario Carneiro (Jul 04 2020 at 19:40):

You can still do that, if you don't care about the 100 cases thing

Mario Carneiro (Jul 04 2020 at 19:40):

you should prove the equations you want regardless

Daniel Fabian (Jul 04 2020 at 19:40):

I really don't care too much about the cases, but unfortunately unfold doesn't help.

Daniel Fabian (Jul 04 2020 at 19:40):

so I literally don't see what cases need to be considered.

Daniel Fabian (Jul 04 2020 at 19:41):

even a tactic that would do something like "here's a definition, you introduce all the cases as per definition and create goals, please" would pretty much fix the problem.

Daniel Fabian (Jul 04 2020 at 19:42):

it's that I can't see what cases are missing that makes is to hard to work with.

Mario Carneiro (Jul 04 2020 at 19:42):

I think we could have a split_unfold foo tactic that attempts to unfold foo and do case splits on any variables that block unfolding

Daniel Fabian (Jul 04 2020 at 19:43):

that would be amazing, I tried doing something like that, but so far not too much luck.

Daniel Fabian (Jul 04 2020 at 19:43):

I can unfold the goal a bit, but then I just get a constant "balance".

Daniel Fabian (Jul 04 2020 at 19:43):

not the lambda that it corresponds to.

Mario Carneiro (Jul 04 2020 at 19:44):

It doesn't correspond to a lambda that would be helpful for you anyway

Mario Carneiro (Jul 04 2020 at 19:45):

I guess in Coq matches get compiled to primitive matches so this kind of display makes sense?

Daniel Fabian (Jul 04 2020 at 19:47):

it does this:

to_set l x \/ x = v \/ to_set r x <->
to_set
  match c with
  | red => tree c l v r
  | black =>
      match l with
      | leaf =>
          match r with
          | leaf => tree c l v r
          | tree color0 b y d =>
              match color0 with
              | red =>
                  match b with
                  | leaf => match d with
                            | leaf => tree c l v r
                            | tree color1 c0 z d0 => match color1 with
                                                     | red => tree red (tree black l v b) y (tree black c0 z d0)
                                                     | black => tree c l v r
                                                     end
                            end
                  | tree color1 b0 y0 c0 =>
                      match color1 with
                      | red => tree red (tree black l v b0) y0 (tree black c0 y d)
                      | black => match d with
                                 | leaf => tree c l v r
                                 | tree color2 c1 z d0 => match color2 with
                                                          | red => tree red (tree black l v b) y (tree black c1 z d0)
                                                          | black => tree c l v r
                                                          end
                                 end
                      end
                  end
              | black => tree c l v r
              end
          end
      | tree color0 a x0 c0 =>
          match color0 with
          | red =>
      (and more)
              ```

Daniel Fabian (Jul 04 2020 at 19:47):

so you immediately see what do and you can make a tactic destructing the top match expr.

Daniel Fabian (Jul 04 2020 at 19:48):

my attempt so far was this:

meta def find_matching (e : expr) : list name  tactic expr
| [] := tactic.failed
| (H::Hs) := do
    t  tactic.resolve_name H,
    t  tactic.to_expr t,
    t  tactic.infer_type t,
    tactic.unify e t tactic.transparency.all >> return t

meta def trace_goal : tactic unit :=
do  goal  tactic.target,
    names  tactic.get_eqn_lemmas_for ff `balance,
    find_matching goal names >>= tactic.trace

Daniel Fabian (Jul 04 2020 at 19:49):

It finds the lemmas. And I was hoping to instantiate each one of them like apply and if it doesn't unify with the current context, kill off the lemma for the current path.

Mario Carneiro (Jul 04 2020 at 19:49):

If you delta balance balance._main you can see something pretty similar

Daniel Fabian (Jul 04 2020 at 19:49):

so you'd get fewer and fewer lemmas work.

Daniel Fabian (Jul 04 2020 at 19:50):

ya, looking good, now we'd need to make the root thing go away.

Mario Carneiro (Jul 04 2020 at 19:51):

root thing?

Daniel Fabian (Jul 04 2020 at 19:51):

because color.black.cases_on should become just one branch, right?

Mario Carneiro (Jul 04 2020 at 19:51):

dsimp

Daniel Fabian (Jul 04 2020 at 19:53):

I see. so now after dsimp it I think looks like this: node.rec (blablabla) c. so the tactic would now have to go and split c, correct?

Mario Carneiro (Jul 04 2020 at 19:55):

yea

Mario Carneiro (Jul 04 2020 at 19:57):

that's probably the best way to do it: first delta then dsimp until you get stuck, then call cases to split on all stuck recursors; this generates the split tree, and then you can restart from the beginning with the split tree and then use regular rw to apply the equation lemmas

Mario Carneiro (Jul 04 2020 at 19:58):

it would be nice if the equation compiler reported the split tree though, that would save a lot of trouble

Daniel Fabian (Jul 04 2020 at 20:01):

I've got this beautiful pattern:

example :  a (x : α) b y c z d, balance color.black (node.tree color.red (node.tree color.red a x b) y c) z d =
    node.tree color.red (node.tree color.black a x b) y (node.tree color.black c z d) :=
begin
    -- intros, refl, -- should work
    -- rintros a x b y (c|⟨cc|cc, cl, cv, cr⟩) z (d|⟨dc|dc, dl|⟨dlc|dlc, dll, dlv, dlr⟩, dv, dr|⟨drc|drc, drl, drv, drr⟩⟩),
    -- all_goals {refl}
    intros,
    delta balance balance._main,
    dsimp,
    cases c; refl <|> dsimp,
    cases d; refl <|> dsimp,
    cases d_color; refl <|> dsimp,
    cases d_left; refl <|> dsimp,
    cases d_right; refl <|> dsimp,
    cases d_right_color; refl <|> dsimp,
    cases d_left_color; refl <|> dsimp,
    cases d_right; refl <|> dsimp,
    cases d_right_color; refl <|> dsimp,
    cases d_right; refl <|> dsimp,
    cases d_right_color; refl <|> dsimp,
    cases c_color; refl <|> dsimp,
    cases d; refl <|> dsimp,
    cases d_color; refl <|> dsimp,
end

Daniel Fabian (Jul 04 2020 at 20:01):

very repetitive, but it's entirely obvious what to do next.

Daniel Fabian (Jul 04 2020 at 20:02):

and it's directly driven by the goal, no guessing.

Daniel Fabian (Jul 04 2020 at 20:03):

so I think, we could pattern match %%x.cases_on _ c and then cases c or something like that.

Simon Hudon (Jul 04 2020 at 20:04):

You can also try casesm. First you try refl, then casesm color, try refl again then try casesm on trees but without repetition

Mario Carneiro (Jul 04 2020 at 20:05):

the problem with that is that the type here is recursive, so you can end up splitting forever if you don't consult the definition

Daniel Fabian (Jul 04 2020 at 20:06):

I kind of like the idea of finding occurrences of rec_on or cases_on and unfold them.

Daniel Fabian (Jul 04 2020 at 20:07):

that should be fairly safe and make progress, because it shrinks the lambda.

Mario Carneiro (Jul 04 2020 at 20:08):

I would like the split_unfold tactic to not split rec terms that were manually written (not part of the equation compiler splitting) though

Daniel Fabian (Jul 04 2020 at 20:09):

no problem

Daniel Fabian (Jul 04 2020 at 20:09):

id_rhs is a special term added by the compiler.

Daniel Fabian (Jul 04 2020 at 20:09):

so we should only split on them.

Daniel Fabian (Jul 04 2020 at 20:09):

just an identity function for this very purpose; to drive case splits.

Mario Carneiro (Jul 04 2020 at 20:10):

Looking at the term, it looks like id_rhs protects individual branches of the definition

Mario Carneiro (Jul 04 2020 at 20:10):

I don't know if dsimp simplifies it

Daniel Fabian (Jul 04 2020 at 20:11):

so that really means, to find the tree, we could probably do a tree walk over the expr, collecting id_rhs

Daniel Fabian (Jul 04 2020 at 20:11):

we don't need to dsimp at all, I don't think.

Daniel Fabian (Jul 04 2020 at 20:12):

the ._main is normalized after all.

Daniel Fabian (Jul 04 2020 at 20:12):

so we could walk that definition directly to get the tree.

Daniel Fabian (Jul 04 2020 at 20:13):

and dsimp seems to keep it in place btw.

Daniel Fabian (Jul 04 2020 at 20:13):

I run dsimp at every step and still see it. It goes away once the rec and cases_on are invoked.

Mario Carneiro (Jul 04 2020 at 20:14):

oh, that sounds more like it's not being touched because it is in a dependent position

Daniel Fabian (Jul 04 2020 at 20:18):

so I finished the proof using this weird script above, it's very mechanical and super fast it checks in like 200ms.

Simon Hudon (Jul 04 2020 at 20:26):

Cool! Do you want to PR it?

Daniel Fabian (Jul 04 2020 at 20:27):

Well, I don't have a tactic for it. Just verified that this mechanical thing works.

Daniel Fabian (Jul 04 2020 at 20:27):

At the moment, though each line still has a different identifier.

Daniel Fabian (Jul 04 2020 at 20:27):

I wouldn't mind a bit of help with the tactic, but then, yes sure, I'll make a PR.

Mario Carneiro (Jul 04 2020 at 20:28):

see if you can derive the name from the goal, so that your tactic is repeated verbatim n times

Mario Carneiro (Jul 04 2020 at 20:28):

even if it's something hacky like ~ last (get_app_args target)

Daniel Fabian (Jul 04 2020 at 20:29):

yep, I think it's a lambda application each time.

Daniel Fabian (Jul 04 2020 at 20:29):

so if I pattern match the expr, that should work.

Simon Hudon (Jul 04 2020 at 20:30):

Indeed

Simon Hudon (Jul 04 2020 at 20:33):

@Mario Carneiro consider using app_arg instead of last (get_app_args _), it's faster and simpler

Mario Carneiro (Jul 04 2020 at 20:34):

The ~ is because that's totally made up

Mario Carneiro (Jul 04 2020 at 20:35):

where's a handwavy emoji when you need one

Daniel Fabian (Jul 04 2020 at 20:47):

hehe

Daniel Fabian (Jul 04 2020 at 20:51):

horrible, unelegant, not pretty and all, but effective:

meta def get_arg : expr  tactic expr
| (expr.app x y) := pure y
| _ := tactic.failed

meta def split_unfold :=
do  `(%%l = %%r)  tactic.target,
    head  get_arg l,
    tactic.cases head



example :  a (x : α) b y c z d, balance color.black (node.tree color.red (node.tree color.red a x b) y c) z d =
    node.tree color.red (node.tree color.black a x b) y (node.tree color.black c z d) :=
begin
    -- intros, refl, -- should work
    -- rintros a x b y (c|⟨cc|cc, cl, cv, cr⟩) z (d|⟨dc|dc, dl|⟨dlc|dlc, dll, dlv, dlr⟩, dv, dr|⟨drc|drc, drl, drv, drr⟩⟩),
    -- all_goals {refl}
    intros,
    delta balance,
    dsimp,
    repeat {split_unfold, all_goals {refl <|> dsimp <|> skip}},
end

Mario Carneiro (Jul 04 2020 at 20:57):

get_arg should already exist

Mario Carneiro (Jul 04 2020 at 20:57):

src#expr.app_arg

Daniel Fabian (Jul 04 2020 at 20:58):

the kind of experience, I'd want would be something like you have the normal local context, where balance is kept as a constant, nothing fancy. Then you'd call unfold_split balance with lc ll lv lr. It would do the delta balance, dsimp internally to find the next arg. Then throw away the state, and simply do the split on said identifier.

Mario Carneiro (Jul 04 2020 at 20:59):

I was thinking that split_unfold would split as many times as necessary to apply unfold balance once in every subgoal

Daniel Fabian (Jul 04 2020 at 20:59):

ya, could do that too.

Daniel Fabian (Jul 04 2020 at 21:00):

then we need the full tree.

Daniel Fabian (Jul 04 2020 at 21:00):

either way, it'd be nice, if we don't do the delta in the user-visible state.

Mario Carneiro (Jul 04 2020 at 21:00):

yeah, it would do your repeat {split_unfold, all_goals ...} loop internally

Daniel Fabian (Jul 04 2020 at 21:00):

yep.

Daniel Fabian (Jul 04 2020 at 21:22):

actually, I don't think we should be unfolding the apps like this... The equational lemmas tell us exactly what cases are possible.

Daniel Fabian (Jul 04 2020 at 21:23):

If we go and unfold apps wildly, we may end up unfolding inside of user code which is a bad idea.

Daniel Fabian (Jul 04 2020 at 21:23):

so we should really use the definitional lemmas to infer the split tree. and then use the local context to decide which ones are missing

Daniel Fabian (Jul 04 2020 at 21:24):

I'm pretty sure that's safer, don't you think?

Simon Hudon (Jul 04 2020 at 21:26):

I don't think using the equational lemmas is going to be easy. In comparison, doing case for the calls to rec in the user code should be a small enough issue. If you use match in your code, the tactic won't see the rec calls so you really have to invoke them directly. And when you do, you can have an option to say "do one less than you otherwise would"

Simon Hudon (Jul 04 2020 at 21:28):

One thing you can use the equation tree for is check if you have done enough cases. If you can find an equation that matches your current state, you should stop going into more case analysis.

Daniel Fabian (Jul 04 2020 at 21:29):

ah, the argument being, that we FIRST unfold the arguments and only then things in the body.

Daniel Fabian (Jul 04 2020 at 21:29):

so by noticing, we've done enough cases, we can use that as the exit criterion.

Simon Hudon (Jul 04 2020 at 21:29):

exactly

Daniel Fabian (Jul 04 2020 at 21:30):

in that case the raw delta unfolding is not really a problem, because we will consume the whole definition.

Simon Hudon (Jul 04 2020 at 21:31):

exactly

Simon Hudon (Jul 04 2020 at 21:34):

I could see using split_unfold balance as one would use cases. We can tag the resulting goals and do a proof by case or do split_unfold balance; finish (with finish or other tactic)

Daniel Fabian (Jul 04 2020 at 21:37):

I don't think so.

Daniel Fabian (Jul 04 2020 at 21:37):

In case you DO want to tag, rcases does the trick.

Daniel Fabian (Jul 04 2020 at 21:37):

split_unfold is specifically about not having to do that.

Daniel Fabian (Jul 04 2020 at 21:38):

rcases already does the cartesian explosion for you.

Daniel Fabian (Jul 04 2020 at 21:39):

but I really can't see how you can use inference and have to provide explicit identifiers at the same time.

Simon Hudon (Jul 04 2020 at 21:40):

Right but it does it with a pattern that you provide it. split_unfold detects that pattern. If you don't want to do the proof by case you don't have to. If you do, it's easy to enable

Daniel Fabian (Jul 04 2020 at 21:41):

But what would be the identifiers you provide?

Simon Hudon (Jul 04 2020 at 21:41):

There's a pattern that you can follow:

cases xs,
case list.nil :
{ /- proof when xs is nil -/ },
case list.cons : x xs
{ /- proof when xs is cons -/ },

That's actually what I refer to as "tagging the goals"

Daniel Fabian (Jul 04 2020 at 21:41):

AH, that you mean.

Daniel Fabian (Jul 04 2020 at 21:41):

i thought you meant naming the locals.

Simon Hudon (Jul 04 2020 at 21:42):

No, like you say, for that rcases does a good job already.

Daniel Fabian (Jul 04 2020 at 21:43):

I'm really excited now, once we have this as a tactic, proving large cases will be much nicer :D

Simon Hudon (Jul 04 2020 at 21:44):

:) me too

Simon Hudon (Jul 04 2020 at 23:32):

Have you considered what you want to do if balance occurs more than once in the goal?

Daniel Fabian (Jul 04 2020 at 23:35):

Ya, I'm just walking around the house thinking about it.

Daniel Fabian (Jul 04 2020 at 23:36):

Let alone under quantifiers.

Daniel Fabian (Jul 04 2020 at 23:36):

I'm tempted to change the interface a bit

Daniel Fabian (Jul 04 2020 at 23:37):

how about you'd unfold_split balance color.black (node.tree col left v right)...

Daniel Fabian (Jul 04 2020 at 23:37):

I.e. rather than trying to do the cartesian explosion that'd be needed for fully solving the problem for multiple occurrences, we'd ask the user to give us an expression

Daniel Fabian (Jul 04 2020 at 23:38):

and we'd unfold to the point where the definition goes away in that case.

Simon Hudon (Jul 04 2020 at 23:45):

That's one way. You could also make up an attribute to tag balance with and generate an inductive relation for it, to encode the graph of the function. If I simplify, for definition f, you'd have the property that f x = y <-> r x y

Simon Hudon (Jul 04 2020 at 23:47):

Then, if you try to prove P (f x), you can replace it with ∀ y, r x y -> P y and then doing cases on the r x y gives you all the cases that you care about

Daniel Fabian (Jul 04 2020 at 23:48):

Ah, so rather than doing induction on x, we'd do induction over r, yes?

Simon Hudon (Jul 04 2020 at 23:48):

that's right

Daniel Fabian (Jul 04 2020 at 23:48):

ya, that looks really neat, actually.

Daniel Fabian (Jul 04 2020 at 23:49):

So how could I have the compiler spit out the definitions?

Simon Hudon (Jul 04 2020 at 23:51):

Go in tactic.ext, you'll see that, when you apply ext to a structure definition, it generates theorems. In your case, instead, you'd generate an inductive type

Daniel Fabian (Jul 04 2020 at 23:51):

Ok, I'll look, thx

Daniel Fabian (Jul 04 2020 at 23:59):

do I understand correctly that the ext attribute is only used inside of a call to the ext tactic and gets created on the fly? Or is there somehow a separate code-gen phase (didn't read the code in detail yet)

Daniel Fabian (Jul 05 2020 at 00:02):

oh, I see. It's the after_set of the user_attribute that generates the code.

Simon Hudon (Jul 05 2020 at 00:05):

That's right. You tag your structures and lemmas with ext. For structures, it creates an extensionality lemma and puts the ext attribute on it. For lemmas, it just puts them in a table of extensionality lemmas indexed by the type they apply to

Simon Hudon (Jul 05 2020 at 00:06):

The ext tactic reads that table and selects the right lemmas when needed

Daniel Fabian (Jul 05 2020 at 00:35):

wahoo, I can now generate a relation :). It's a dumb one, just added the identity function, but my attribute works :D

Daniel Fabian (Jul 05 2020 at 03:51):

I think the idea of generating additional lemma or the graph run into the same kind of problem. By the time we get access to the expr that corresponds to the definition of the function, the pattern match is lost. So we again have to reconstruct the decision procedure from the lambda. And the decision procedure is NOT what is written in the matter match, in fact that's the reason why refl doesn't work.

Simon Hudon (Jul 05 2020 at 04:11):

The usefulness of that lemma is not to do away with the machinery that we discussed but to separate two parts of the problem. When generating the lemmas, you have a simple, controlled proof state. That makes the case distinction logic less susceptible to breaking. When we use them, the proof state maybe be less clean but the lemmas should present a nice interface

Daniel Fabian (Jul 05 2020 at 04:13):

yes, that makes a lot of sense

Daniel Fabian (Jul 05 2020 at 04:13):

and I think I might a way how to reconstruct the "pattern match" cases from the lemmas.

Daniel Fabian (Jul 05 2020 at 04:14):

In one of my projects I designed a data structure that creates a set based on cartesian products, as in it detects when a particular subset is a cartesian product.

Daniel Fabian (Jul 05 2020 at 04:15):

this mechanism could be used to figure out what combinations of equations form a subset, that is a cartesian product

Daniel Fabian (Jul 05 2020 at 04:16):

and then when it is a cartesian product, you can universally quantify instead of the distinction by cases.

Simon Hudon (Jul 05 2020 at 04:16):

From the lemmas, I think that should be more straightforward than that.

Simon Hudon (Jul 05 2020 at 04:16):

Then, if you try to prove P (f x), you can replace it with ∀ y, r x y -> P y and then doing cases on the r x y gives you all the cases that you care about

Simon Hudon (Jul 05 2020 at 04:17):

when you have h : r x y, doing cases h should give you the perfect case distinction

Daniel Fabian (Jul 05 2020 at 04:18):

right, but what's the structure of r? just a single inductive with 122 cases?

Daniel Fabian (Jul 05 2020 at 04:19):

I thought we would want it to be compactified.

Simon Hudon (Jul 05 2020 at 04:19):

You can put that in a derived lemma of the shape:

@[elab_as_eliminator]
lemma balance_rec (h :  y, r x y -> P y) : P (f x) := ...

Simon Hudon (Jul 05 2020 at 04:19):

No, r would just have the few equations that you care about

Daniel Fabian (Jul 05 2020 at 04:20):

exactly, but the difficulty is to generate r from the definition.

Daniel Fabian (Jul 05 2020 at 04:20):

Or were you thinking to hand-write r?

Simon Hudon (Jul 05 2020 at 04:20):

No, I see that I overlooked a difficulty, you're right

Daniel Fabian (Jul 05 2020 at 04:21):

it boils down to finding a way how to compress the number of cases.

Simon Hudon (Jul 05 2020 at 04:21):

You'd still need to filter the cases.

Daniel Fabian (Jul 05 2020 at 04:21):

once we have that, the equation lemmas are fine

Daniel Fabian (Jul 05 2020 at 04:21):

in fact = is just such a relation after all

Simon Hudon (Jul 05 2020 at 04:22):

That's true (on both accounts)

Simon Hudon (Jul 05 2020 at 04:22):

When generating the relation, we can take care of the case filtering there

Daniel Fabian (Jul 05 2020 at 04:23):

right. Or we can generate some augmented helper cases for the equation lemmas.

Daniel Fabian (Jul 05 2020 at 04:23):

something like case_1.

Daniel Fabian (Jul 05 2020 at 04:23):

then you'd apply case_1.

Daniel Fabian (Jul 05 2020 at 04:23):

and it's pretty much exactly your forall y, r x y -> P y

Simon Hudon (Jul 05 2020 at 04:23):

I'd have to see it to judge

Daniel Fabian (Jul 05 2020 at 04:24):

first I'd need this mechanism of finding what case splits were unnecessary.

Simon Hudon (Jul 05 2020 at 04:25):

If you had access to the specification, that would be easier. Maybe we can add that to Lean's API

Daniel Fabian (Jul 05 2020 at 04:25):

ya, having the spec would be a lot better, indeed.

Daniel Fabian (Jul 05 2020 at 04:26):

plus it'd give a canonical way for building r.

Daniel Fabian (Jul 05 2020 at 04:26):

And we could make r even better like this:

Daniel Fabian (Jul 05 2020 at 04:27):

inductive r
| case_1 : T -> r
| case_2 : ¬T -> U -> r
| case_3 : ¬T -> ¬U -> V -> r

Daniel Fabian (Jul 05 2020 at 04:28):

this kind of thing could make proofs potentially easier in some cases.

Daniel Fabian (Jul 05 2020 at 04:29):

who parses the spec, btw? Is that written in c++ or lean?

Simon Hudon (Jul 05 2020 at 04:33):

That's written in C++. That part of the code is fresh in my mind. I've been playing with the equation compiler in the last couple of months. It might be good to see if we can make the feature without this addition though

Daniel Fabian (Jul 05 2020 at 04:39):

I could port my product set, potentially.

Daniel Fabian (Jul 05 2020 at 04:40):

That thing would at least figure out what pairs of variables form cartesian products.

Daniel Fabian (Jul 05 2020 at 04:40):

or maybe at least the core of that inference.

Daniel Fabian (Jul 05 2020 at 04:41):

The rule is, that a union of two cartesian products is a cartesian product iff all but one dimensions agree.

Daniel Fabian (Jul 05 2020 at 04:42):

so if each variable is a dimension, and each constructor a point, then you can think of each case as a bunch of points in an n-dimensional space.

Daniel Fabian (Jul 05 2020 at 04:43):

and this inference allows you to merge things into bigger and bigger "blocks" which translates into more and more coarse patterns.

Simon Hudon (Jul 05 2020 at 04:48):

That could be useful. You might want to start with more basic stuff and build up to it

Daniel Fabian (Jul 05 2020 at 04:50):

Well, the product set is really just that. A representation of a set. But unlike a hash set or a binary search tree, is has a notion of dense and sparse and uses this to compress the representation.

Daniel Fabian (Jul 05 2020 at 04:51):

you could think of it a bit like n-dimensional run-length encoding.

Daniel Fabian (Jul 05 2020 at 04:51):

but it can be used for this kind of task where you need to figure out if you are "dense" over a region. In fact that was the very purpose we built it in the other project.

Daniel Fabian (Jul 05 2020 at 04:52):

As for making a simpler cases, I really know what, to be honest.

Daniel Fabian (Jul 05 2020 at 04:52):

My experience with lean is overall really, really nice.

Daniel Fabian (Jul 05 2020 at 04:53):

It's just that it completely falls apart in case of these large-number case-splits.

Daniel Fabian (Jul 05 2020 at 04:53):

So working on fixing that is extremely enticing, tbh.

Simon Hudon (Jul 05 2020 at 04:54):

Excellent :) It sounds like we'll get some nice automation to work with

Daniel Fabian (Jul 05 2020 at 04:55):

let me port the product set then. It's gonna take me a few days, but it's a purely functional persistent datastructure.

Daniel Fabian (Jul 05 2020 at 04:55):

so it shouldn't be too bad.

Simon Hudon (Jul 05 2020 at 04:55):

For a more verification condition style of verification (a la F*), I built something that could be considered a further extension. It creates lemma stubs for you

Daniel Fabian (Jul 05 2020 at 04:55):

My original version is in typescript. So lean should have no trouble representing it.

Daniel Fabian (Jul 05 2020 at 04:56):

and I was actually making a point to design it very flexibly. Thinking as inductive branches as exactly the kind of thing we might want.

Simon Hudon (Jul 05 2020 at 04:57):

Cool

Daniel Fabian (Jul 05 2020 at 04:57):

I'll be in touch ;-)

Simon Hudon (Jul 05 2020 at 04:58):

Please do

Daniel Fabian (Jul 05 2020 at 04:58):

I actually have a formal proof of the cartesian product properties in lean (was an exercise when I was learning)

Daniel Fabian (Jul 06 2020 at 01:05):

I've got the first layer coded up. This is the bsp set. The idea is, that you provide a way of breaking the space into disjoint regions (however you want, e.g. one constructor at a time, or something like that). And then the bsp set lets you do efficient set operations like union, intersection, set difference and complement.

Daniel Fabian (Jul 06 2020 at 01:05):

https://gist.github.com/DanielFabian/9e1013582f7ce6e5563f204708a82e5d

Daniel Fabian (Jul 06 2020 at 01:06):

the next layer would be the product set. That's a multidimensional thing that lets us find these regions that form cartesian products. and they'd allow us to find an efficient handling of all the cases.

Daniel Fabian (Jul 06 2020 at 01:07):

One challenge I see, though, is that even once we have these regions, unification of types is not really straightforward...

Daniel Fabian (Jul 06 2020 at 01:12):

consider a case like this:

 (left : sparse_node), combine_children (sparse left) empty = sparse (left.left_sparse empty)
 (left : sparse_node), combine_children (sparse left) dense = sparse (left.left_sparse dense)
 (left root : sparse_node), combine_children (sparse left) (sparse root) = sparse (left.left_sparse (sparse root))

these three right-hand sides are supposed to be the same, modulo the case split empty | dense | sparse root. And we have to somehow reconstruct that based solely on the right-hand sides. But the problem is that the right-hand sides have all cases mixed together, so if there were multiple occurrences of empty, say, how would we know which one to turn into the forall case?

Daniel Fabian (Jul 06 2020 at 01:13):

this here, again suggests that we might be better off having access to the spec, somehow.

Simon Hudon (Jul 06 2020 at 01:30):

I think this should not be your first focus. Using a more naive data structure first should help you understand the API better . It might even be that a naive data structure could provide sufficient performances. The size of the data structures is often not very big.

Daniel Fabian (Jul 06 2020 at 01:32):

One problem I see, is that the API I saw (expressions, declarations, theorems, etc.) are all essentially just lambda expressions.

Daniel Fabian (Jul 06 2020 at 01:32):

but the high-level intent is lost.

Daniel Fabian (Jul 06 2020 at 01:33):

It's almost like trying to reconstruct the meaning of a program from assembly

Simon Hudon (Jul 06 2020 at 01:36):

You'd be surprised how much can be done with nothing but terms. That's because you can restrict the shape of the terms that you consider

Daniel Fabian (Jul 06 2020 at 01:39):

I do see that cases_on could be potentially used to reconstruct pattern matches.

Simon Hudon (Jul 06 2020 at 01:41):

With cases_on or rec_on, you can verify that it's a recursor, you can check how many parameters and indices the type has and then select which argument is the subject

Daniel Fabian (Jul 06 2020 at 01:42):

sure, we get the arity from cases_on

Daniel Fabian (Jul 06 2020 at 01:42):

so that's the destructed things in the pattern match.

Daniel Fabian (Jul 06 2020 at 01:42):

but how do you unify the resulting terms?

Daniel Fabian (Jul 06 2020 at 01:42):

when they are clearly not the same.

Daniel Fabian (Jul 06 2020 at 01:44):

even the most trivial case is not quite straight-forward: fun x, x for natural numbers could be something like nat.cases_on zero (fun x, succ x)

Simon Hudon (Jul 06 2020 at 01:45):

Consider this type of sum.cases_on : Π α β (C : sum α β → Sort u), sum α β → (Π x, C (sum.inl x)) → (Π x, C (sum.inr x)) : Π x : sum α β, C x, you see that the first three arguments are not the branches and not the subject. That's why you need to use the API to get more meta information

Simon Hudon (Jul 06 2020 at 01:45):

Do you have a concrete example of where the problem shows up?

Daniel Fabian (Jul 06 2020 at 01:47):

ya, sure. e.g. the generated functions for my own pattern match functions I wrote for the bsp set.

Daniel Fabian (Jul 06 2020 at 01:47):

the three quoted lines above.

Daniel Fabian (Jul 06 2020 at 01:47):

they all correspond to sparse x

Daniel Fabian (Jul 06 2020 at 01:49):

or were you talking about the fact, that not all arguments show up?

Simon Hudon (Jul 06 2020 at 01:49):

That's what I was referring to.

Simon Hudon (Jul 06 2020 at 01:50):

For sparse node, is it supposed to be an example or is it a data structure for the tactic?

Daniel Fabian (Jul 06 2020 at 01:51):

Well, I ported the data structure, but the example is really showing the kind of thing, I don't really know how to address.

Daniel Fabian (Jul 06 2020 at 01:51):

Pretty much all pattern-match defined functions have these equation lemmas.

Daniel Fabian (Jul 06 2020 at 01:52):

and their rhs would very often be the same, module case split.

Daniel Fabian (Jul 06 2020 at 01:52):

and the tactic we want is meant to somehow unify the cases that are same.

Daniel Fabian (Jul 06 2020 at 01:52):

because that's how you get the number of cases down to the number of lines in the pattern match.

Daniel Fabian (Jul 06 2020 at 01:53):

after all the RB-tree example has something like 5 cases in the source code, but 122 equations.

Daniel Fabian (Jul 06 2020 at 01:53):

that means you have somthing like 30 equations each all coming from the same exact term in the source code.

Daniel Fabian (Jul 06 2020 at 01:53):

I.e. they are identical modulo the case split.

Daniel Fabian (Jul 06 2020 at 01:54):

but the substitution of terms is not an injective operation.

Daniel Fabian (Jul 06 2020 at 01:56):

for now I haven't even tried doing more complicated cases like dependent pattern matching or recursion, just the simplest possible cases of straight, non-recursive, non-dependent pattern matching.

Simon Hudon (Jul 06 2020 at 02:02):

Ok so the plan so far is:

  1. generate an inductive relation
  2. generate the equivalence lemma
  3. automate the application of the lemma.

Where would your issue become relevant?

Daniel Fabian (Jul 06 2020 at 02:05):

before step one. In order to create a relation that's somehow better than the 122 cases, we need a way to find out what cases are the same modulo the case split. And it is on those far fewer cases that we would want to create the inductive definition.

Daniel Fabian (Jul 06 2020 at 02:05):

but since I don't know the spec, the best I can do is try and reconstruct the spec from the already existing equation lemmas.

Daniel Fabian (Jul 06 2020 at 02:06):

on the bright side, the equation lemmas have a standard from: forall x, f x = y.

Daniel Fabian (Jul 06 2020 at 02:07):

but then they get split into the various cases that the compiler happened to have generated.

Daniel Fabian (Jul 06 2020 at 02:08):

so when there's a case split on x, then in y there'd be the substition from x to just the one constructor of the inductive type.

Daniel Fabian (Jul 06 2020 at 02:08):

so now you get something like this:

Simon Hudon (Jul 06 2020 at 02:08):

Ok, yes now I see

Daniel Fabian (Jul 06 2020 at 02:08):

def id x := x for nat turns into id zero = zero and forall n, id (succ n) = succ n.

Daniel Fabian (Jul 06 2020 at 02:09):

and it's really not straight-forward how to unify this.

Simon Hudon (Jul 06 2020 at 02:13):

Let's say you have two branches:

foo (C a b) (E c d) z := E0
foo x (D a b) z := E1

You start by comparing the patterns. The first argument is not broken up then same so you abstract it. You get:

foo x (E c d) z := E0 [C a b := x]
foo x (D a b) z := E1

And you continue until all the patterns are the same, introducing fresh variables when necessary.

foo x v z := E0 [C a b := x][E c d := v]
foo x v z := E1[D a b := v]

Then you can compare the rhs. They should be syntactically equal. Definitionally equal should suffice though.

Daniel Fabian (Jul 06 2020 at 02:16):

what if C a b occurs multiple times?

Daniel Fabian (Jul 06 2020 at 02:20):

I think if we have multiple occurrences we may end up with a term that's syntactically different from the original one, but at least it probably would be def equal.

Simon Hudon (Jul 06 2020 at 02:22):

The variables a and b will be distinct because they all can occur only once

Simon Hudon (Jul 06 2020 at 02:22):

(in the syntax of a pattern)

Daniel Fabian (Jul 06 2020 at 02:25):

I mean something like this: def two_times_plus_one x := 1 + x + x. so we get two_times_plus_one zero = succ zero + succ zero + succ zero and forall n, two_times_plus_one (succ n) = succ zero + succ n + succ n

Daniel Fabian (Jul 06 2020 at 02:25):

how do we tell it's actually the same case?

Daniel Fabian (Jul 06 2020 at 02:26):

I totally can see your procedure work nicely left to right.

Simon Hudon (Jul 06 2020 at 02:33):

I think it would work. If you find an occurrence of succ n, you know you must substitute it because n is a bound variable issue from the pattern succ n. That's the only way it can be introduced. We should even go further and, once succ n has been replaced, we should check that n is not still free in the right hand side

Daniel Fabian (Jul 06 2020 at 02:41):

in the second case, we'd introduce 1 + u + u (u being fresh). In the first case, we'd do v + v + v. Then we need to use show that forall n, u = succ n -> 1 + u + u and v = zero, v + v + v implies forall k, 1 + k + k.

Simon Hudon (Jul 06 2020 at 02:44):

You traverse the pattern two at a time so when you introduce a fresh variable in one, it gets introduced in the second as well. But if we didn't traverse them two by two, we could use de Bruijn variables to encode fresh variables. That way, we they are syntactically equal if they have to match

Daniel Fabian (Jul 06 2020 at 02:46):

ok, I'll give it a go. I think I understand the idea now.

Simon Hudon (Jul 06 2020 at 02:55):

Cool. Looking forward to seeing how it turns out

Daniel Fabian (Jul 06 2020 at 19:53):

type mismatch at application
  _eqn_1
term
  α
has type
  Type u_1 : Type (u_1+1)
but is expected to have type
  Sort u_1 : Type u_1
state:
α : Type u_1,
a : node α,
x : α,
b : node α,
y : α,
c : node α,
z : α,
d : node α,
_eqn_6 :
   {α : Sort u_1} (left : node α) (val : α) (right : node α) (x : α) (b : node α),
    balance._main color.black (node.tree color.black left val right) x b =
      node.tree color.black (node.tree color.black left val right) x b,
_eqn_5 :
   {α : Sort u_1} (left : node α) (val : α) (right : node α) (val_1 : α) (right_1 : node α) (x : α)
  (b : node α),
    balance._main color.black (node.tree color.red (node.tree color.black left val right) val_1 right_1) x b =
      node.tree color.black (node.tree color.red (node.tree color.black left val right) val_1 right_1) x b,
_eqn_4 :
   {α : Sort u_1} (a : node α) (x : α) (b : node α) (y : α) (c : node α) (z : α) (d : node α),
    balance._main color.black (node.tree color.red (node.tree color.red a x b) y c) z d =
      node.tree color.red (node.tree color.black a x b) y (node.tree color.black c z d),
_eqn_3 :
   {α : Sort u_1} (val : α) (right : node α) (x : α) (b : node α),
    balance._main color.black (node.tree color.red node.leaf val right) x b =
      node.tree color.black (node.tree color.red node.leaf val right) x b,
_eqn_2 :
   {α : Sort u_1} (x : α) (b : node α),
    balance._main color.black node.leaf x b = node.tree color.black node.leaf x b,
_eqn_1 :
   {α : Sort u_1} (a : node α) (x : α) (b : node α), balance._main color.red a x b = node.tree color.red a x b
 balance color.black (node.tree color.red (node.tree color.red a x b) y c) z d =
    node.tree color.red (node.tree color.black a x b) y (node.tree color.black c z d)

Daniel Fabian (Jul 06 2020 at 19:54):

I've got a first thing, that just pulls in the lemmas into the current context, but I can't really manipulate it.

Daniel Fabian (Jul 06 2020 at 19:54):

Apprently I've got a mess with the universe levels, in particular I can not specialize by alpha.

Daniel Fabian (Jul 06 2020 at 19:56):

And because they are implicit args in the declaration, I don't know how to apply it in the tactic, I tried

``(@%%body alpha)
``(@(%%body) alpha)

Daniel Fabian (Jul 06 2020 at 19:58):

alternatively, if I skip the implicits and instead try to pass a node alpha, then it doesn't see meta vars. Presumably, because it expects the quantified type to live in a lower universe?

Daniel Fabian (Jul 06 2020 at 20:19):

Ok, I've managed to apply the type argument, but looks like it still isn't working. It just moved the universe level problem one level in. I think the root cause is, that the term alpha is a value in the next universe. But I need the sort, how do you do that?

Simon Hudon (Jul 06 2020 at 21:55):

Ok, Type u is the same as Sort (u+1). When you add _eqn_n, you can specify which universe you want to put them in. How do you create them?

Daniel Fabian (Jul 06 2020 at 21:58):

I do this:

meta def my_tac (n : parse ident) : tactic unit :=
do  lemmas  tactic.get_eqn_lemmas_for ff n,
    lemmas  mmap tactic.get_decl lemmas,
    h  get_local ``α,
    lemmas.for_each (λ lem, do
        body  tactic.instantiate_mvars lem.value,
        body  tactic.head_beta $ body.app h,
        «have» (some lem.to_name.components.ilast) none ``(%%body))

Daniel Fabian (Jul 06 2020 at 21:58):

using the beta reduction in the second but last line, it does specialize the alpha away, but I still can't use it, because the theorem lives presumably in a level too low.

Daniel Fabian (Jul 06 2020 at 21:59):

ideally, I'd add the theorems to the local context in a way, though, that I can just apply the local alpha.

Simon Hudon (Jul 06 2020 at 22:00):

Instead of have, you can use note. It takes elaborated terms

Simon Hudon (Jul 06 2020 at 22:01):

Also, instead of using lem.value (which gives you the proof term for the lemmas) use const n ls where n is the name of the lemma and ls is the list of universe parameters that you want to use

Daniel Fabian (Jul 06 2020 at 22:05):

is there a way to figure out what universe the current proof lives in? my type alpha gets a variable u_1 and I need to tie the lemmas to that.

Simon Hudon (Jul 06 2020 at 22:10):

The way universes work, you have universe literals (0, 1, 2, 3), universe parameters (u1, u2, u3) and
universe expressions (e.g. max u (v+1))

Daniel Fabian (Jul 06 2020 at 22:11):

right, that's the level inductive, right?

Simon Hudon (Jul 06 2020 at 22:11):

If alpha : Sort u, the type of alpha is an expression sort l where l is a universe level. You can use it to instantiate the universe parameters of a constant

Simon Hudon (Jul 06 2020 at 22:11):

That's right

Daniel Fabian (Jul 06 2020 at 22:12):

so basically if I get the constant alpha, I know this is meant to be a type, so I'd go ahead and put the theorem one level above or below.

Daniel Fabian (Jul 06 2020 at 22:17):

so hang on, this whole replacing bound variables business, would we do that by just building up more and more complex lambdas that eventually just solve the question by just simple reduction to normal form or are we meant to do reasoning outside and rewriting the terms by hand?

Simon Hudon (Jul 06 2020 at 22:22):

A lot of reasoning can be automated. You should read up on tactic.mk_mapp

Daniel Fabian (Jul 06 2020 at 22:25):

yes, sorry. I'm going through the code base learning the API and all.

Daniel Fabian (Jul 06 2020 at 22:26):

it's a non-trivial amount of API to take in :P

Daniel Fabian (Jul 06 2020 at 22:26):

and I think the automation I'm trying to implement is also fairly sophisticated.

Simon Hudon (Jul 06 2020 at 22:33):

For sure! One step at a time and don't be afraid to take on a much smaller problem first to get acquainted with the whole system

Daniel Fabian (Jul 06 2020 at 22:39):

this is a bit confusing... my local_const alpha : Type u shows up as const 1 [].

Daniel Fabian (Jul 06 2020 at 22:40):

i.e. it says it has not universe params.

Simon Hudon (Jul 06 2020 at 22:43):

What did you do to get there?

Daniel Fabian (Jul 06 2020 at 22:44):

it's this block:

    h  get_local ``α,

    lemmas.for_each (λ lem, do
        let bla := h.local_type,
        tactic.trace bla.to_raw_fmt,
        tactic.trace h.to_raw_fmt

Daniel Fabian (Jul 06 2020 at 22:45):

so h.local_type is const 1 []

Daniel Fabian (Jul 06 2020 at 22:46):

so presumably const 1 [] refers to Type u_1

Daniel Fabian (Jul 06 2020 at 22:47):

I'm trying to extract u_1 so that I can instantiate the theorem at u_1.

Daniel Fabian (Jul 06 2020 at 22:47):

but don't know how to get it from the local context.

Simon Hudon (Jul 06 2020 at 22:48):

const 1 [] is a sign to run. It's the internals leaking through the interface

Simon Hudon (Jul 06 2020 at 22:49):

Instead of local_type, use infer_type

Daniel Fabian (Jul 06 2020 at 22:49):

ok.

Daniel Fabian (Jul 06 2020 at 22:50):

woot, now I see "sort u+1"

Daniel Fabian (Jul 06 2020 at 22:50):

thx

Simon Hudon (Jul 06 2020 at 22:56):

:+1:

Daniel Fabian (Jul 06 2020 at 22:58):

yay, it worked, now I can use the lemmas in the local context.

Simon Hudon (Jul 06 2020 at 22:58):

Nice!

Daniel Fabian (Jul 07 2020 at 02:40):

Ok, so now I've managed to build up a large conjunction of all the cases, so we can go and try unifying things. Here's an example:

state:
α : Type ?,
new_lemma :
  ( {α : Type ?} (a : node α) (x : α) (b : node α), balance color.red a x b = node.tree color.red a x b) 
    ( {α : Type ?} (x : α) (b : node α),
         balance color.black node.leaf x b = node.tree color.black node.leaf x b) 
      ( {α : Type ?} (val : α) (right : node α) (x : α) (b : node α),
           balance color.black (node.tree color.red node.leaf val right) x b =
             node.tree color.black (node.tree color.red node.leaf val right) x b) 
        ( {α : Type ?} (a : node α) (x : α) (b : node α) (y : α) (c : node α) (z : α) (d : node α),
             balance color.black (node.tree color.red (node.tree color.red a x b) y c) z d =
               node.tree color.red (node.tree color.black a x b) y (node.tree color.black c z d)) 
          ( {α : Type ?} (left : node α) (val : α) (right : node α) (val_1 : α) (right_1 : node α) (x : α)
             (b : node α),
               balance color.black (node.tree color.red (node.tree color.black left val right) val_1 right_1) x b =
                 node.tree color.black (node.tree color.red (node.tree color.black left val right) val_1 right_1) x
                   b) 
            ( {α : Type ?} (left : node α) (val : α) (right : node α) (x : α) (b : node α),
                 balance color.black (node.tree color.black left val right) x b =
                   node.tree color.black (node.tree color.black left val right) x b) 
              true
 false

Daniel Fabian (Jul 07 2020 at 02:45):

I still don't see how to unify the cases, though.

  • The universal quantifiers are quite different between the cases
  • And even if we manage to get the rhs to be syntactically (or at least definitionally) the same, it doesn't give us much, yet, because we still don't get the universal quantification.

Daniel Fabian (Jul 07 2020 at 02:49):

in order to prove something like

def id (x : nat) : nat := x + zero

we get lemmas id zero = zero + zero and forall n, id (succ n) = succ n + zero. So in order to get the forall n, id n = n, we have to go ahead and apply the induction lemmas. All the whilst making sure that the rhs be the same P n.

Simon Hudon (Jul 07 2020 at 02:50):

Try and work with two equations at a time. No need for such a big conjunction. If you have two equations, remove the universals so that theirs bound variables become free variables. You should get something like this:

f a b c = RHS0
f d e g = RHS1

that is an equality whose lhs is a function applied to multiple arguments

Daniel Fabian (Jul 07 2020 at 02:50):

the idea of introducing fresh variables, however doesn't create the same P. The inferred P n from the case for zero is P n = n + n. (Because n is known to be zero, whereas the succ case comes up with P n = n + zero.

Simon Hudon (Jul 07 2020 at 02:53):

That's ok. I think you may have to accept that results may sometimes be imperfect. That can be improved over time but let's go with something simple first

Daniel Fabian (Jul 07 2020 at 02:53):

(I did the conjunction mostly so I can look at the term) Can you even add a term with free variables into the context?

Simon Hudon (Jul 07 2020 at 02:55):

Yes but don't always need to add your terms to the context. Oh and by free variables, I mean local_const not var

Daniel Fabian (Jul 07 2020 at 02:56):

ok sure, that makes sense as the final form of how it's supposed to run. For now, I use the context as a debugging tool.

Daniel Fabian (Jul 07 2020 at 02:56):

it does pretty-printing and I can use tactics to massage terms as needed to try things out.

Simon Hudon (Jul 07 2020 at 02:57):

That's reasonable

Daniel Fabian (Jul 07 2020 at 02:58):

but we do need to apply the induction principle, right?

Daniel Fabian (Jul 07 2020 at 02:58):

or am I missing something?

Simon Hudon (Jul 07 2020 at 03:01):

The induction principle is not the next thing though

Daniel Fabian (Jul 07 2020 at 03:02):

sure, now we need to go and turn subterms into def_eq pairs.

Daniel Fabian (Jul 07 2020 at 03:02):

(one nice things about doing all of them at once, btw, was that it means fewer substitutions overall)

Simon Hudon (Jul 07 2020 at 03:18):

That’s right.

(Yes, but sometimes it’s easy to over-constrain your solution and make hard problems harder. And the additional constraints may not address the important bottleneck)

Daniel Fabian (Jul 07 2020 at 03:23):

I see. My intermediate strategy was basically to quantify over the conjunction. Creating an forall [fresh_var], rest at the root level, and specializing each quantifier away at the inner level. That way all the bound variables become free (under the global quantifiers).

Daniel Fabian (Jul 07 2020 at 03:24):

Not quite sure how to do it without that whilst maintaining well-formed terms.

Daniel Fabian (Jul 07 2020 at 03:25):

after that, I can star replacing the sub-expressions as you suggested.

Daniel Fabian (Jul 07 2020 at 03:26):

and that'll unify the right-hand sides that should be unified.

Daniel Fabian (Jul 07 2020 at 03:28):

I wonder if I could even just prove a lemma and use it as a simplifier rule... something like (forall x, P x) /\ Q <-> forall y, P y /\ Q.

Daniel Fabian (Jul 07 2020 at 03:29):

If I use this as a lemma for the simplifier, that should help.

Simon Hudon (Jul 07 2020 at 03:35):

I wouldn’t go that route. The terms will be easier to manipulate and traverse without the forall (you can keep a list of variables on the side) and when you’re ready to generate definitions, lemmas and proofs, you add the forall and lambdas back in

Daniel Fabian (Jul 07 2020 at 03:36):

i see.

Simon Hudon (Jul 07 2020 at 03:37):

I’ll have to let you experiment with the idea now. My bed calls to me (and won’t leave a voicemail)

Daniel Fabian (Jul 07 2020 at 03:37):

same start, though, right? we just arbitrarily introduce fresh variables and quantify away each quantifier, right?

Daniel Fabian (Jul 07 2020 at 03:37):

Of course, have a goodnight.

Daniel Fabian (Jul 07 2020 at 03:37):

I'll go to bed soon too.

Simon Hudon (Jul 07 2020 at 03:39):

That’s right. When the constructors around arguments match you leave them in otherwise you replace them with a fresh variable

Daniel Fabian (Jul 07 2020 at 18:29):

not quite sure how much time I'll have tonight, but the next step I want to work on is introducing the new variables, so that we can show the RHS is equal. When looking at the equations, though I noticed, that the thing we're trying to prove is false. In the following sense:

Daniel Fabian (Jul 07 2020 at 18:31):

given a definition like this

def balance {α} : color  node α  α  node α  node α
| color.black (node.tree color.red (node.tree color.red a x b) y c) z d :=
    node.tree color.red (node.tree color.black a x b) y (node.tree color.black c z d)
| color a x b := node.tree color a x b

we would like to prove forall a x b y c z d, balance color.black (node.tree color.red (node.tree color.red a x b) y c) z d = node.tree color.red (node.tree color.black a x b) y (node.tree color.black c z d), which is true.

but also forall color a x b, balance color a x b = node.tree color a x b, which is false.

Daniel Fabian (Jul 07 2020 at 18:32):

The second case only becomes simple when considered in a context. We must know that the first case didn't hold.

Daniel Fabian (Jul 07 2020 at 18:34):

So it turns out we need a proof like this: forall x, first_case x, ¬(forall x, first_case) -> (forall x, second_case x)

Daniel Fabian (Jul 07 2020 at 18:35):

So the idea of building a proposition is really nice, but we need to detect the order of the RHS, too.

Simon Hudon (Jul 07 2020 at 18:38):

That's right. The other thing that we can do is have predicates that correspond to constructors. We can have is_cons and is_nil for a list so that, when you unify two cases, you have to update an assumption:

foo (cons x xs) z := A
foo nil z := A

would then be encoded as:

(is_cons y \/ is_nil y) -> foo y z = A

Simon Hudon (Jul 07 2020 at 18:39):

And here, we can see that the disjunction is exhaustive and leave it out

Daniel Fabian (Jul 07 2020 at 19:15):

is there a convenient way of working with free variables? So far having things in the proof state was quite nice, because I could see the terms.

Simon Hudon (Jul 07 2020 at 19:19):

You'd have to keep printing your term but otherwise, it's straightforward. Anything specific you're unclear about?

Daniel Fabian (Jul 07 2020 at 19:21):

Not exactly, I just presume I can't add them to the local context, because I could potentially construct bogus info, so I'd have to work by just printing. Having them in the local context is really just a convenience thing.

Daniel Fabian (Jul 07 2020 at 19:22):

i suppose I could create fresh variables and prove them using sorry in the tactic.

Daniel Fabian (Jul 07 2020 at 19:22):

that way the terms would at least be well-formed

Simon Hudon (Jul 07 2020 at 19:25):

I see what you mean. The proof state and the terms are not strictly controlled. You can do a whole lot of operations on them that could create an invalid proof. That's not really a problem because when you add a declaration, Lean makes sure that it type checks and that it has no meta variable. How you get there is your own business. It's good to stay close to something that constitutes a valid proof but if you need to take liberties, Lean won't keep you until you add a declaration

Daniel Fabian (Jul 07 2020 at 19:26):

ah, ok, thanks. I'll work on that.

Daniel Fabian (Jul 07 2020 at 19:49):

random only semi-related question. Do we have a union-find data structure?

Simon Hudon (Jul 07 2020 at 19:50):

There is a specialized version of it for the tfae tactic (the following are equivalent, finds strongly connected components in a graph of implications)

Daniel Fabian (Jul 07 2020 at 19:55):

cool

Daniel Fabian (Jul 07 2020 at 22:03):

hmm, having trouble doing a beta reduction. I've got the meta var, but it can't infer a type.

Daniel Fabian (Jul 07 2020 at 22:04):

infer type failed, function expected at
  ( {α : Type ?} (left : node α) (val : α) (right : node α) (x : α) (b : node α),
     balance color.black (node.tree color.black left val right) x b =
       node.tree color.black (node.tree color.black left val right) x b)
    ?m_1
term
   {α : Type ?} (left : node α) (val : α) (right : node α) (x : α) (b : node α),
    balance color.black (node.tree color.black left val right) x b =
      node.tree color.black (node.tree color.black left val right) x b
has type
  Prop

Daniel Fabian (Jul 07 2020 at 22:05):

I shouldn't have to manually unfold the Pi, should I?

Simon Hudon (Jul 07 2020 at 22:07):

I think you're making a mistake here. You have a term which is ∀ {α : Type ?}, ..., whose type is Prop and you're trying to use function application on it but you'd need a term of the form λ {a}, ..., whose type is ∀ {a}, .... Do you see the difference?

Daniel Fabian (Jul 07 2020 at 22:07):

sure I do, but I don't getit , because the lemma is a constant.

Daniel Fabian (Jul 07 2020 at 22:08):

Is there a way for me to get the proof?

Daniel Fabian (Jul 07 2020 at 22:09):

nevermind, I'm being silly, the constant of course does have that type...

Daniel Fabian (Jul 07 2020 at 22:09):

so will the beta reduction then unfold the const?

Daniel Fabian (Jul 07 2020 at 22:14):

omg, now it starts to dawn on me... I'm manipulating terms, but am trying to change the shape of the type... Why didn't I notice earlier, sorry about the confusion.

Simon Hudon (Jul 07 2020 at 22:18):

No worries, one lesson at a time :)

Daniel Fabian (Jul 07 2020 at 22:19):

it's one thing to prove things and a whole another level to write meta programs ><

Daniel Fabian (Jul 07 2020 at 22:19):

but I am fully intent on making this work, anyway :)

Simon Hudon (Jul 07 2020 at 22:22):

I think that's a worthwhile exercise to go through

Daniel Fabian (Jul 07 2020 at 22:23):

yes. And also the knowledge will be quite useful for Lean 4, as there's more in lean code there and we can continue to make the lean story better.

Daniel Fabian (Jul 07 2020 at 22:24):

For math I already really like it, for program verification this cases thing is still quite annoying.

Daniel Fabian (Jul 07 2020 at 22:24):

Once we make a bit more automation, though... it'll be just awesome :D

Simon Hudon (Jul 07 2020 at 22:28):

That's my feeling too. There are too few people who care about software verification in Lean so it all falls on the few who do care about. Maybe, little by little, we can make it more appealing to computer scientists

Daniel Fabian (Jul 07 2020 at 23:31):

heh:

α : Type ?,
_eqn_6 :
  balance color.black (node.tree color.black ?m_2 ?m_3 ?m_4) ?m_5 ?m_6 =
    node.tree color.black (node.tree color.black ?m_2 ?m_3 ?m_4) ?m_5 ?m_6,
_eqn_5 :
  balance color.black (node.tree color.red (node.tree color.black ?m_8 ?m_9 ?m_10) ?m_11 ?m_12) ?m_13 ?m_14 =
    node.tree color.black (node.tree color.red (node.tree color.black ?m_8 ?m_9 ?m_10) ?m_11 ?m_12) ?m_13 ?m_14,
_eqn_4 :
  balance color.black (node.tree color.red (node.tree color.red ?m_16 ?m_17 ?m_18) ?m_19 ?m_20) ?m_21 ?m_22 =
    node.tree color.red (node.tree color.black ?m_16 ?m_17 ?m_18) ?m_19 (node.tree color.black ?m_20 ?m_21 ?m_22),
_eqn_3 :
  balance color.black (node.tree color.red node.leaf ?m_24 ?m_25) ?m_26 ?m_27 =
    node.tree color.black (node.tree color.red node.leaf ?m_24 ?m_25) ?m_26 ?m_27,
_eqn_2 : balance color.black node.leaf ?m_29 ?m_30 = node.tree color.black node.leaf ?m_29 ?m_30,
_eqn_1 : balance color.red ?m_32 ?m_33 ?m_34 = node.tree color.red ?m_32 ?m_33 ?m_34

Daniel Fabian (Jul 07 2020 at 23:32):

so next, I can go call-site by call-site introducing new variables, but this time round, we need to also replace sub-exprs in other equations.

Simon Hudon (Jul 07 2020 at 23:42):

I'm not sure if we have the same understanding. What I was suggesting was to take the equations, two by two, zip the argument list of their lhs and find a conservative generalization, i.e. if both are cons ? ? then you keep them as cons ? ?. If they are different constructors or that one is a constructor and the other is not, you introduce a new variable

Daniel Fabian (Jul 07 2020 at 23:43):

oh I do understand, but I can't do that under quantifiers.

Daniel Fabian (Jul 07 2020 at 23:43):

so I first just replaced all quantifiers with free variables.

Simon Hudon (Jul 07 2020 at 23:43):

That's right, that's the right thing to do

Daniel Fabian (Jul 07 2020 at 23:44):

here now, e.g. m_6 and m_14 can be unified.

Daniel Fabian (Jul 07 2020 at 23:44):

or m_14 and m_22

Simon Hudon (Jul 07 2020 at 23:47):

From which equations?

Daniel Fabian (Jul 07 2020 at 23:49):

eqn_6 and eqn_5 have m_6 and m_14, respectively at the same place.

Daniel Fabian (Jul 07 2020 at 23:49):

but they do have different prefixes, though.

Daniel Fabian (Jul 07 2020 at 23:50):

are the dependent types strictly left-to-right?

Simon Hudon (Jul 07 2020 at 23:51):

They are strictly left to right. I have not considered dependent types so far.

Daniel Fabian (Jul 07 2020 at 23:52):

eqn_1 and eqn_2 are an interesting bunch, for instance.

Simon Hudon (Jul 07 2020 at 23:53):

How so?

Daniel Fabian (Jul 07 2020 at 23:53):

their rhs would become node.tree u x y z.

Daniel Fabian (Jul 07 2020 at 23:53):

but that is in fact too general, we can't prove it.

Daniel Fabian (Jul 07 2020 at 23:55):

they both are the same case in the code, but the equation only holds, because we have considered the other cases first in the pattern matching.

Daniel Fabian (Jul 07 2020 at 23:55):

so this is an instance of the dependent case, really.

Simon Hudon (Jul 07 2020 at 23:56):

Is it because tree is an inductive family? Can you show me the definition again?

Daniel Fabian (Jul 07 2020 at 23:57):

def balance {α} : color  node α  α  node α  node α
| color.black (node.tree color.red (node.tree color.red a x b) y c) z d :=
    node.tree color.red (node.tree color.black a x b) y (node.tree color.black c z d)
-- | color.black (node.tree color.red a x (node.tree color.red b y c)) z d :=
--     node.tree color.red (node.tree color.black a x b) y (node.tree color.black c z d)
-- | color.black a x (node.tree color.red (node.tree color.red b y c) z d) :=
--     node.tree color.red (node.tree color.black a x b) y (node.tree color.black c z d)
-- | color.black a x (node.tree color.red b y (node.tree color.red c z d)) :=
--      node.tree color.red (node.tree color.black a x b) y (node.tree color.black c z d)
| color a x b := node.tree color a x b

Daniel Fabian (Jul 07 2020 at 23:57):

the problem is, that the second pattern match is color for arbitrary color.

Daniel Fabian (Jul 07 2020 at 23:58):

But we know from the context, that it's either red and anything, or black, but not the previous case.

Daniel Fabian (Jul 07 2020 at 23:58):

that's why the rhs becomes simple.

Daniel Fabian (Jul 07 2020 at 23:58):

It's the fall-through between cases that simplifies it.

Simon Hudon (Jul 08 2020 at 00:01):

Yes I see

Simon Hudon (Jul 08 2020 at 00:03):

I would do the unification from left to right and I would just give up (reject the unification) if generalizing one argument makes later arguments type incorrect

Daniel Fabian (Jul 08 2020 at 00:03):

i see

Simon Hudon (Jul 08 2020 at 00:05):

There might be a more aggressive solution but I think a simpler first solution will make it easier to get a testable prototype

Daniel Fabian (Jul 08 2020 at 00:09):

Ya, I'll work on it. At this point I have at least the general API down, so I more or less know how to construct terms. So soon I can think about the algorithm. The more we have coded up, the easier it is to criticize ;-)

Simon Hudon (Jul 08 2020 at 00:09):

Indeed :)

Daniel Fabian (Jul 08 2020 at 01:41):

Actually I don't think we can unify left to right. When I'm doing the transformation we want to code up by hand, It keeps smearing the wrong cases and thus make the theorem false. I think have to unify inside-out, keeping careful track of what's around in the context.

Daniel Fabian (Jul 08 2020 at 03:20):

(deleted)

Daniel Fabian (Jul 08 2020 at 03:22):

(deleted)

Daniel Fabian (Jul 08 2020 at 03:33):

(deleted)

Daniel Fabian (Jul 08 2020 at 03:34):

(deleted)

Daniel Fabian (Jul 08 2020 at 03:34):

(deleted)

Daniel Fabian (Jul 08 2020 at 04:07):

At the end of the day, I think it may well be easiest to use the lambda in the definition itself and analyse its structure.

Daniel Fabian (Jul 08 2020 at 04:07):

All the cases are already handled correctly. So the main question that remains, is if can have a better decision procedure by ordering the cases in the type better.

Simon Hudon (Jul 08 2020 at 04:10):

It's worth exploring when you have something that works. Feel free to put it up on a git repo. It will be easier to discuss

Daniel Fabian (Jul 08 2020 at 04:12):

yup will do.

Daniel Fabian (Jul 09 2020 at 00:50):

I've made a good deal of progress on this, though not quite working yet.

Daniel Fabian (Jul 09 2020 at 00:51):

I also changed the strategy a bit. instead of trying to infer the spec, I instead as the user for an equality to prove.

Daniel Fabian (Jul 09 2020 at 00:52):

I can then use normal reduction until the reduction gets stuck, at which point I can then introduce the case split as need to construct a proof.

Simon Hudon (Jul 09 2020 at 00:53):

That sounds like a good plan. Do you have examples that show off what you've done so far?

Daniel Fabian (Jul 09 2020 at 00:54):

meta def inline_body : expr  tactic expr
| (app l r) := inline_body l >>= λ l, pure $ app l r
| (const n ls) := do
    decl  tactic.get_decl n,
    return decl.value
| e := tactic.trace e >> tactic.failed

meta def specialize_mvar : expr  expr    option expr
| mv' mv@(mvar _ _ _) _ := some $ mv.app mv'
| _ _ _ := none

meta mutual def break_case, eq_proof
with break_case : expr  expr  expr  tactic expr
| eq e@(app x mv) y := do
    ty_next  tactic.infer_type x,
    tactic.trace ty_next,
    tactic.trace x,
    tactic.trace y,
    x, mv, y 
        match ty_next.is_pi with
        | tt := do
            mv'  tactic.mk_mvar,
            return (x.replace (specialize_mvar mv'), mv', y.replace (specialize_mvar mv'))
        | ff := pure (x, mv, y)
        end,
    return e
| eq e _ := do
    tactic.trace e,
    tactic.fail "failed to break cases"
with eq_proof : expr  expr  expr  tactic expr
| eq x y := do
    is_eq  succeeds $ tactic.is_def_eq x y,
    match is_eq with
    | tt := pure $ (eq.app x).app y
    | ff := do
        x  tactic.whnf x,
        break_case eq x y
    end
    >>= λ x, pure $ (eq.app x).app y

meta def pattern_match_proof : expr  tactic expr
| e@(pi name bi ty body) := do
    body  pattern_match_proof body,
    return $ lam name bi ty body
| e@(app (app eq l) r) := do
    mvars  tactic.mk_mvar_list e.get_free_var_range,
    l  inline_body (l.instantiate_vars mvars),
    eq_proof eq l (r.instantiate_vars mvars)
| e := do
    tactic.trace e,
    tactic.fail "cannot synthesize equality proof."

meta def my_tac2 : tactic unit :=
do  tactic.target
    >>= pattern_match_proof
    >>= tactic.instantiate_mvars
    >>= tactic.exact

Daniel Fabian (Jul 09 2020 at 00:55):

here's what I have so far. It takes care of quantifiers, can use whatever was already available in the context, so it doesn't have to match the spec exactly.

Daniel Fabian (Jul 09 2020 at 00:56):

i'm at a point, where it doesn't go further somewhere deeply nested.

Daniel Fabian (Jul 09 2020 at 00:56):

I think, I need to pass the types along with meta vars.

Daniel Fabian (Jul 09 2020 at 00:57):

because bound quantifiers turn to de Bruijn vars and I can't reduce using them.

Simon Hudon (Jul 09 2020 at 00:57):

Right but is there an example where you can use that, no matter how simple, that shows what it does?

Daniel Fabian (Jul 09 2020 at 00:58):

e.g this. Given the goal:

  (a : node α) (x : α) (b : node α) (y : α) (c : node α) (z : α) (d : node α),
    balance color.black (node.tree color.red (node.tree color.red a x b) y c) z d =
      node.tree color.red (node.tree color.black a x b) y (node.tree color.black c z d)

Daniel Fabian (Jul 09 2020 at 00:59):

it infers the source code of balance, gets rid of quantifiers.

Daniel Fabian (Jul 09 2020 at 01:00):

but it doesn't finish and so I dont get a proof, i.e. it doesn't type check.

Daniel Fabian (Jul 09 2020 at 01:00):

lemme construct an example that should go through.

Daniel Fabian (Jul 09 2020 at 01:06):

def foo :     
| 0 1 := 1
| 1 0 := 1
| n 4 := 2
| n 3 := 2
| n m := 1

example :  x, foo x 1 = 1 :=
begin
    my_tac2
end

Daniel Fabian (Jul 09 2020 at 01:06):

this example, it does some work and gets stuck where I need to call rec

Daniel Fabian (Jul 09 2020 at 01:07):

Π (n : decidable (?m_1 = 0)), (λ (_x : decidable (?m_1 = 0)), ) n
decidable.rec
  (λ (hnc : ¬?m_1 = 0),
     ite (?m_1 = 1)
       (ite (1 = 0) (id_rhs  1) (ite (1 = 4) (id_rhs  2) (ite (1 = 3) (id_rhs  2) (id_rhs  1))))
       (ite (1 = 4) (id_rhs  2) (ite (1 = 3) (id_rhs  2) (id_rhs  1))))
  (λ (hc : ?m_1 = 0),
     ite (1 = 1) (id_rhs  1) (ite (1 = 4) (id_rhs  2) (ite (1 = 3) (id_rhs  2) (id_rhs  1))))
1

Daniel Fabian (Jul 09 2020 at 01:07):

so it does the unfolding, the quantifiers and introduces meta vars.

Daniel Fabian (Jul 09 2020 at 01:08):

oh and btw all that from the prop.

Daniel Fabian (Jul 09 2020 at 01:11):

what I'm not quite sure is a great idea is building the lambda terms directly by hand.

Daniel Fabian (Jul 09 2020 at 01:11):

in a sense, it might be easier to use the local context.

Simon Hudon (Jul 09 2020 at 01:18):

What do you mean by "building the lambda term by hand" vs "using the local context"?

Daniel Fabian (Jul 09 2020 at 01:19):

the way how this works is done by a recursive descent through the lambda. The effect being, that I have to do all term manipulation myself. Whereas many other tactics define local constants, goals, etc.

Simon Hudon (Jul 09 2020 at 01:21):

Right so that's the difference between a stateful versus functional approach. Stateful is especially useful when you want to use pieces of the API like induction. Even that part of the API should become more functional in Lean 4

Daniel Fabian (Jul 09 2020 at 01:23):

In a sense, we've gone full circle at this point, lol. The whole discussion started with wanting to have a tactic that can solve these pattern match cases. Then we found that using some combinators you can script it and solve the problem, but it's somewhat unsatisfying.

Daniel Fabian (Jul 09 2020 at 01:23):

Then we had the idea of an iductively defined prop which pretty much needs the low-level api

Daniel Fabian (Jul 09 2020 at 01:24):

and now this approach is back to more like just a tactic script.


Last updated: Dec 20 2023 at 11:08 UTC