Zulip Chat Archive

Stream: new members

Topic: Induction vs recursion


Patrick Stevens (May 09 2020 at 10:50):

I'm still struggling to get my head around how to perform proofs by induction, sadly. I've defined the set {0..n} and am trying to prove that it's an interval in N. In Agda this is extremely easy because I can perform recursive calls, but in Lean I haven't yet been able to find a proof of the following:

import data.list
import data.nat.prime
import tactic

def range : nat  finset nat
| 0 := finset.singleton 0
| (nat.succ n) := insert (nat.succ n) (range n)

lemma range_closed (n : nat) (a : nat) (aIn : a  range n) (v : nat) (vLess : v  a) : v  range n :=
begin
  induction n,
  unfold range,
  unfold range at aIn,
  simp at aIn,
  simp,
  linarith,
  unfold range,
  unfold range at aIn,
  simp,
  simp at aIn,
  cases aIn,
  {
    -- but the inductive hypothesis doesn't apply!
    sorry
  },
  right,
  cc,
end

In Agda, this is very easy; you just let it normalise your definitions, and whenever it gets stuck, you pattern-match on whatever it's stuck on, and eventually it bottoms out at a single lemma "n is in {0..n}" and a recursive call to range_closed; I can define the entire thing in 92 lines with no standard library and no thought required (https://gist.github.com/Smaug123/b5b2407176328756189d5f5e8918f03a, actual content is lines 69 through 92), though admittedly my definition of "finite set" leaves much to be desired.

Presumably it is similarly easy in Lean, but it doesn't match my intuition at all. I know in theory there's a way to replace my desired recursive call with an appeal to an appropriately-constructed inductive hypothesis, and I can probably find the general method if I think hard for an hour, but what's stopping Lean from doing these structurally-decreasing recursive calls itself?

Reid Barton (May 09 2020 at 11:18):

can you give some hint about Agda with?

Reid Barton (May 09 2020 at 11:19):

Is this match ... with ...?

Reid Barton (May 09 2020 at 11:19):

Everything in your Agda code was easy to read until the last definition, where things got really weird. Maybe this is related to your problem :upside_down:

Reid Barton (May 09 2020 at 11:26):

What happened to range_contains?

Reid Barton (May 09 2020 at 11:29):

I assume record {} is the proof of True?

Reid Barton (May 09 2020 at 11:30):

Things are obviously going to be somewhat harder because the definition of contains for an actual finite set won't compute the way it does in your code, I think

Reid Barton (May 09 2020 at 11:37):

Or much harder

Reid Barton (May 09 2020 at 11:37):

I thought for a while your Agda proof was not correct.

Reid Barton (May 09 2020 at 11:38):

The more I read it, the less I understand...

Reid Barton (May 09 2020 at 11:40):

On line 86, we know n is 0, a is not 0, v is not 0. How can we use cont as the proof?

Reid Barton (May 09 2020 at 11:48):

Anyways, the tactic proof can be repaired by starting with

  induction n with n IH generalizing a aIn v vLess,

(with n IH just gives better names to the new variables)

Reid Barton (May 09 2020 at 11:49):

generalizing ... means basically revert all the variables, apply induction, then re-intros them in all the cases. That way you get an induction hypothesis that can be applied to any a, ....

Reid Barton (May 09 2020 at 11:54):

Also, I guess as a high-level comment, surely it's better strategy to just prove that a ∈ range n is equivalent to a <= n first, rather than embark on an inductive proof of this statement that involves three variables directly

Mario Carneiro (May 09 2020 at 11:57):

Reid Barton said:

On line 86, we know n is 0, a is not 0, v is not 0. How can we use cont as the proof?

Aren't n, a and v all 0 at that point?

Reid Barton (May 09 2020 at 12:00):

it might be my inability to read this syntax, but I thought that line 84 is the counterpart to line 82, depending on the outcome of a=0

Reid Barton (May 09 2020 at 12:04):

I think the structure is:

  • split on a = 0 (not by pattern matching but with the ℕDecideEquality thing--but somehow Agda also decides that a=0, maybe this is what I missed later)
    • when a=0, split on the proof of v<=a
    • when a!=0, split on ℕDecideEquality v 0

Reid Barton (May 09 2020 at 12:05):

Well, this is the n=0 case

Reid Barton (May 09 2020 at 12:05):

Like I think the ... on line 85 corresponds to the ... | inr a!=0 on line 84, otherwise I can't make sense of it

Mario Carneiro (May 09 2020 at 12:11):

I get this for the translation of 79-86 to lean syntax

def rangeClosed :  (n a : ), contains DecideEquality (range n) a   (v : ), (v  a)  contains DecideEquality (range n) v
| zero a cont v (h : v  a) :=
  match a, cont, v, h, DecideEquality a 0 with
  | .(0), cont, v, inl (h : v<0), inl rfl := exFalso (notLessZero v h)
  | .(0), cont, v, inr rfl, inl rfl := ⟨⟩
  | a, cont, v, inr rfl, inr (h : a  0) :=
    match a, cont, v, h, DecideEquality v 0 with
    | a, cont, v, h, inl (h₂ : v = 0) := ⟨⟩
    | a, cont, v, h, inr (h₂ : v  0) := cont
    end
  end
...

Reid Barton (May 09 2020 at 12:14):

Does it work?

Reid Barton (May 09 2020 at 12:15):

Or did you manage to type all this without feedback from lean?

Reid Barton (May 09 2020 at 12:16):

The only way I can imagine cont : contains ℕDecideEquality (range n) a has type contains ℕDecideEquality (range n) v here is that Agda actually decided this case is impossible

Reid Barton (May 09 2020 at 12:17):

or I guess that it reduced both of these types to False... that is more plausible

Mario Carneiro (May 09 2020 at 12:17):

No feedback

Reid Barton (May 09 2020 at 12:17):

I still don't understand how it could do that though

Mario Carneiro (May 09 2020 at 12:17):

I'm working on a compiling version

Mario Carneiro (May 09 2020 at 12:18):

but the lean definitions are different in a few places, unless I replicate the whole definition from nothing as was done in the original

Reid Barton (May 09 2020 at 12:20):

I guess I don't know how matching on ℕDecideEquality a 0 actually works. It seems like for the Agda code to work, it needs to have already unfolded the definition of the type of cont to see that it contains ℕDecideEquality a 0, so that it can be replaced by a constructor in the branches

Mario Carneiro (May 09 2020 at 12:21):

I'm pretty sure that it's just like obtain rfl | h := decidable.em (a = 0)

Reid Barton (May 09 2020 at 12:23):

But in that case, in your Lean translation, | a, cont, v, h, inr (h₂ : v ≠ 0) := cont is clearly a type error right?

Mario Carneiro (May 09 2020 at 12:24):

Oh, actually it might not be, if that's simultaneously matching on the ℕDecideEquality a 0 inside the definition of contains

Mario Carneiro (May 09 2020 at 12:24):

in lean you would have to unfold contains first

Reid Barton (May 09 2020 at 12:24):

I think that must be what is happening but I don't know how Agda knows to do that

Reid Barton (May 09 2020 at 12:25):

I mean it cannot unfold everything in scope as far as possible before matching on anything, right? Or can it?

Mario Carneiro (May 09 2020 at 12:26):

"The user said it typechecks, so by golly I'll make it typecheck"

Reid Barton (May 09 2020 at 12:54):

Here is how I might prove the original lemma:

lemma nat.le_succ_iff (a b : nat) : a  b.succ  a  b  a = b.succ :=
nat.of_le_succ, λ h, h.elim nat.le_succ_of_le le_of_eq

lemma mem_range_iff_le (n a : nat) : a  range n  a  n :=
begin
  induction n with n IH,
  { simp [range] },
  { simp [range, IH, nat.le_succ_iff, or.comm] },
end

lemma range_closed (n : nat) (a : nat) (aIn : a  range n) (v : nat) (vLess : v  a) : v  range n :=
begin
  rw mem_range_iff_le at  aIn,
  exact le_trans vLess aIn
end

Reid Barton (May 09 2020 at 12:54):

obviously I relied a bit on the standard library, but not heavily

Patrick Stevens (May 09 2020 at 13:45):

I don't mind the use or not of the standard library - I'm not trying to whinge or anything, and the point wasn't "oh look I've been coding for two years in Agda after spending months getting used to it, why am I finding Lean less intuitive waaaah"

Patrick Stevens (May 09 2020 at 13:49):

I'm aware that Agda, like all theorem proving environments, is completely unreadable :P Agda by default automatically unfolds everything as far as possible before presenting it to you, although you don't have to fill "unfolded" holes with "unfolded" terms - you can fill a hole that Agda is showing to you as fully unfolded, using a nice compact term

Reid Barton (May 09 2020 at 13:50):

Well, I don't totally agree with the first statement. I think the proof I wrote above is a lot easier to understand, even putting concrete syntax aside--simply because it follows the correct conceptual strategy for a human

Reid Barton (May 09 2020 at 13:51):

I believe you when you say your proof was easy to write without thinking (I know the Agda emacs mode has some fancy auto refine/hole filling stuff), but it resulted in something quite difficult to understand

Reid Barton (May 09 2020 at 13:51):

It goes to show that machines producing proofs that humans cannot understand is not some science fiction thing.

Patrick Stevens (May 09 2020 at 14:00):

(Random aside: can it really be true that le_total has the most useful type signature? Surely, surely a <= b or b < a is a more useful phrasing?)

Patrick Stevens (May 09 2020 at 14:04):

Honestly I struggle to see that there's anything to prove here at all - I wouldn't say there's a "correct strategy for a human" beyond saying "induction, the rest is trivial", so I don't feel particularly moved by "it follows the correct conceptual strategy for a human". It may very well be that Lean makes for better-structured proofs on actual real problems, but then to me the structure of a formally-checked proof is dictated more by the lemmas which are chained together than by the internal structure of any particular theorem's proof. Possibly that's just my Agda heritage talking.

Patrick Stevens (May 09 2020 at 14:05):

But my Agda proof has a structure that is fairly clear when you get past the syntax: "induct on n; clear the base case. Check if v = n; if so, we're done straight away. If not, induct down, although booo fiddly edge case if a = n already."

Reid Barton (May 09 2020 at 14:07):

I disagree rather thoroughly with all this, but I'm not sure there is anything constructive to say. It looks like "alien mathematics" again.

Mario Carneiro (May 09 2020 at 14:07):

Patrick Stevens said:

(Random aside: can it really be true that le_total has the most useful type signature? Surely, surely a <= b or b < a is a more useful phrasing?)

All of the variations on these statements exist: le_or_lt, lt_or_le, lt_trichotomy, etc

Mario Carneiro (May 09 2020 at 14:09):

I think I see where you are speaking from, it reminds me of how most list lemmas are proven. But once you have a decent nontrivial equality relation it becomes easier to prove things by rw and simp instead rather than using induction for everything

Mario Carneiro (May 09 2020 at 14:10):

data.list.basic is quite large, exactly because it should be easy to pick up and use without having to resort to induction all the time

Patrick Stevens (May 09 2020 at 14:11):

Mario Carneiro said:

All of the variations on these statements exist: le_or_lt, lt_or_le, lt_trichotomy, etc

Is there a way I can find those other than by a) intuiting their names, or b) writing their type signatures and proving them with library_search?

Mario Carneiro (May 09 2020 at 14:11):

those both sound like pretty good methods to me

Kenny Lau (May 09 2020 at 14:12):

#find

Mario Carneiro (May 09 2020 at 14:12):

another method is to write #print le_total, ctrl-click on le_total to go to its definition, and then browse around from there; I bet le_or_lt is within 3 statements of it

Patrick Stevens (May 09 2020 at 14:14):

Mario Carneiro said:

another method is to write #print le_total, ctrl-click on le_total to go to its definition, and then browse around from there; I bet le_or_lt is within 3 statements of it

That one I did try - but actually they're in different files from le_total

Mario Carneiro (May 09 2020 at 14:14):

checking... oh, I missed, le_or_gt is about 20 theorems later

Mario Carneiro (May 09 2020 at 14:14):

le_or_lt is in the mathlib addendum to that file

Reid Barton (May 09 2020 at 14:19):

I guess I'll just say this: as a human, what led you to come up with the statement rangeClosed in the first place, or why did you believe that it would be true before you tried to prove it in Agda? Wasn't it the fact that range n contains the numbers 0 through n, that is, those a that are <= n, and if a <= n and v <= a, then v <= n? Or did you imagine this inductive proof with five or six case distinctions in it?

Reid Barton (May 09 2020 at 14:20):

If the question is just "why can't the system prove this for me automatically" then I agree that any decent system ought to be able to do it, and I think there are some that can.

Mario Carneiro (May 09 2020 at 14:20):

By the way, "intuiting the name" is not unreasonable because of lean's fairly strict naming conventions

Mario Carneiro (May 09 2020 at 14:21):

for a good number of basic theorems you can just stick name segments together and guess the name exactly with high probability

Reid Barton (May 09 2020 at 14:22):

An important variation of intuiting the name is guessing most of the name and using autocompletion

Jalex Stark (May 09 2020 at 14:23):

Mario Carneiro said:

By the way, "intuiting the name" is not unreasonable because of lean's fairly strict naming conventions

A month ago i thought guessing the name was wizard magic, and then I read that file and started guessing and it was fine

Patrick Stevens (May 09 2020 at 14:26):

In Lean, I came up with rangeClosed as follows. 1) Attempt to formulate in Lean a certain lemma in the proof of Bertrand's postulate: namely, "the product of the primes less than or equal to n, is less than 4^n". 2) Realise I need to define the collection of primes less than or equal to n, and decide to go by filtering the collection of naturals less than or equal to n. 3) Begin to induct on n to prove that lemma, discover that I need to show that something is in the range at all.

I think by this point I had been flapping my hands at the keyboard for a while and that was a term which would fill a hole I had in scope, so I started trying to prove it. But I already knew it would be true because "look, it's obvious" (no further reason at all, not even the justification you gave above); if you'd asked me to prove it, I'd have said "induction on n" (because that's the only thing it makes sense to induct on, because that's where the definition of range varies), and without pencil and paper I'd have been a bit stumped if you'd asked me for further justification. With pencil and paper I could probably have written out something like what you said.

Mario Carneiro (May 09 2020 at 14:28):

Realise I need to define the collection of primes less than or equal to n, and decide to go by filtering the collection of naturals less than or equal to n. 3) Begin to induct on n to prove that lemma, discover that I need to show that something is in the range at all.

It seems to me from this description that you already believed that range n is the set of numbers less or equal to n

Mario Carneiro (May 09 2020 at 14:29):

so to my mind that indicates that this should be the first thing you prove about the definition, and hopefully the filter through which all other theorems follow

Patrick Stevens (May 09 2020 at 14:30):

Yes, that is precisely why I constructed range n - to be the object which had that property

Reid Barton (May 09 2020 at 14:31):

So then, surely, the first thing you should prove about it is that it does have that property

Patrick Stevens (May 09 2020 at 14:31):

I guess if I'd been less lazy and had been working with my software engineer's hat on, I'd have done that straight away - no code without its unit test

Reid Barton (May 09 2020 at 14:32):

Oh, Mario already said this.

Mario Carneiro (May 09 2020 at 14:35):

In mathlib this is set up as a simp lemma, so that by the time you are actually thinking about the problem the ranges are gone

import data.list.range

lemma range_closed (n : nat) (a : nat)
  (aIn : a  list.range n) (v : nat) (vLess : v  a) : v  list.range n :=
begin
  simp at *,
  exact lt_of_le_of_lt vLess aIn
end

Patrick Stevens (May 09 2020 at 14:36):

Oh, I didn't even realise there was a list.range, I assumed it would be a finset sort of thing

Mario Carneiro (May 09 2020 at 14:36):

the finset is of course defined using the list

Reid Barton (May 09 2020 at 15:07):

Anyways, @Patrick Stevens, I'm not sure if in all the other noise, you missed the answer using generalizing (or equivalently revert).

Patrick Stevens (May 09 2020 at 15:09):

Ah, I missed that - thanks


Last updated: Dec 20 2023 at 11:08 UTC