Zulip Chat Archive

Stream: mathlib4

Topic: The plan to remove induction'

Jeremy Tan (Mar 22 2025 at 06:10):

Jeremy Tan (Mar 22 2025 at 08:29):

#23211 is the first big automated commit in my plan to remove induction'. It targets mainly Nat induction, which by far is the most common target of induction' in current mathlib

Peter Nelson (Mar 22 2025 at 23:23):

Is there a clear consensus that induction' should be removed?

Jeremy Tan (Mar 22 2025 at 23:47):

Yes; see the other thread about cases'

Jeremy Tan (Mar 22 2025 at 23:50):

And #lean4 > To replace `induction' h : f x` @ 💬

Jeremy Tan (Mar 22 2025 at 23:52):

(now that the missing functionality has been added to plain induction and is available in current mathlib I am free to go remove induction')

Jeremy Tan (Mar 23 2025 at 01:35):

I acknowledge that not all the stated instances have been removed, but as 400+ of 1000 instances are removed by #23211 I'd really want to get this merged quickly so I can go again without suffering rot

Peter Nelson (Mar 23 2025 at 11:32):

The way I would read that discussion is that a particular obstacle to its removal was identified, and that now that obstacle is gone. But I wouldn’t call that a clear consensus.

Peter Nelson (Mar 23 2025 at 11:35):

induction' isn’t the same as cases', because there is no rcases or obtain-like syntax to replace it with.

Eric Wieser (Mar 23 2025 at 12:21):

Doesn't the fact that the PR was fully automated with a script mean that it's trivial to deal with rot? You just git reset --hard origin/master and rerun the script.

Peter Nelson (Mar 23 2025 at 13:10):

The thing that slightly concerns me here is that removing (or discouraging) induction' is pushing users towards equation-compiler-like syntax with | and =>, which is a departure from the dots and indents of the rest of tactic mode. While I now use induction myself, I think that induction' is often better for new users finding their feet, and unlike with the removal of cases', these users don't have alternatives.

I'm not saying the change shouldn't be made, but it feels like this is something a single user is pushing for, rather than where there is a clear statement from the maintainers that this change is intended.

Jeremy Tan (Mar 26 2025 at 01:54):

#23324 deprimes Data.List

Jeremy Tan (Mar 26 2025 at 01:55):

Peter Nelson said:

I'm not saying the change shouldn't be made, but it feels like this is something a single user is pushing for, rather than where there is a clear statement from the maintainers that this change is intended.

There is a clear approval for this change: #lean4 > To replace `induction' h : f x` @ 💬

Jeremy Tan (Mar 26 2025 at 02:03):

The situation with induction' and induction is like Python 2 and Python 3. All the major players in the software industry had deprecated Python 2 and moved on but there were a number of users irrationally clinging on to Python 2

Jeremy Tan (Mar 26 2025 at 02:16):

Peter Nelson said:

The thing that slightly concerns me here is that removing (or discouraging) induction' is pushing users towards equation-compiler-like syntax with | and =>, which is a departure from the dots and indents of the rest of tactic mode.

Diversity of syntax is a Good Thing for new users – it makes code visually more appealing and prevents misconceptions about tactic mode. | and => will help them learn the equation compiler

Johan Commelin (Mar 26 2025 at 02:24):

I don't see the "clear approval".
I think these refactors should be explored, but no decision has been made.

Jeremy Tan (Mar 26 2025 at 02:27):

@Jireh Loreaux left a comment on my last merged depriming PR (#22533): https://github.com/leanprover-community/mathlib4/pull/22533#discussion_r2007860604

I'm glad this is getting fixed. I think this is a particularly egregious example of the lack of explicit structure in induction'

Can we agree that uses of induction' with five or more introduced variables are "egregious" and should be deprimed?

Jeremy Tan (Mar 26 2025 at 02:31):

(And if zero variables are introduced, it can always be replaced straight up with induction)

Jeremy Tan (Mar 26 2025 at 09:20):

induction-uses
replace.py

Jeremy Tan (Mar 26 2025 at 09:23):

I beefed up my script. Now it leaves less than 450 instances of induction'

Jeremy Tan (Mar 26 2025 at 10:58):

(to be precise, 430)

Peter Nelson (Mar 26 2025 at 11:11):

@Jeremy Tan , I think I agree that removing induction' makes sense. But you are being impatient, and glibly dismissive of any resistance. Often people that act this way are in fact trying to push for something bad.

Jeremy Tan (Mar 26 2025 at 11:39):

Anyone trying to push through a large change has to be impatient to some extent – has to address the concerns of the resistance as far as they can, but ignore the remainder. Otherwise they will be quickly defeated

Jeremy Tan (Mar 26 2025 at 11:40):

I have addressed your concerns as far as I can here, and #23211 is now error-free

Peter Nelson (Mar 26 2025 at 13:13):

I think that Nat is one of the more defensible settings for the use of induction'.

In contrast to structural examples with a ton of introduced variables, induction on the natural numbers is intuitive to any mathematician without the concept of ‘deconstructing’ some n, and the induction' syntax is more sympathetic to that perspective.

Peter Nelson (Mar 26 2025 at 13:26):

Maybe tackling the examples with large numbers of variables, rather than Nat, would be a natural place to start?

Jireh Loreaux (Mar 26 2025 at 14:22):

Jeremy, my comment was specifcally about that use of induction' with many branches.

In general, I personally (not a mandate from the maintainers) prefer the extra structure with the induction tactic, even if it is slightly more verbose, especially because we have a code action for it. However, from the other thread, it was clear that removing induction' was not as desirable in all situations as removing cases' (thanks for doing that!). The reason I merged that one PR of yours about induction' is that all of those replacements did not seem to be relevant to the condition mentioned in the other thread. (I have not had time to review your other PRs on this topic, so I don't know about those.)

Jeremy Tan (Mar 26 2025 at 14:27):

I'm going to make a PR (or rather overwrite an existing PR) removing those induction' with five or more named variables – see https://github.com/leanprover-community/mathlib4/compare/master...option2

Jeremy Tan (Mar 26 2025 at 14:27):

I may later extend that to four variables

Jireh Loreaux (Mar 26 2025 at 14:28):

okay, but why not wait for some consensus on how to proceed? Instead of jumping the gun, first determine what the community wants.

Jeremy Tan (Mar 26 2025 at 14:29):

Only @Peter Nelson has given his thoughts and I'm not sure the community has any opinions on this

Jireh Loreaux (Mar 26 2025 at 14:30):

Johan Commelin said:

I don't see the "clear approval".
I think these refactors should be explored, but no decision has been made.

This is the message that should make you stop and think: "hmmm, maybe I should wait on this."

Jeremy Tan (Mar 26 2025 at 14:36):

But I might have to wait forever on that

Edward van de Meent (Mar 26 2025 at 14:37):

yes. and that's ok

Jireh Loreaux (Mar 26 2025 at 14:39):

Jeremy, I'm taking steps in the other thread now to see if we can come to a decision. But indeed, sometimes you (or even me!) have to wait indefinitely for certain things.

(For instance, I would love to see a global PartialOrder ℂ instance, but that hasn't won enough approval in the past to warrant the change.)

Jeremy Tan (Mar 26 2025 at 16:44):

Peter Nelson said:

Maybe tackling the examples with large numbers of variables, rather than Nat, would be a natural place to start?

#23324

Eric Wieser (Mar 26 2025 at 17:20):

Searching that PR for | h_ flags a number of induction principles whose arguments should be renamed (in a separate PR)

Aaron Liu (Mar 26 2025 at 17:48):

Eric Wieser said:

Searching that PR for | h_ flags a number of induction principles whose arguments should be renamed (in a separate PR)

I have a plan to go through all induction principles in mathlib by hand and rename them

Jeremy Tan (Mar 27 2025 at 03:23):

Eric Wieser said:

Searching that PR for | h_ flags a number of induction principles whose arguments should be renamed (in a separate PR)

#23358

Kim Morrison (Mar 27 2025 at 04:46):

@Jeremy Tan, if you're waiting on decisions here, let me invite you to join in our quest to reduce the use of erw in Mathlib.

The goal isn't to just replace it with other tactics or rely on unification to blast through the defeq problems, but instead to improve simp (better normal forms, better confluence) lemmas to avoid needing erw in the first place. There are >1000 erw still in the library. Have a look at recent PRs with erw in the title if you'd like inspiration.

(Of course, feel free to do anything else --- it just occurred to me that this is a project you might enjoy!)

Jeremy Tan (Apr 05 2025 at 03:51):

Could we support syntax like

cases x using someInductionOn with a b c
induction x using someInductionOn with a b c

just when someInductionOn has only one branch with new variables a b c? @Kyle Miller

Jeremy Tan (Apr 05 2025 at 03:52):

@Peter Nelson Would the above proposal be a suitable replacement for single-branch cases'/induction'?

Kyle Miller (Apr 05 2025 at 04:19):

We can do induction x using someInductionOn with | _ a b c. The problem with omitting | _ is that right after with the syntax is looking for a tactic.

Jeremy Tan (Apr 05 2025 at 04:20):

Peter said here that that "makes the line longer and introduces a lot of functional programming boilerplate"

Jeremy Tan (Apr 05 2025 at 04:21):

In fact the blanked-branch syntax is what Eric put into one of my prior depriming induction PRs

Jeremy Tan (Apr 05 2025 at 04:24):

How about cases1 and induction1?

Jeremy Tan (Apr 05 2025 at 04:53):

Or even

induct x using someInductionOn with a b c

(since induct would be a new tactic, with is not constrained to expect a tactic afterwards)

Edward van de Meent (Apr 05 2025 at 06:13):

I feel like creating a new tactic just to remove another defies the purpose?

Jeremy Tan (Apr 05 2025 at 07:02):

Edward van de Meent said:

I feel like creating a new tactic just to remove another defies the purpose?

Or we could make some new field (e.g. with1) for cases and induction to indicate "there is exactly one branch, here are its variables"

Jeremy Tan (Apr 05 2025 at 07:05):

(i.e. with1 or whatever we decide on cannot be used with with and will error if the inductor has more than one branch)

induction x using someInductionOn with1 a b c
=
induction' x using someInductionOn with a b c

Jeremy Tan (Apr 05 2025 at 07:16):

It must be understood that this is just a nice-looking shorthand for the blanked-branch syntax

Peter Nelson (Apr 05 2025 at 10:25):

Jeremy Tan said:

Peter said here that that "makes the line longer and introduces a lot of functional programming boilerplate"

The specific comment I made was that
cases p with | _ m s => ?_ looks worse than
induction’ p with m s.

I don’t have a strong view on what syntax is used, but in that specific instance, cases looks very clunky

Ruben Van de Velde (Apr 05 2025 at 11:06):

Yeah, that's pretty much the worst case and I wouldn't mind shortening it somehow. I'm not sure I'd add a whole separate tactic for it if it didn't exist yet, though

Eric Wieser (Apr 05 2025 at 11:29):

Ideally obtain or rcases would work here, but it doesn't support the custom eliminators

Peter Nelson (Apr 05 2025 at 12:10):

Is expanding obtain/rcases on the roadmap?

Peter Nelson (Apr 05 2025 at 12:27):

And if it is, what's the hurry to remove induction'?

Ruben Van de Velde (Apr 05 2025 at 12:27):

There is no hurry

Eric Wieser (Apr 05 2025 at 12:29):

I think there's a reasonable argument to start removing induction' where the replacement is obviously more readable, in that it's general cleanup and it identifies the places which are not more readable.

Eric Wieser (Apr 05 2025 at 12:30):

But I think there's an element of Goodhart's law going on here, in that I think this is included in the technical debt counters

Jeremy Tan (Apr 05 2025 at 17:53):

@Eric Wieser I do have #23686 ready for review, fixing more induction branch names (and motives)

Kyle Miller (Apr 05 2025 at 17:58):

What does renaming motives to motive do? I haven't seen any discussion about this.

There aren't any systems that require the motive to be named motive (you can tell it's a motive because the return type has it as the head function). There's an overly-specific pretty printer that looks for an argument named motive, but I've been thinking of fixing that.

I suppose it's nice knowing that an argument is the motive by looking at it, but on the other hand whenever you see some higher-order named argument like (argName := fun a b c => ...), very often it's something motive-like.

Jeremy Tan (Apr 05 2025 at 17:59):

Eric insisted on fixing up the motives in a previous PR on the same topic

Jireh Loreaux (Apr 05 2025 at 17:59):

There was a poll somewhere about this, and the result was motive is the preferred spelling.

Jireh Loreaux (Apr 05 2025 at 17:59):

(as opposed to p or P or whatever)

Jireh Loreaux (Apr 05 2025 at 18:00):

https://leanprover.zulipchat.com/#narrow/channel/287929-mathlib4/topic/Induction.20argument.20order/near/471521389

Edward van de Meent (Apr 05 2025 at 18:02):

from the looks of it, you even voted on the polls, Kyle :face_with_open_eyes_and_hand_over_mouth:

Aaron Liu (Apr 05 2025 at 18:04):

I have a pr #23489 that will rename all the motives I can find once I finish it

Kyle Miller (Apr 05 2025 at 18:07):

Thanks! I forgot about the poll.

(This is the core delaborator code that is overly restrictive with what it considers to be a motive argument.)

Edward van de Meent (Apr 05 2025 at 18:14):

the motive argument is just the argument which provides the head symbol for the returntype, right? (this seems to be how the machinery of induction does it?)

Edward van de Meent (Apr 05 2025 at 18:15):

i.e. if you use docs#Lean.Meta.getElimExprInfo you should get the right argument, i think?

Edward van de Meent (Apr 05 2025 at 18:16):

although i guess this does not capture induction with multiple motives

Kyle Miller (Apr 05 2025 at 18:39):

Yes @Edward van de Meent:
Kyle Miller said:

you can tell it's a motive because the return type has it as the head function

Mario Carneiro (Apr 05 2025 at 19:02):

Kyle Miller said:

you can tell it's a motive because the return type has it as the head function

That's not necessarily the case; in mutual recursors there are additional motives which are not the head of the return type

Edward van de Meent (Apr 05 2025 at 19:03):

Indeed
Edward van de Meent said:

although i guess this does not capture induction with multiple motives

Kyle Miller (Apr 05 2025 at 19:05):

I didn't meant that to be an if-and-only-if.

At least the motives all appear in the return types of the minor premises, right? I'm not sure though, maybe you need to look in the arguments of minor premises as well... In any case, multi-motive is a place where it doesn't work to name every motive motive.

Kyle Miller (Apr 05 2025 at 21:55):

lean4#7830 is a prototype testing out the change to the cases/induction syntaxes that I was suggesting.

With this, you can write

cases p with | _ m s

as as shorthand for

cases p with | _ m s => ?_

If you name cases, then you can also write

induction n with | zero | succ n ih
· tac1
· tac2

rather than

induction n with | zero => ?_ | succ n ih => ?_
· tac1
· tac2

It's not possible to write induction n with | _ | _ n ih. Somewhere @Mario Carneiro suggested that cases/induction should support multiple wildcards. Until today, I didn't realize that _ causes the alternative to apply to every remaining pattern. I find this to be unexpected, and I wonder if this should be changed; should _ mean "next remaining case" and have a new * to mean "every remaining case"?

Jeremy Tan (Apr 06 2025 at 01:48):

Kyle Miller said:

With this, you can write
cases p with | _ m s
as as shorthand for
cases p with | _ m s => ?_

I would like to see the shorthand I suggested for the one-branch case:

cases p with1 m s

Once again the conditions for using with1 are that there is only one branch and that with is not used

Kyle Miller (Apr 06 2025 at 19:15):

Introducing a new keyword to save three characters doesn't seem like something core Lean would go for.

cases p with | _ m s
cases p with1 m s

Maybe the rcases pattern syntax could accept named constructors in addition to anonymous constructor notation. That could give the alternatives

obtain Prod.mk m s := p
-- or perhaps
obtain mk m s := p
-- or, for short
obtain _ m s := p

(I'm assuming that the rcases system would know about custom cases eliminators. I'm not sure how rcases should accept using clauses. The point of rcases is that it's recursive deconstruction, so it seems to me that it shouldn't be a single top-level using clause.)

Potentially there could be a special pattern to do induction too. It's possible to have something like this work:

obtain induct(_ | ⟨n, ih⟩) := n

It's just something that came to mind, and I'm not sure I'd want to see it implemented. But, it's clear that it could be convenient sometimes:

obtain induct(⟨_|_|n, ih⟩ using Nat.strongRecOn) := n

That's strong induction on n and then immediately doing cases on n into the 0, 1, and n+2 cases.

Mario Carneiro (Apr 06 2025 at 19:18):

one possibility is something like haskell view patterns, like obtain (Nat.strongRecOn => ⟨n, ih⟩) := n. It gets a bit cluttered though, I worry a bit about it turning into something unreadable like ssreflect intro patterns

Kyle Miller (Apr 06 2025 at 19:20):

I worry too, but at least in Lean we get to set the tactic infos, letting people step through intermediate states to some degree; my understanding of ssreflect is that you can only see the before and after state.

Jeremy Tan (Apr 11 2025 at 03:44):

Kyle Miller said:

cases p with | _ m s

OK, I'll be content with this

Last updated: May 02 2025 at 03:31 UTC