Zulip Chat Archive

Stream: lean4

Topic: mathport


Daniel Selsam (Jan 29 2021 at 12:24):

@Gabriel Ebner I cannot reproduce your error, but I will create a Dockerfile that runs the entire pipeline to preempt this kind of issue.

Sebastian Ullrich (Jan 29 2021 at 13:08):

Already started working on a Nix build and filling a binary cache at https://app.cachix.org/cache/port34

Daniel Selsam (Feb 02 2021 at 04:32):

I am curious how far we can push heuristic delaboration. Suppose we are able to delaborate into:

  • terms (when reasonably small)
  • have/let
  • calc
  • induction
  • cases
  • rewrite
  • apply
  • simp
  • decision procedures (e.g. ring)
  • ...?

Might these proofs be inspectable/maintainable enough to obviate the need for manual proof-porting? How many other tactics provide significant compression of the proof terms? Note: even this level of delaboration may be hard for various reasons, and may require a lot of search.

Daniel Selsam (Feb 02 2021 at 04:41):

There could be advantages to automatically producing structured proofs. I personally find it difficult to maintain/port proofs that use e.g. refine or apply without braces around the subgoals.

Mario Carneiro (Feb 02 2021 at 04:54):

we have a style guide about that, so it shouldn't come up often

Mario Carneiro (Feb 02 2021 at 04:58):

One way to get tactic proofs to delaborate is to have the tactic put a tag in the theorem saying the tactic's name and arguments. This is a little tricky in lean 3 though because the natural language for that is to put name and expr in the term, in identity-function annotations, but expr at least is meta and so can't appear in most proof terms
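
A minimal Lean 4 sketch of that tagging idea, for illustration only; the tacticTag helper is hypothetical and not an existing mathport or mathlib API:

-- `tacticTag` is an identity function whose string argument records which tactic
-- produced the wrapped subterm; it vanishes under definitional unfolding, so the
-- proof term typechecks exactly as before.
universe u

def tacticTag (tacName : String) {α : Sort u} (a : α) : α := a

example (p q : Prop) (hp : p) (hq : q) : p ∧ q :=
  tacticTag "constructor" ⟨tacticTag "exact" hp, tacticTag "exact" hq⟩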

Rob Lewis (Feb 02 2021 at 10:13):

Yeah, I think focusing on subgoals is probably the least of the worries here. Not that all of mathlib does it perfectly, but it's one of the stricter style rules we have because of the obvious maintenance benefits

Rob Lewis (Feb 02 2021 at 10:14):

The tagging tactics idea is interesting, but it sounds like it would end up a weird hybrid of porting proof terms and porting surface syntax. Would it be easier to hack the Lean 3 parser to output Lean 4-ish proof scripts?

Rob Lewis (Feb 02 2021 at 10:16):

I guess delaboration would work best at the type level, ignoring proofs. What are the prospects of generating file skeletons automatically with this? For a manual porting effort it would still be really useful to have an outline of what you need to update.

Gabriel Ebner (Feb 02 2021 at 10:41):

Automatic delaboration of proofs (= converting proof terms into readable tactic scripts) is definitely an interesting topic that deserves research, and it would also be useful independently of the mathlib porting process.
That said, I believe you can divide mathlib proofs into roughly two camps: 1) one-line term mode proofs, 2) long tactic proofs. Even the Lean 3 delaborator is good enough for (1) typically. But (2) uses a large number of fragile tactics: I would expect the Lean 4 versions of ring, etc. to have slightly different behavior. Their behavior has changed often enough already. There will always be a boundary on which proofs can be automatically ported, and I don't think we can shift this boundary very far.
I'm not sure how desirable it is to delaborate Lean 3 code for a whole module (what Rob suggested I believe), given how much of mathlib is automatically generated: there's automatically generated lemmas (equation lemmas, simps), automatic copy&paste (to_additive), and more (alias, ...). Not to speak of comments, etc.

Rob Lewis (Feb 02 2021 at 17:23):

Gabriel Ebner said:

I'm not sure how desirable it is to delaborate Lean 3 code for a whole module (what Rob suggested I believe), given how much of mathlib is automatically generated: there's automatically generated lemmas (equation lemmas, simps), automatic copy&paste (to_additive), and more (alias, ...). Not to speak of comments, etc.

I'm thinking of this as a first pass before manual effort. Suppose you use mathport to make a Lean 4 target.olean from a Lean 3 source.olean. The goal is now to write a Lean 4 target.lean with the same interface as the olean. Delaborating the definitions and theorem statements gives you this, so you know exactly what you need to fill in. You're right that a lot of stuff is created automatically and we'd have to detect this and look for overlaps, but that doesn't sound unreasonable. Things like simps and to_additive leave behind trails that we can automatically detect (or we can add tags if necessary).

Rob Lewis (Feb 02 2021 at 17:24):

Semantic comments (doc strings) should be part of the delaboration, but yeah, other comments can't be. It would also be hard to capture sections, namespaces, variables, etc.

Rob Lewis (Feb 02 2021 at 17:24):

So this doesn't let you ignore source.lean completely.

Rob Lewis (Feb 02 2021 at 17:26):

Actually, namespaces could get in the way here. If things get delaborated in the root namespace I guess all names will be fully specified, which would be ugly.

Sebastian Ullrich (Feb 02 2021 at 17:28):

Now that we can access imported Lean 3 notations in Lean 4, I believe that parsing mathlib as-is in Lean 4 is definitely feasible. Even if it's just 95% of mathlib that can be parsed, that should give you a much better base to build on (i.e. getting the file to actually elaborate) than a delaborated skeleton I think.

Sebastian Ullrich (Feb 02 2021 at 17:32):

Of course this should then be translated to "actual" Lean 4 syntax so mathlib doesn't end up with some weird Lean 4 accent. This part is not quite as clear to me yet because we haven't built enough tooling in the direction of refactorings so far, but it sounds at least as feasible as any delaboration approach to me.

Gabriel Ebner (Feb 02 2021 at 17:32):

@Sebastian Ullrich What do you think is the best way to do that? Can we just add macros for the Lean 3 syntax? Is there anything that would be hard to do this way?

Gabriel Ebner (Feb 02 2021 at 17:34):

As far as refactoring goes, I hope that the Lean 4 server gets "quickfix" support at some point. Getting rid of the compat layer should just be a question of hitting ctrl-. maniacally then.

Sebastian Ullrich (Feb 02 2021 at 17:39):

Gabriel Ebner said:

Sebastian Ullrich What do you think is the best way to do that? Can we just add macros for the Lean 3 syntax? Is there anything that would be hard to do this way?

For notations the exporter can't handle, user-defined parsers, and tactics (at least their syntax), it really should be as easy as that, yeah. That leaves built-in syntax that is different from Lean 4 as the hardest part. It might be necessary/easier to use the internal Lean.Parser there instead of the simpler syntax command.

Sebastian Ullrich (Feb 02 2021 at 17:43):

Gabriel Ebner said:

As far as refactoring goes, I hope that the Lean 4 server gets "quickfix" support at some point. Getting rid of the compat layer should just be a question of hitting ctrl-. maniacally then.

I do hope we get there one day :) . But apart from editor integration, the part I wasn't sure about yet is that you either need good tooling for translating syntax while preserving/adapting formatting, or you need a great code formatter you can just run on the unformatted result. We have some initial work on both parts, but there's still lots of work to be done.

Gabriel Ebner (Feb 02 2021 at 17:48):

I don't think you need to worry about formatting for an MVP. Just having the editor integration set up is enough to be very useful. People are ecstatic about "Try this:" in Lean 3, and that's way less sophisticated.

Sebastian Ullrich (Feb 02 2021 at 18:20):

Yeah. I definitely want to avoid a "parse Lean 3 syntax now, translate to Lean 4 syntax later" scenario where the "later" never happens. But if we translate & pretty-print immediately, then fix up the formatting while doing the actual porting, the current pretty printer might not be too far from being usable for that.

Sebastian Ullrich (Feb 02 2021 at 18:26):

To give an idea of what the output currently looks like, this is a pretty-printed part of Prelude.lean: https://github.com/leanprover/lean4/blob/master/tests/lean/Reformat.lean.expected.out
Only comments are preserved, all indentation & line breaks are from the pretty printer

Daniel Selsam (Feb 02 2021 at 23:38):

FYI I have been collecting issues at https://github.com/dselsam/mathport/issues. Currently there are three open requests for backports:

Are these backports acceptable? Progress on the porting is mostly bottlenecked on the backports, and also on general rising-tide-lifts-all-ships Lean4 features/tactics/ergonomics.

Mario Carneiro (Feb 02 2021 at 23:41):

I'm on the fence about #3. I think there are at least a few examples where we want the constants to have the stated types, and it sounds like Leo's suggestion will make the terms have the wrong type, which will cause issues with typeclass search down the line

Mario Carneiro (Feb 02 2021 at 23:42):

Are there any @[class] def in the lean 3 core library?

Daniel Selsam (Feb 02 2021 at 23:44):

Mario Carneiro said:

Are there any @[class] def in the lean 3 core library?

Not according to grep.

Leonardo de Moura (Feb 03 2021 at 00:05):

Mario Carneiro said:

I'm on the fence about #3. I think there are at least a few examples where we want the constants to have the stated types, and it sounds like Leo's suggestion will make the terms have the wrong type, which will cause issues with typeclass search down the line

The example that Daniel posted on the github thread can be "fixed" by hand.
Note that we only need to unfold definitions in an inductive declaration C in the following two cases:
1- The header is not producing a Type/Prop
2- The constructor type contains an application of the form f ... (C ...) ... and f is a definition.

How often does this happen in mathlib? Case 1 seems quite rare. I can see Case 2 occurring in mathlib inductive predicates, but how many do you have?

This feature requires kernel modifications, including support for nested inductives, and patching code that makes assumptions that will not be true anymore after we change the kernel. Daniel already mentioned that many modules broke after he tried to patch the kernel.

We have to prioritize, I have zillions of things to do, and I will not merge kernel hacks to support this.

Mario Carneiro (Feb 03 2021 at 01:01):

In case 1, shouldn't it be a typechecking error? If it's literally not a Type/Prop even after unfolding then it shouldn't be accepted by the kernel. Case 2 does happen, although I'm not sure about the frequency - perhaps Daniel has statistics on this, because my grep skills aren't up to the task atm. Of course we use lots of inductive types (I get 700 hits in mathlib) but most of these are not problematic because they are manifestly pi types targeting Prop/Type. I had an example of this in mind from the QPF project but I don't think it survived the PR process.

A related situation that might come up in less contrived situations is constructors that don't manifestly target the right type, because they have a let in them, for example. I have definitely written code like that in software verification settings, where you have large and complex inductive types, although when it comes time to actually prove properties about the inductive type I usually find it better to have explicit equality assumptions in the constructors.

One way to support this without kernel hacks would be to make the kernel inductive behind the scenes with these things unfolded, and then redeclare the type and constructors with the given types and prove them by defeq. That said, I guess part of the point of adding nested / mutual inductives to the kernel was to avoid this kind of facade pattern, and I agree that this isn't a big priority. I'll do a once-over of mathlib to see if the existing uses can be fixed, and report back if something really critical depends on it.

Leonardo de Moura (Feb 03 2021 at 01:06):

In case 1, shouldn't it be a typechecking error? If it's literally not a Type/Prop even after unfolding then it shouldn't be accepted by the kernel.

In case 1, I meant the type after elaboration is not of the form ... -> Sort u for some u.
In Daniel's example, the type was set (set a) after elaboration. After unfolding the first set, we have set a -> Prop.
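
A small self-contained sketch of what case 1 looks like, with MySet standing in for Lean 3's set; the inductive is left commented out because Lean 4 rejects it, which is exactly the point under discussion:

-- `MySet` unfolds to a function type into Prop, but its header is not syntactically a Sort.
def MySet (α : Type) := α → Prop

-- Case 1: the declared target `MySet (MySet Nat)` only becomes `MySet Nat → Prop` after
-- unfolding the outer `MySet`. Lean 3 performs that unfolding; Lean 4's kernel does not.
-- inductive AllSingletons : MySet (MySet Nat)
--   | intro : ∀ x : Nat, AllSingletons (fun y => y = x)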

Leonardo de Moura (Feb 03 2021 at 01:08):

I have definitely written code like that in software verification settings

We have to prioritize. Let's focus on mathlib only.

Leonardo de Moura (Feb 03 2021 at 01:09):

One way to support this without kernel hacks would be to make the kernel inductive

I know what has to be done. However knowing and doing are two completely different things ;)

Mario Carneiro (Feb 03 2021 at 01:14):

Regarding Daniel's set (set A) example specifically, this is exploiting definitional equality of set A, which is bad style in mathlib anyway (because it messes up the use of \in and {x | p}). So I don't think there will be any problem removing that use.

Leonardo de Moura (Feb 03 2021 at 01:16):

What about this one?

def presieve (X : C) := Π ⦃Y⦄, set (Y ⟶ X)
-- ...
inductive singleton : presieve X
| mk : singleton f

Mario Carneiro (Feb 03 2021 at 01:20):

cc: @Johan Commelin @Bhavik Mehta , how important is the definitional equality of presieve X here?

Mario Carneiro (Feb 03 2021 at 01:20):

I don't work with the category theory library much

Mario Carneiro (Feb 03 2021 at 01:24):

I think I'll just try the facade pattern manually for that one

Bhavik Mehta (Feb 03 2021 at 01:25):

I haven't been following this discussion so I don't really understand the suggested change, but I think as long as there's a coe or other nice notation for presieves and the definition of singleton is via some form of inductive, it should all be fine

Johan Commelin (Feb 03 2021 at 05:30):

I'll defer to Bhavik :wink:

Sebastien Gouezel (Feb 03 2021 at 08:07):

About https://github.com/dselsam/mathport/issues/7, can this be simulated by introducing a new constant and an axiom that it is provably equal to the object we have constructed? Setting the irreducible attribute after the construction is something we use in mathlib for several complicated objects (like the reals, the integral, and so on) whose construction is involved but for which the construction can be forgotten once a basic API has been established. This is something that we will still need to do in Lean4.

Johan Commelin (Feb 03 2021 at 08:18):

I think the proposal is to have something like that: real which is irreducible, and real_implementation where we set up the api.

Gabriel Ebner (Feb 03 2021 at 08:19):

@Sebastien Gouezel We can add a macro for the following:

noncomputable constant pi_core : { x // x = 2 * classical.some exists_cos_eq_zero } := ⟨_, rfl⟩
noncomputable def pi : real := pi_core.1
theorem pi_def : pi = 2 * classical.some exists_cos_eq_zero := pi_core.2

Gabriel Ebner (Feb 03 2021 at 08:32):

macro "irreducible_def" name:ident " : " type:term " := " decl:term : command =>
`(noncomputable constant wrapped : { x : $type // x = $decl } := ⟨_, rfl⟩
noncomputable def $name : $type := wrapped.1
protected theorem $(mkIdentFrom name (name.getId ++ `eqn)) : $name = $decl := wrapped.2)

irreducible_def pi : real := 2 * classical.some exists_cos_eq_zero
example : pi = 2 * classical.some exists_cos_eq_zero := pi.eqn

Sebastien Gouezel (Feb 03 2021 at 10:43):

I still need to see how this would work to define, say, ℝ. Would this lead to DTT hell when constructing the basic API?

Gabriel Ebner (Feb 03 2021 at 10:47):

For ℝ it is easy, we can define constant reals : (α : Type) × conditionally_complete_linear_ordered_field α and then say def ℝ := reals.1. There isn't even any need to expose the implementation here.
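
A minimal compilable sketch of that pattern in current Lean 4, where opaque plays the role of Lean 3's constant and MyCompleteField is a stand-in for conditionally_complete_linear_ordered_field:

-- Pack a carrier type together with its structure, hide the packing behind `opaque`,
-- and only ever use the two projections.
class MyCompleteField (α : Type) where
  zero : α

opaque realsImpl : (α : Type) × MyCompleteField α := ⟨Unit, ⟨()⟩⟩

def MyReal : Type := realsImpl.1
instance : MyCompleteField MyReal := realsImpl.2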

Gabriel Ebner (Feb 03 2021 at 10:49):

I'm more worried about wrappers such as with_one (or opposite).

Gabriel Ebner (Feb 03 2021 at 10:49):

Both unop (op X) = X and op (unop X) = X are definitional equalities. Notably, every object of the opposite category is definitionally of the form op X, which greatly simplifies the definition of the structure of the opposite category, for example.

Gabriel Ebner (Feb 03 2021 at 10:51):

Maybe the category theory people can chime in on this. Would a semireducible opposite be enough? @Bhavik Mehta @Reid Barton @Scott Morrison

Gabriel Ebner (Feb 03 2021 at 10:54):

We can experiment with this today in mathlib. This is essentially the API we'll have in Lean 4 (maybe we need to write irreducible_def instead, but that's just syntax):

-- immediately mark as irreducible
@[irreducible] def foo := 3
-- never use `local attribute [reducible]`, instead use `rw`/`simp`/`unfold`
example : 2 * foo = 6 := by rw [foo]; refl

Reid Barton (Feb 03 2021 at 12:50):

Is this different from making opposite a structure?

Gabriel Ebner (Feb 03 2021 at 12:51):

Making it irreducible (and never overriding it locally) is about the same as making it a structure. From what I understand, that would be horrible.

Gabriel Ebner (Feb 03 2021 at 12:51):

The question is if it would be possible to make it semireducible instead of irreducible (i.e. just a plain def).

Reid Barton (Feb 03 2021 at 12:53):

Semireducible was what we had originally and it led to lots of confusion because, if you don't have a clear distinction between A B : C and A B : opposite C, then you don't know whether A \hom B is a map from A to B or a map from B to A.

Reid Barton (Feb 03 2021 at 12:55):

I'm not entirely sure how bad it would be to use a structure; from what I remember, I found the current setup with irreducible provided the most definitional equalities and so we built opposite on top of it. There was never a time that we seriously tried using a structure instead, since there was no need to.
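
For reference, a hedged Lean 4 sketch of the wrapper-structure alternative under discussion (names are illustrative, not a claim about what mathlib should do):

universe u

structure Opposite (α : Type u) where
  op ::
  unop : α

variable {α : Type u}

-- The projection round-trip is definitional, and with eta for structures in current Lean 4
-- so is the other direction; what is lost is `Opposite α` being defeq to `α` itself.
example (x : α) : (Opposite.op x).unop = x := rfl
example (y : Opposite α) : Opposite.op y.unop = y := rfl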

Reid Barton (Feb 03 2021 at 12:56):

Now we have a lot of code using opposite, so we can find out experimentally whether that comment is really correct.

Reid Barton (Feb 03 2021 at 12:59):

For stuff like with_one I think we just use this irreducibility to get a few things about option for free, like the functor/monad instances. I don't think it's a big deal, we can just reimplement them in a few lines (for each of the relevant with_*) manually.

Gabriel Ebner (Feb 03 2021 at 13:01):

I've tried making is_O and friends irreducible in #6021. It's not pretty but not awful either. This was probably the easiest irreducible to remove. @Sebastien Gouezel where would you expect the most DTT hell from making definitions irreducible immediately?

Gabriel Ebner (Feb 03 2021 at 13:05):

additive and multiplicative are another minefield. Maybe it's time to revive #2292 now.

Sebastien Gouezel (Feb 03 2021 at 13:20):

I'd be pretty scared of data/real/basic.lean, but maybe I'm too pessimistic.

Eric Wieser (Feb 03 2021 at 13:25):

Related to docs#additive, I remember a pr about docs#order_dual where I tried to come up with an alternative proof for something to avoid relying on definitional equality, but found all the lemmas I needed were missing

Reid Barton (Feb 03 2021 at 13:38):

BTW, my general preference in these matters is to use structure rather than irreducible, since everyone knows about structure but I'm not sure I even really understand what irreducible does and doesn't do.

Kevin Buzzard (Feb 03 2021 at 14:10):

Yes, I also feel like I don't understand irreducible. I made with_zero and with_one irreducible in #3883 because I was finding them frustrating to work with. People were abusing type equality in code and the resulting "not strictly speaking type correct in some sense" code would cause leaks, where I was trying to prove things about 0 or 1 but was staring at goals which talked about none. I would have been equally happy simply ripping out the definition with_zero X := option X and replacing it with a new inductive type with constructors zero and some x, but at the time people discouraged me from doing this because it would have led to a bunch of code duplication. Right now we steal a bunch of proofs from option when making the with_zero API and then mark it irreducible at the end. I should apologise in advance if this is not relevant, I do not really understand the issues of what is being talked about here.

Daniel Selsam (Feb 03 2021 at 14:36):

but I'm not sure I even really understand what irreducible does and doesn't do.

also, there is no irreducible in Lean4, only constant which is truly opaque (even to the kernel)

Gabriel Ebner (Feb 03 2021 at 14:40):

Sebastien Gouezel said:

I'd be pretty scared of data/real/basic.lean, but maybe I'm too pessimistic.

Ok this was easy: #6024.

Patrick Massot (Feb 03 2021 at 14:54):

The real test will be to completely remove the Cauchy reals and replace them with Bourbaki reals that we also have.

Gabriel Ebner (Feb 03 2021 at 14:58):

Bourbaki = the reals are the unique complete Archimedean ordered field?

Sebastien Gouezel (Feb 03 2021 at 14:59):

The updated definition by Gabriel is the following:

structure real := of_cauchy ::
(cauchy : @cau_seq.completion.Cauchy ℚ _ _ _ abs _)

Can someone explain what this is? What are these ::?

Gabriel Ebner (Feb 03 2021 at 15:00):

The default constructor for a structure is called mk. But real.mk already means something different. The :: allows you to rename the constructor.
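
The same renaming in Lean 4 syntax, as a small illustration (the field type is just a stand-in):

-- `ofCauchy ::` renames the structure's constructor from the default `mk`.
structure MyReal where
  ofCauchy ::
  cauchy : Nat

#check @MyReal.ofCauchy        -- MyReal.ofCauchy : Nat → MyReal
#check fun (x : MyReal) => x.cauchy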

Daniel Selsam (Feb 03 2021 at 15:01):

Gabriel Ebner said:

Sebastien Gouezel said:

I'd be pretty scared of data/real/basic.lean, but maybe I'm too pessimistic.

Ok this was easy: #6024.

Great! I still see irreducible though -- is it possible to wrap lt in a constant?

Gabriel Ebner (Feb 03 2021 at 15:01):

Not in Lean 3. I don't want to add hundreds of noncomputable annotations.

Reid Barton (Feb 03 2021 at 15:02):

lt is a Prop though, why would it matter?

Gabriel Ebner (Feb 03 2021 at 15:02):

The idea here is to only use irreducible in such a way that we can simulate in Lean 4. For example using the irreducible_def macro I posted above.

Gabriel Ebner (Feb 03 2021 at 15:03):

Ah then it might be possible. But I don't see why it matters. We don't want to add hacks now just to remove them again in a few months.

Sebastien Gouezel (Feb 03 2021 at 15:03):

In which sense is this irreducible, though? If Lean sees an equality x + y = z and wants to check it definitionally, it will never unfold by itself the definition of addition on the fields of the structure to check if they match after this unfolding, right?

Sebastien Gouezel (Feb 03 2021 at 15:05):

Also, #6024 doesn't build ;-)

Gabriel Ebner (Feb 03 2021 at 15:05):

Oops, so + is irreducible now, and I accidentally made it semireducible.

Daniel Selsam (Feb 03 2021 at 15:10):

Gabriel Ebner said:

The idea here is to only use irreducible in such a way that we can simulate in Lean 4. For example using the irreducible_def macro I posted above.

If I am reading https://github.com/leanprover-community/mathlib/blob/f8dadb5c855646f5a96e3551737801dbafb4fbd3/src/data/real/basic.lean#L87-L98 correctly, it looks like you are manually unfolding lt after-the-fact. So if the auto-porter used your macro to define lt, the downstream proof would break.

Reid Barton (Feb 03 2021 at 15:12):

I'm going to experiment with opposite-as-structure later today.

Gabriel Ebner (Feb 03 2021 at 15:12):

simp [real.lt] uses the equation lemma. The macro also produces an equation lemma real.lt.eqn.

Gabriel Ebner (Feb 03 2021 at 15:12):

There should be no definitional unfolding happening.

Daniel Selsam (Feb 03 2021 at 15:33):

I may be missing something obvious, but your macro doesn't seem to guarantee that the equation lemma of the constant and the equation lemma(s) of the original definition line up. How about this:

  • when mathport reaches a definition, it looks ahead to see if it is immediately declared irreducible
  • if so, it collects all equation lemmas, and creates a constant { x // all equation lemmas hold }
  • it then creates a def for (new constant).val and one for each of the eqn lemmas (new constant).property[...]
  • it then aligns the old decl and equation lemmas with the new ones
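
A hedged sketch of what that generated output could look like for a definition with a single equation lemma; all names are invented for illustration, with opaque playing the role of constant:

-- Suppose the Lean 3 source had `def foo : ℕ := 3`, immediately marked irreducible,
-- with the single equation lemma `foo = 3`. The packed translation would be roughly:
opaque fooPacked : { x : Nat // x = 3 } := ⟨3, rfl⟩

def foo : Nat := fooPacked.val
theorem foo.eqn : foo = 3 := fooPacked.property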

Daniel Selsam (Feb 03 2021 at 15:34):

FYI I will be offline today but back tomorrow.

Gabriel Ebner (Feb 03 2021 at 15:43):

There's typically only a single equation for irreducible definitions, but yes, that sounds good. The macro was intended as an illustration for "this is how it will be after we've ported mathlib", not a suggestion for the porting tool.

Patrick Massot (Feb 03 2021 at 17:16):

Gabriel Ebner said:

Bourbaki = the reals are the unique complete Archimedean ordered field?

No, that wouldn't be a construction but a characterization. Bourbaki constructs the real numbers as the completion of the topological ring of rational numbers, using a lot of the topology library.

Gabriel Ebner (Feb 03 2021 at 17:25):

Is this what Johannes originally did in mathlib? I believe Mario removed that construction in favor of the current elementary one to reduce dependencies.

Patrick Massot (Feb 03 2021 at 17:36):

Johannes originally did a somewhat hybrid version, not going full Bourbaki. Mario added an elementary construction because there were huge performance issues. This was before we started to mark stuff irreducible.

Johan Commelin (Feb 03 2021 at 17:47):

I agree that it is a bit ridiculous that the definition of real numbers in Lean is understandable by freshmen students... clearly we should take inspiration from integration theory and do things properly. :rofl: :rofl: [/trolling]

Daniel Selsam (Feb 12 2021 at 02:22):

Updates and issues:

  • I back-ported a minor adjustment to dite and ite but haven't PRed yet (not sure what the protocol is for multi-repo changes):

  • Lean4's < doesn't match lean3's < on nats. In lean3, nat.lt and nat.le are inductive whereas Nat.lt and Nat.le are defined as the boolean versions returning true. I think this instance clash will cause big problems and needs to be fixed.

  • The number-mismatch is more problematic than I realized at first. Lean3 uses zero, one, bit0 and bit1, whereas Lean4 uses OfNat α n. It is not trivial to auto-convert during the porting. I tried making OfNat instances from has_zero, has_one, and has_add, but it relies on a recursive function with idealized type Π [has_one α] [has_add α], ℕ → α. For easy auto-porting, this function must permit reasonably efficient reduction in the kernel. Thus, it cannot be well-founded or partial. But if it relies on fuel, it becomes impossibly expensive to confirm 2 + 2 = 4 in the kernel.

  • To align applicative and monad, all that needs to happen is that seq_left and seq_right must require the discarded action to return unit. It is easy to change, but there is some theory based on lawful_applicative and lawful_monad that broke, so ideally somebody who works on that part of the library can finish propagating the change.

  • The current lack of string->String alignment is not catastrophic but it is unpleasant, since e.g. auto_params will show as lists of nats. A few issues are blocking this alignment.

    1. fin is a def in lean3 whereas Fin is a structure in lean4. Note that it is important for it to be a structure in lean4 because there is special support in the system for it in various places, and so Fin needs to be the whnf.
    2. unsigned is a def in lean3 whereas UInt32 is a structure in lean4 for the same reason as Fin
    3. The nat.lt issue also complicates this.
    4. The general number mismatch also complicates this.

Daniel Selsam (Feb 12 2021 at 02:26):

Are there any updates on the two open backports?

- https://github.com/dselsam/mathport/issues/7
- https://github.com/dselsam/mathport/issues/4

Note that I switched https://github.com/dselsam/mathport/issues/3 from backport to blocked, pending a final decision about what to support on the lean4 side.

Mario Carneiro (Feb 12 2021 at 02:30):

Regarding dselsam/mathport#4, there were two PRs handling this; #6028 was merged and covered most of the occurrences I could find with a quick grep, and #6061 added a linter and removed all remaining occurrences. However #6061 has not been merged as there is some mysterious typeclass breakage in an unrelated area

Mario Carneiro (Feb 12 2021 at 02:34):

As for things like Nat.lt vs nat.lt, as I mentioned to you before we have to find a way to handle non-defeq replacements, defeq is setting way too high a bar. For propositions it's feasible that we can prove an equality and simp it away in lemmas

Mario Carneiro (Feb 12 2021 at 02:35):

I think backporting Nat.lt is out of the question, that would be way too disruptive

Mario Carneiro (Feb 12 2021 at 02:38):

Also, a < b := a + 1 <= b is super convenient and used all the time. Could we maybe have a conversation about why these were changed and whether we need to be doing something different with decidable predicates in general?

Bryan Gin-ge Chen (Feb 12 2021 at 02:40):

For "multi-repo changes", we have been first merging changes to Lean 3, then releasing a new version of Lean 3, and then finally making the necessary changes to mathlib in a PR that bumps the Lean version.

Bryan Gin-ge Chen (Feb 12 2021 at 03:00):

Regarding fin, we changed it from a structure to a def because of some inconveniences expressed in lean#359. @Johan Commelin can probably say more.

Daniel Selsam (Feb 12 2021 at 03:02):

nat.lt would need to be aligned for a def-eq alignment of string to String, but otherwise wouldn't need to be. The issue is the clashing instances. Would you just want to ignore the Lean4 instance and have m < n mean the Mathlib nat.lt? This may be problematic, since Lean4 may internally synthesize decidable instances of things at various times and now may end up with ones that don't use the Nat primitives that are builtin to the kernel, e.g. Nat.blt.

Mario Carneiro (Feb 12 2021 at 03:04):

My first choice would be to get Nat.lt to have the same definition as in mathlib, with a decidable instance following the lean 4 style

Mario Carneiro (Feb 12 2021 at 03:04):

I suspect the current definition of Nat.lt was chosen because it's the easiest definition that adheres with Nat.blt

Mario Carneiro (Feb 12 2021 at 03:05):

but we should be able to prove the equivalence of the two definitions and use this equivalence in the implementation of the decidable instance for Nat.lt

Mario Carneiro (Feb 12 2021 at 03:07):

I don't think it matters as much whether Nat.le retains the lean 3 definition vs lean 4, because the underlying inductive type nat.less_than_or_equal is an internal implementation detail that is used extremely rarely in mathlib

Mario Carneiro (Feb 12 2021 at 03:12):

Re: applicative and monad, could you put your work on a branch and/or PR what you have?

Daniel Selsam (Feb 12 2021 at 03:29):

Mario Carneiro said:

Re: applicative and monad, could you put your work on a branch and/or PR what you have?

Here is the basic change that needs to be propagated: https://github.com/dselsam/lean3/commit/7e8ead2c0d371ddacef7db30e7dd68d52f4b768f Note that this change breaks within the lean3 repo.

Daniel Selsam (Feb 12 2021 at 03:49):

Mario Carneiro said:

Also, a < b := a + 1 <= b is super convenient and used all the time. Could we maybe have a conversation about why these were changed

Nat.lt is also defined that way in Lean4: https://github.com/leanprover/lean4/blob/master/src/Init/Prelude.lean#L536-L537

Daniel Selsam (Feb 12 2021 at 03:50):

@Mario Carneiro

I think backporting Nat.lt is out of the question, that would be way too disruptive

I don't think it matters as much whether Nat.le retains the lean 3 definition vs lean 4, because the underlying inductive type nat.less_than_or_equal is an internal implementation detail that is used extremely rarely in mathlib

Can you please clarify, what would be way too disruptive and what doesn't matter that much?

Daniel Selsam (Feb 12 2021 at 03:57):

Mario Carneiro said:

but we should be able to prove the equivalence of the two definitions and use this equivalence in the implementation of the decidable instance for Nat.lt

Meaning something like this?

axiom Nat.le2 : Nat → Nat → Prop
axiom Nat.leOfTrue  {m n : Nat} : Nat.ble m n = true → Nat.le2 m n
axiom Nat.leOfFalse {m n : Nat} : ¬ (Nat.ble m n = true) → ¬ Nat.le2 m n

instance (m n : Nat) : Decidable (Nat.le2 m n) :=
  if h : Nat.ble m n = true then isTrue (Nat.leOfTrue h)
  else isFalse (Nat.leOfFalse h)

Mario Carneiro (Feb 12 2021 at 04:34):

Can you please clarify, what would be way too disruptive and what doesn't matter that much?

Changing the definition of nat.lt in lean 3 to make it not validate a < b =?= a + 1 <= b would break lots and lots of things. Changing the definition of nat.le would not break much because nat.less_than_or_equal is not (or should not be) used directly in proofs

Mario Carneiro (Feb 12 2021 at 04:36):

Meaning something like this?

Yes. We should be able to slot in any convenient definition into Nat.le2, as long as those two theorems Nat.leOfTrue and Nat.leOfFalse are provable, and we should see no performance penalty

Mario Carneiro (Feb 12 2021 at 04:38):

On the assumption that lean 4 isn't doing much with proofs about Nat.le, it would be relatively easier to forward port nat.less_than_or_equal to lean 4 than backport Nat.ble a b = true to lean 3, which will break the foundational theorems plus the handful of mathlib theorems that use this inductive definition

Daniel Selsam (Feb 12 2021 at 04:42):

Changing the definition of nat.lt in lean 3 to make it not validate a < b =?= a + 1 <= b would break lots and lots of things

To reiterate, that is how Nat.lt is defined in Lean4 as well so this would not be an issue.

Mario Carneiro (Feb 12 2021 at 04:42):

right

Daniel Selsam (Feb 12 2021 at 04:46):

Mario Carneiro said:

On the assumption that lean 4 isn't doing much with proofs about Nat.le, it would be relatively easier to forward port nat.less_than_or_equal to lean 4 than backport Nat.ble a b = true to lean 3, which will break the foundational theorems plus the handful of mathlib theorems that use this inductive definition

I can't assess how hard it would be to backport, but I don't see an issue with forward-porting https://leanprover.zulipchat.com/#narrow/stream/270676-lean4/topic/mathport/near/226095866 Leo may not want the extra complexity though.

Mario Carneiro (Feb 12 2021 at 04:46):

Regarding seq_left, can we just not align it? Or have a separate typeclass for the notation? It seems like it's just a pessimization. I recall @Sebastian Ullrich had a rationale for this change but I forget what it was and whether it applies to the *> notation as separate from do notation

Mario Carneiro (Feb 12 2021 at 04:49):

Plus do notation has been extended so radically in lean 4 since the change was first made that perhaps the rationale doesn't apply anymore

Daniel Selsam (Feb 12 2021 at 04:50):

Mario Carneiro said:

Regarding seq_left

  • It is not necessary to align it. The downside is that there will be two of many different classes, and the category-theoretic monad won't be connected to the lean Monad
  • If you wanted, you could make the constructors align but then (still in lean3) use the notation for a definition of the original type (that adds a discard). You would still need to update the lawful types and patch a few proofs though

Daniel Selsam (Feb 12 2021 at 04:52):

I don't remember the rationale either :)

Mario Carneiro (Feb 12 2021 at 04:52):

I'm hoping we can align Monad and monad with lean 3 seq_left being a separate thing

Mario Carneiro (Feb 12 2021 at 04:53):

perhaps a *> b is defined over applicatives as Monad.seq_left (discard a) b

Mario Carneiro (Feb 12 2021 at 04:54):

with Monad.seq_left having no particular notation

Daniel Selsam (Feb 12 2021 at 04:54):

Mario Carneiro said:

perhaps a *> b is defined over applicatives as Monad.seq_left (discard a) b

I think that is what I was suggesting above.

  • If you wanted, you could make the constructors align but then (still in lean3) use the notation for a definition of the original type (that adds a discard). You would still need to update the lawful types and patch a few proofs though

Mario Carneiro (Feb 12 2021 at 04:55):

aha ok

Mario Carneiro (Feb 12 2021 at 04:56):

even though seq_left is a field of the monad structure I don't think anyone ever gives them a non-default implementation

Daniel Selsam (Feb 12 2021 at 04:57):

Even without that trick, I think adding the discards manually is just a few minutes with egrep and not a big issue. The somewhat harder part is that the lawful type needs to change and a few proofs need to be patched.

Mario Carneiro (Feb 12 2021 at 04:57):

I think it's probably not hard but it's going to make mathlib worse. It's not a desirable change

Daniel Selsam (Feb 12 2021 at 05:03):

Hold off then until you can discuss with Sebastian.

Daniel Selsam (Feb 12 2021 at 05:03):

Do you have thoughts about the number-mismatch?

Mario Carneiro (Feb 12 2021 at 05:08):

I'm not sure I understand the way you set up the problem. What is this function Π [has_one α] [has_add α], ℕ → α that you need? That looks like OfNat and it's already got an efficient implementation in lean 4

Mario Carneiro (Feb 12 2021 at 05:09):

To interpret lean 3 numeral expressions you just need bit0 and bit1, and the corresponding functions n + n and n+n+1 are passably fast in the lean 4 kernel

Daniel Selsam (Feb 12 2021 at 05:25):

I meant something like this:

class HasZero (α : Type u) := (zero : α)
class HasOne  (α : Type u) := (one : α)

universes u
variable {α : Type u} [HasZero α] [HasOne α] [Add α]

def bit0 (x : α) : α := x + x
def bit1 (x : α) : α := bit0 x + HasOne.one

instance [HasOne α] : Inhabited α := ⟨HasOne.one⟩

def nat2bitsFuelAux (fuel n : Nat) : α :=
  match fuel with
  | 0        => arbitrary
  | fuel + 1 =>
    if n == 0 then arbitrary
    else if n == 1 then HasOne.one
    else if n % 2 == 1 then bit1 (nat2bitsFuelAux fuel (n / 2))
    else bit0 (nat2bitsFuelAux fuel (n / 2))

-- note: partial and wf don't compute in the kernel
def nat2bitsFuel (n : Nat) : α := nat2bitsFuelAux 10 n

-- already takes a while, and 10 fuel is not enough for all of mathlib (real.pi has big numbers)
example : (nat2bitsFuel 2 : α) + (nat2bitsFuel 2 : α) = nat2bitsFuel 4 := rfl

Daniel Selsam (Feb 12 2021 at 05:28):

I tried replacing bit0/bit1 expressions with variations of OfNat.ofNat <type> <nat> <instNat2bitsInAlpha> which are definitionally equal and fine for isolated numbers, but basic arithmetic (e.g. 2 + 2 =?= 4) is intractable for the kernel. This comes up in e.g. zero_lt_four.

Mario Carneiro (Feb 12 2021 at 05:31):

So I'm thinking more like this:

class HasZero (α : Type u) := (zero : α)
class HasOne  (α : Type u) := (one : α)

universes u
variable {α : Type u} [HasZero α] [HasOne α] [Add α]

def bit0 (x : α) : α := x + x
def bit1 (x : α) : α := bit0 x + HasOne.one

instance instZero2Nat : OfNat α (noindex! 0) := ⟨HasZero.zero⟩
instance instOne2Nat  : OfNat α (noindex! 1) := ⟨HasOne.one⟩

example : (bit0 1 : α) + (bit0 1 : α) = (bit0 (bit0 1) : α) := rfl

Daniel Selsam (Feb 12 2021 at 05:32):

But now in Lean4, you have bit0 and bit1 everywhere instead of numbers.

Mario Carneiro (Feb 12 2021 at 05:33):

I would like to make

example : (2 : α) + (2 : α) = (4 : α) := rfl

work, but this requires instances of OfNat A 2 and OfNat A 4 and it's not clear to me how lean 4 expects me to build a countable infinity of typeclasses

Mario Carneiro (Feb 12 2021 at 05:34):

I suspect this might be a design flaw re: OfNat instances. It's not necessarily an option to inhabit OfNat A n directly, if all you have to start from is a ring, say

Mario Carneiro (Feb 12 2021 at 05:35):

If instances can be constructed by binary recursion then that might work, but what is the structure of lean 4 nat literals that can be pattern matched on?

Daniel Selsam (Feb 12 2021 at 05:35):

This works in principle (and has kernel def-eq) but it is just too slow: https://github.com/dselsam/mathport/blob/strings2/Lib4/PrePort/Number.lean#L36-L50

Mario Carneiro (Feb 12 2021 at 05:36):

No I mean that the instance should depend on the number

Daniel Selsam (Feb 12 2021 at 05:38):

I don't follow, but I need to sign off now. To be continued.

Daniel Selsam (Feb 12 2021 at 05:38):

Thanks for your help.

Mario Carneiro (Feb 12 2021 at 05:42):

instance : HasOne Nat := ⟨1⟩

instance instZero2Nat : OfNat α (noindex! 0) := ⟨HasZero.zero⟩
instance instOne2Nat  : OfNat α (noindex! 1) := ⟨HasOne.one⟩
instance instBit0Nat [OfNat α n] : OfNat α (bit0 n) := ⟨bit0 (OfNat.ofNat n)⟩
instance instBit1Nat [OfNat α n] [HasOne α] : OfNat α (bit1 n) := ⟨bit1 (OfNat.ofNat n)⟩

example : OfNat α 37 :=
(inferInstance : OfNat α (bit1 $ bit0 $ bit1 $ bit0 $ bit0 1))

Mario Carneiro (Feb 12 2021 at 05:42):

I want the inferInstance line to actually be constructed by typeclass inference

Mario Carneiro (Feb 12 2021 at 05:43):

There might be a way to get unification hints to do this?

Mario Carneiro (Feb 12 2021 at 05:45):

Well, maybe it suffices for mathport to do its own fake typeclass inference here, although using OfNat with such an instance is barely better than the original approach of writing bit0/bit1 chains

Johan Commelin (Feb 12 2021 at 06:07):

Bryan Gin-ge Chen said:

Regarding fin, we changed it from a structure to a def because of some inconveniences expressed in lean#359. Johan Commelin can probably say more.

I refactored fin to be a subtype because there is quite a bunch of API for subtypes and I didn't see the point in duplicating all of it for fin.

Daniel Selsam (Feb 12 2021 at 17:29):

Mario Carneiro said:

using OfNat with such an instance is barely better than the original approach of writing bit0/bit1 chains

I think it is essential that we can write e.g. 2 instead of bit0 1, at least when there is a natural ∀ n, OfNat α n instance, e.g. nats, rationals, reals
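
For concreteness, a hedged sketch of such a blanket instance in current Lean 4, with MyRat as an illustrative stand-in:

-- A type with an obvious embedding of the naturals gets one OfNat instance covering every literal.
structure MyRat where
  num : Int
  den : Nat

def MyRat.ofNat (n : Nat) : MyRat := ⟨Int.ofNat n, 1⟩

instance (n : Nat) : OfNat MyRat n := ⟨MyRat.ofNat n⟩

#check (37 : MyRat)   -- elaborates via the blanket instance, no bit0/bit1 chain needed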

Daniel Selsam (Feb 12 2021 at 17:39):

It wouldn't be so bad for Mathlib4 and Lean4 to have different definitions of Fin. I would like to align strings though to avoid the auto_param garbage prints. I see two other options (besides backporting Fin-as-structure):

  • Since string is almost never used in actual math, perhaps we can just create a Lean4-friendly Fin type fin4 and have characters use that instead of the regular fin.
  • Or we could just not port anything that depends on the def-eq details of characters that would be changed by auto-aligning.

Daniel Selsam (Feb 12 2021 at 17:56):

Mario Carneiro said:

There might be a way to get unification hints to do this?

I don't think this is supported. The following

unif_hint (m n : Nat) where (m / 2) =?= n |- m =?= bit0 n

is not accepted because ?m =?= bit0 (?m / 2) does not hold definitionally.

Mario Carneiro (Feb 12 2021 at 19:04):

Alternatively, we could get unification to somehow "directly" support n+k patterns, i.e. n*2+3 =?= 17 matches n =?= 7. Without something like this, the whole idea of typeclasses reflecting on syntax interacts badly with the unstructured space of lean 4 nat literals; there's no way to write a finite set of instances that covers all natural numbers by induction

Mario Carneiro (Feb 12 2021 at 19:06):

If typeclass inference can call arbitrary tactics, that would be another way to support such a thing. I'm not sure where the design landed on that point. @Sebastian Ullrich Do you know?

Mario Carneiro (Feb 12 2021 at 19:10):

That unification hint isn't quite right, it needs to be specialized to numerals, i.e.

unif_hint (n : Nat) where 3 =?= n |- 6 =?= bit0 n
unif_hint (n : Nat) where 4 =?= n |- 8 =?= bit0 n
unif_hint (n : Nat) where 5 =?= n |- 10 =?= bit0 n
...

I think you can see why this needs special elaborator support. Nat literals have special kernel support and so other aspects have to be similarly lifted to compensate

Sebastian Ullrich (Feb 14 2021 at 14:38):

Mario Carneiro said:

If typeclass inference can call arbitrary tactics, that would be another way to support such a thing. I'm not sure where the design landed on that point. Sebastian Ullrich Do you know?

There are no plans to do that anymore. Making the two systems mutually dependent would be insanity in the implementation, both on the term and (monad) type level. Having typeclass resolution and unification be interdependent is already complex enough.

Mario Carneiro (Feb 14 2021 at 22:51):

What do you mean by mutually dependent? What I am (coming around to) suggesting is more like a dynamically created typeclass instance, which is triggered on some syntactic condition that fits into the discrimination tree like prime (ofNat n) and calls a procedure that determines if this "instance" applies or not and returns a proof if so. It interacts minimally with the rest of the system, it's not delayed but is tried in order with other instances. These extensions can also live in the same monad as whatever typeclasses normally see, it doesn't have to be a full blown TacticM.

Sebastian Ullrich (Feb 15 2021 at 08:57):

Ok, that's a pretty different proposal

Daniel Selsam (Feb 15 2021 at 15:40):

@Mario Carneiro are you proposing something like this?

structure InstanceGenerator (α : Type) where
  generate : Expr → MetaM (Option Expr) /- maybe just `Option Expr` -/

@[instanceGenerator]
def generateOfNat {α : Type} [One α] [Add α] (n : Nat) : InstanceGenerator (OfNat α n) := {
  generate goal = do
    -- if n is concrete, do the recursion here to produce the instance in terms of bit0/bit1
    sorry
}

And then this thing is inserted into the DTree for the α, and tried like any other instance, except the custom generate code is called instead of the normal resolution step?

Mario Carneiro (Feb 15 2021 at 16:59):

Uh, kind of, although as a practical matter I think that mixing the two meta-levels in one statement will cause problems (you can't call the generator code unless you have an actual compiled type alpha with one, add, n). But you can probably hide the details behind a macro. Probably something like this:

structure InstanceGenerator (α : Type) where
  trigger : Expr
  generate : Expr → MetaM (Option Expr)

@[instanceGenerator]
def generateOfNat : InstanceGenerator where
  trigger = `({α : Type} [One α] [Add α] (n : Nat) -> OfNat α n)
  generate goal = do
    -- if n is concrete, do the recursion here to produce the instance in terms of bit0/bit1
    sorry

I think the generate function should be in a monad; working with exprs without the elaborator functions is pretty hard, but it's fine if the determinism is expressed as a constraint to programmers, i.e. "we will cache the result of this instance so you might get incorrect results if your generator tactic depends on the phase of the moon"

Daniel Selsam (Feb 15 2021 at 17:05):

Mario Carneiro said:

I think the generate function should be in a monad, working with exprs without the elaborator functions is pretty hard

I don't think TermElabM will be on the table -- this proposal only seems plausible in the current design if the monad can easily lift to SynthM, which basically means it must be MetaM. What operations specifically do you think you will need to perform?

Mario Carneiro (Feb 15 2021 at 17:11):

I'm not entirely sure, I haven't written enough lean 4 tactic code to say. In lean 3 it was certainly the case that life is a lot easier if you can use to_expr to build terms

Mario Carneiro (Feb 15 2021 at 17:12):

or mk_app

Mario Carneiro (Feb 15 2021 at 17:13):

Maybe MetaM is fine. (Maybe SynthM? What's in there?)

Daniel Selsam (Feb 15 2021 at 17:14):

The app-builder is in MetaM. SynthM is just some tabled-resolution metadata on top of MetaM and won't be helpful for this purpose.

Mario Carneiro (Feb 15 2021 at 17:15):

I mention that only because it might be useful to see that metadata to prevent looping

Mario Carneiro (Feb 15 2021 at 17:16):

Most leaf node instances won't need it

Daniel Selsam (Feb 15 2021 at 17:17):

Tabled resolution can still do its loop-prevention magic even with these instance generators. Tabling is orthogonal to how the "generator" nodes actually generate the subgoals. (aside: one of my SearchT extensions implements a very generic tabling algorithm)

Mario Carneiro (Feb 15 2021 at 17:20):

MetaM sounds good to me then

Daniel Selsam (Feb 15 2021 at 17:27):

Mario Carneiro said:

...
trigger = `({α : Type} [One α] [Add α] (n : Nat) -> OfNat α n)
...

Aside: I think it should just be the trigger itself, e.g. OfNat ?α ?n since that is the only thing a generator has to go on -- as long as you are in MetaM, you can then try synthesizing One ?α and Add ?α yourself.

EDIT: actually that would be insane since it would create separate SynthM invocations. You would need the original SynthM to solve the TC subgoals, and then trigger the generator function and pass it the instances. I don't have time right now to think about it carefully.

Daniel Selsam (Feb 15 2021 at 17:30):

FYI I linked to this thread from https://github.com/dselsam/mathport/issues/17

Mario Carneiro (Feb 15 2021 at 17:34):

Yeah it can just be whatever data structure the SynthM is working on

Mario Carneiro (Feb 15 2021 at 17:36):

although the postprocessing to put this into a usable form in the discrimination tree on @[instanceGenerator] has room to manipulate the term before entering it; the notation `({α : Type} [One α] [Add α] (n : Nat) -> OfNat α n) is mostly to make it easy for users to specify the signature

Mario Carneiro (Feb 15 2021 at 17:38):

and as I alluded to above, this can all be hidden behind a macro so that you are writing something more like your earlier version:

instanceGenerator generateOfNat {α : Type} [One α] [Add α] (n : Nat) : OfNat α n where
  generate goal = do
    -- if n is concrete, do the recursion here to produce the instance in terms of bit0/bit1
    sorry

Daniel Selsam (Feb 15 2021 at 17:39):

See my EDIT above: https://leanprover.zulipchat.com/#narrow/stream/270676-lean4/topic/mathport/near/226410863

Mario Carneiro (Feb 15 2021 at 17:40):

That's sort of what I meant by preventing loops, if you restart the instance context you can get into trouble

Mario Carneiro (Feb 15 2021 at 17:41):

I think that means that we need a version of mkAppM in SynthM that does typeclass inference in that monad instead of firing up a new one

Daniel Selsam (Feb 15 2021 at 17:44):

Currently in TC, at each node the generated instance is constructed before the subgoals are solved, in terms of the metavariables representing the subgoals. You might need to generalize this part, so that your generate function is executed eagerly on these arguments instead of just being applied to them (and so can only inspect non-metavariable goals, e.g. n in the OfNat example)

Mario Carneiro (Feb 15 2021 at 17:46):

In this example, the fact that the signature is {α : Type} [One α] [Add α] (n : Nat) : OfNat α n means that the ambient TC process is responsible for supplying values of One and Add. The generator only has to create an instance once these are known

Mario Carneiro (Feb 15 2021 at 17:47):

I suppose it's possible to hand these subgoals to the generator and require it to call some function to continue TC inference on those subgoals, but that seems like eliminable boilerplate unless the generator somehow wants to control the timing of this resolution

Mario Carneiro (Feb 15 2021 at 17:49):

Also you somehow need to signal what of the goal needs to be concrete, for example n has to not be a metavariable when the generator is called, or at least it needs to be considered for calling again when n is resolved

Mario Carneiro (Feb 15 2021 at 17:50):

possibly the usual rules about TC outParams and such are sufficient

Daniel Selsam (Feb 15 2021 at 17:55):

Maybe all you need is a way to register an Expr -> MetaM Expr postprocessor for a regular instance:

instance instBits2Nat {α : Type} [One α] [Add α] (n : Nat) : OfNat α n := ...
@[postprocessFor instBits2Nat] def expandBits2Nat (instVal : Expr) : MetaM (Option Expr) :=
    match instVal with
    | (instBits2Nat _ one add n) => /- recurse on n, build bit0/bit1 term -/

Then this postprocess function could be called here: https://github.com/leanprover/lean4/blob/master/src/Lean/Meta/SynthInstance.lean#L319

Mario Carneiro (Feb 15 2021 at 17:57):

What would go in the ...? sorry?

Mario Carneiro (Feb 15 2021 at 17:58):

Because I'm considering in particular the case where the instance is allowed to fail for some values of its arguments, like my earlier example of instance generatePrime : prime n := ...

Daniel Selsam (Feb 15 2021 at 17:59):

There should be some front-end way of not needing to provide that value at all, e.g. with your InstanceGenerator type that takes a target and a post-process function. The point of the comment was that it shouldn't be generate, it should be postprocess and it should see the metavariables corresponding to the TC subgoals.

Mario Carneiro (Feb 15 2021 at 18:00):

It wouldn't be the end of the world to use sorry there, since as you say it's going to be postprocessed away

Mario Carneiro (Feb 15 2021 at 18:00):

although if the instance generator fails during postprocessing you might be in deep trouble

Daniel Selsam (Feb 15 2021 at 18:02):

Syntactically I would probably prefer:

@[instanceGenerator]
def expandBits2Nat {α : Type} [One α] [Add α] (n : Nat) : InstanceGenerator (OfNat α n) := {
  postProcess (instVal : Expr) : MetaM (Option Expr) := ...
}

Daniel Selsam (Feb 15 2021 at 18:02):

Mario Carneiro said:

although if the instance generator fails during postprocessing you might be in deep trouble

I changed it to MetaM (Option Expr) in the example.

Mario Carneiro (Feb 15 2021 at 18:03):

I suppose that can be transformed to what I mentioned, if attribute macros have that much power, but that's quite a lie of a type signature

Mario Carneiro (Feb 15 2021 at 18:04):

when is the postprocessor called? I meant that if it fails you still want to be able to try the next instance

Daniel Selsam (Feb 15 2021 at 18:05):

Here: https://github.com/leanprover/lean4/blob/master/src/Lean/Meta/SynthInstance.lean#L319
You just return none if it fails, and it is as if the types didn't unify in a normal TC generation step.

Daniel Selsam (Feb 17 2021 at 23:56):

@Mario Carneiro Here is a lightweight implementation of the instance-post-processor proposal: https://github.com/dselsam/lean4/commit/c3c74f8ac273e93dfd6340cf94bf0cfb1f867286 Can you please play around with it and see if it supports all the use-cases you had in mind?

Mario Carneiro (Feb 18 2021 at 00:11):

Well, I don't know how to compile lean 4 yet but I can guess based on the code. Is the nat2Bits function required in this implementation? I think I would rather have instBits2Nat be an axiom or, even better, something that doesn't type check, so that if the postprocessor fails to do its job then an error is triggered

Daniel Selsam (Feb 18 2021 at 00:49):

It can be a constant as well:

@[instance] constant instBits2Nat {α : Type u} [Zero α] [One α] [Add α] (n : Nat) : OfNat α n

Daniel Selsam (Feb 18 2021 at 00:50):

We can nail down ergonomics later though -- the more salient question is whether the capability is adequate.

Mario Carneiro (Feb 18 2021 at 01:40):

I think so. What happens to the "foo" error message if it fails? It would be nice for this to show up in the instance trace, but not essential

Daniel Selsam (Feb 18 2021 at 01:51):

Good question. In the version I pushed, synth does not catch that exception, so you just see e.g. /home/dselsam/omega/lean4/tests/lean/instancePostProcessor.lean:44:8: error: foo. That is a programmer error though -- a post-processor can fail gracefully by returning pure none.

Mario Carneiro (Feb 18 2021 at 02:55):

There was an example that came up recently about fin numerals which are appropriately bounded. I haven't tested this but hopefully this works:

import Lean
open Lean Lean.Meta

structure Fin2 (n : Nat) where
  val  : Nat
  isLt : Nat.ble n val = false

def Fin2.OfNat (i n : Nat) (h : Nat.ble n i = false) : OfNat (Fin2 n) i := ⟨⟨i, h⟩⟩

@[instance] axiom finOfNatFake (i n : Nat) : OfNat (Fin2 n) i

@[instancePostProcessor finOfNatFake]
def finOfNatPostProcessor : Lean.Meta.InstancePostProcessor where
  apply := λ e => do
    let #[ei, en] ← pure e.getAppArgs | throwError "foo"
    let some i ← pure (← instantiateMVars (← whnf ei)).natLit? | pure none
    let some n ← pure (← instantiateMVars (← whnf en)).natLit? | pure none
    if i ≥ n then return none
    pure $ some $ mkApp3 (mkConst `Fin.OfNat) ei en $
      mkApp2 (mkConst ``Eq.refl [levelOne]) (mkConst ``Bool) (toExpr false)

#check (0 : Fin2 3) -- works
#check (1 : Fin2 3) -- works
#check (2 : Fin2 3) -- works
#check (3 : Fin2 3) -- fails
#check (4 : Fin2 3) -- fails

(I'm redeclaring Fin so that the existing instance doesn't trigger as fallback.)

Daniel Selsam (Feb 18 2021 at 03:17):

Your example exposes an issue :) Currently when no post-processor succeeds, the axiom finOfNatFake remains in the proof. Your example has a few small errors so FYI here is a version that runs:

import Lean
open Lean Lean.Meta

structure Fin2 (n : Nat) where
  val  : Nat
  isLt : Nat.ble n val = false

def Fin2.OfNat (i n : Nat) (h : Nat.ble n i = false) : OfNat (Fin2 n) i := ⟨⟨i, h⟩⟩

@[instance] axiom finOfNatFake (i n : Nat) : OfNat (Fin2 n) i

@[instancePostProcessor finOfNatFake]
def finOfNatPostProcessor : Lean.Meta.InstancePostProcessor where
  apply := λ e => do
    let #[ei, en] ← (e.getAppArgs.mapM λ arg => do
      let arg ← instantiateMVars arg
      let arg' ← whnfD arg
      pure arg') | throwError "foo"
    match (ei.natLit?, en.natLit?) with
    | (some i, some n) =>
      if i ≥ n then pure none else
       pure $ some $ mkApp3 (mkConst `Fin2.OfNat) ei en $
         mkApp2 (mkConst ``Eq.refl [levelOne]) (mkConst ``Bool) (toExpr false)
    | other =>
      pure none

set_option pp.all true
#check (2 : Fin2 3) -- works
#check (3 : Fin2 3) -- whoops, works with `finOfNatFake`

Daniel Selsam (Feb 18 2021 at 03:18):

Maybe if there are post-processors for an instance, then it is a failure if none of them work

Mario Carneiro (Feb 18 2021 at 03:20):

Yeah, I don't think it's a problem if there is only one postprocessor per instance, and it overrides whatever match status you get from the original. That makes more sense with the original model anyway

Mario Carneiro (Feb 18 2021 at 03:21):

In fact, I think it is dangerous to allow multiple postprocessors on one instance, because if the first one modifies the expression and the second one expects it to be a constant application then you will hit that throwError "foo" line

Daniel Selsam (Feb 18 2021 at 03:26):

FYI the commit above just picks the first post-processor that doesn't return none. It doesn't iterate multiple post-processors.

