Zulip Chat Archive

Stream: mathlib4

Topic: Point&Click library rewrite tactic

Jovan Gerbscheid (Feb 17 2024 at 19:42):

I have made a library rewrite tactic, using my RefinedDiscrTree, which works with shift-clicking, using ProofWidgets. It suggests rewrites for the selected expression. When choosing a selection, it will paste a suitable application of the rw tactic into the editor. For now I've called it rw??, since rw? is already taken by the not point&click library rewrite tactic.

The code is here:

https://github.com/leanprover-community/mathlib4/pull/10612

It is co-authored by @Anand Rao Tadipatri .

Jovan Gerbscheid (Feb 17 2024 at 19:51):

The cache isn't pre-built currently, so it takes a minute to build.

Patrick Massot (Feb 18 2024 at 03:23):

This sounds great!

Jovan Gerbscheid (Feb 19 2024 at 16:36):

The cache has been figured out as well now.

Jovan Gerbscheid (Feb 19 2024 at 17:35):

image.png

Here is a screenshot of what the tactic looks like. The results are partitioned into sections, where each section corresponds to a particular pattern from the discrimination tree. The sections are sorted by how good the pattern matches. Within each section, the lemmas are sorted by number of parameters, then by whether it is left to right or right to left, and then by string length of the name. In this example you can see that the patterns a * b⁻¹ and a * a⁻¹ are different to my discrimination tree, and a * a⁻¹ gets a better score.

Jovan Gerbscheid (Feb 19 2024 at 17:40):

If the same pattern appears in multiple places, then we insert {occs := .pos [..] } into the rw tactic call.

Patrick Massot (Feb 19 2024 at 17:52):

Wouldn’t it be more efficient to display what the goal would become (still will diff highlighting)?

Patrick Massot (Feb 19 2024 at 17:53):

Or simply what the selected subterm would become. Seeing the whole lemma signature is really cluttered.

Patrick Massot (Feb 19 2024 at 17:55):

And needing to use the color code to understand where is the result is also very inefficient (even for the lucky people who can see colors). We don’t care whether we are rewriting from left to right or right to left since the widget will handle that (I guess).

Patrick Massot (Feb 19 2024 at 17:56):

I also think it would be nice if the interface could provide the option of not specifying the occurence when users want to rewrite everywhere.

Jovan Gerbscheid (Feb 19 2024 at 18:22):

I agree that printing the whole lemma makes it cluttered, and that we shouldn't need the colour code to figure out which way the rewriting goes. At the same time, I don't want each lemma to take up more lines than it currently does, because the list of possible lemmas can get quite long, so printing the whole entire new goal for every lemma isn't an option.

So, A possible change would be to display the new sub-term for each lemma. However, this doesn't provide full information, because some lemmas create new extra goals if the lemma has extra hypotheses (or uninstantiated variables). For this, one would have to also show the new side-goals to the user. There are some details to be filled in about the exact layout, but I think it is the best way.

Jovan Gerbscheid (Feb 19 2024 at 18:24):

As for rewriting everywhere, it is easy enough to delete the occurrence part from the rw tactic. If this was supported by the widget, how would you envision this option being selected by the user?

Patrick Massot (Feb 19 2024 at 19:10):

I agree that new goals also need to be displayed somehow. And being able to rewrite everywhere is very low priority compared to figuring out those other questions.

Jovan Gerbscheid (Feb 19 2024 at 19:44):

By the way, on the topic of new goals, the rw tactic is too eager with introducing new goals into the tactic state. I think it should only add metavariables to the tactic state that are created by that rw call. For example in

example : ∃ n : Nat, n = n - 1 := by
  use ?n
  rw [Eq.refl (?n - 1)]

after the rw tactic there are suddenly 2 goals for instantiating ?n.

Jovan Gerbscheid (Feb 19 2024 at 19:45):

It should be easy to fix.

Jovan Gerbscheid (Feb 19 2024 at 21:07):

Also, if the type of the lemma is not displayed, then it would be nice if the lemma name was displayed as an expression, so that hovering over it displays the lemma type, and ctrl+clicking on it would take you to its definition. In that case the button for doing the rewrite would be separate from the lemma name.

Jovan Gerbscheid (Feb 20 2024 at 18:32):

image.png
I've made some changes, and this is the new view. The side goals are shown on a new line with a ⊢. I would like to have the button on the same line as the lemma, so that the view is more compact and more results can be seen at the same time.

Jovan Gerbscheid (Feb 20 2024 at 18:34):

I'm not at all experienced in making interfaces, so if you know this better, feel free to help.

Jovan Gerbscheid (Feb 20 2024 at 18:38):

I should also mention that the widget crashes a lot. Whenever I go to a different file to make some edits and then come back to my test file to test the changes, shift-clicking on the goal makes lean stop working. I then have to press ctrl+shift+x to reload the file and then it magically works again. It seems to also be avoided by deleting the rw?? tactic manually and retyping it, but not always.

Patrick Massot (Feb 20 2024 at 18:52):

Jovan Gerbscheid said:

I should also mention that the widget crashes a lot. Whenever I go to a different file to make some edits and then come back to my test file to test the changes, shift-clicking on the goal makes lean stop working. I then have to press ctrl+shift+x to reload the file and then it magically works again. It seems to also be avoided by deleting the rw?? tactic manually and retyping it, but not always.

@Wojciech Nawrocki :up:

Patrick Massot (Feb 20 2024 at 18:53):

Is the layout issue coming from using the TryThis framework?

Jovan Gerbscheid (Feb 20 2024 at 18:55):

For the buttons I use ProofWidgets.MakeEditLink. I don't know though whether it is responsible for the crashing

Anand Rao Tadipatri (Feb 21 2024 at 13:37):

Patrick Massot said:

I also think it would be nice if the interface could provide the option of not specifying the occurence when users want to rewrite everywhere.

Perhaps this could be handled through a code action that shows up after the tactic is pasted (the gif below demonstrates a rough prototype).

rw_config_delete_demo.gif

Jovan Gerbscheid (Feb 21 2024 at 16:55):

image.png
Here's a new iteration of a similar layout. Let me know which changes you think should be kept and which not.
The idea of putting a button on the same line as the result turns out to be more tricky than I thought, because it is unclear where the button should go exactly and what happens when there is not enough space.

Patrick Massot (Feb 21 2024 at 17:02):

I think that including rw is only cluttering the list.

Patrick Massot (Feb 21 2024 at 17:03):

One could even argue we don’t care about the lemma name, only about the new expression and the side goals.

Jovan Gerbscheid (Feb 21 2024 at 17:37):

If we're not showing the lemma names, then there will be multiple library results that look exactly the same, and they just differ by which lemma gets pasted. So in that case some filtering of equal results could be done.

It's surprising to see how many lemmas in Mathlib have multiple names, and I feel like sometimes one name is more "correct" than another. e.g. Mathlib.Tactic.RingNF.mul_assoc_rev doesn't feel right instead of mul_assoc. Is there a way to filter out such silly duplicates from the beginning? Otherwise, I sort the lemmas by whether they are left-to-right or right-to-left, and then by name length. So for reverse associativity, Mathlib.Tactic.RingNF.mul_assoc_rev will show up first :(

Jovan Gerbscheid (Feb 21 2024 at 17:38):

I suppose I could filter out all lemmas in the namespace Mathlib.Tactic.

Patrick Massot (Feb 21 2024 at 17:55):

Yes those lemmas are completely internal to the tactic.

Jovan Gerbscheid (Feb 21 2024 at 18:36):

Ok, I will exclude all lemmas in the Mathlib namespace.

Jovan Gerbscheid (Feb 21 2024 at 18:39):

On the topic of it crashing, when it crashes and when I have set_option profiler true enabled, I get a long log of errors that looks like this:

typeclass inference of DivisionCommMonoid took 110ms
typeclass inference of Mul took 120ms
typeclass inference of LE took 106ms
typeclass inference of LE took 121ms
typeclass inference of LE took 107ms
typeclass inference of Fintype took 113ms
typeclass inference of Fintype took 111ms
typeclass inference of Fintype took 102ms
typeclass inference of Fintype took 103ms
typeclass inference of Fintype took 145ms
typeclass inference of Fintype took 103ms
typeclass inference of Mul took 100ms
typeclass inference of LE took 108ms
...

So apparently my program is still running in some form.

Jovan Gerbscheid (Feb 21 2024 at 18:53):

(deleted)

Jovan Gerbscheid (Feb 21 2024 at 19:20):

Something I liked about the original view is that you could tell for each section what matching pattern the section corresponded to. So in that example these patterns are a * a⁻¹, a * b⁻¹ and a * b respectively. So I'm considering putting the matching pattern as a title for each section. However I'm not sure how to display this exactly, because the patterns a * b with type Nat, or a * b with a general type are different patterns. What do you think about this?

Patrick Massot (Feb 21 2024 at 19:23):

I think that ambiguity is fine.

Emilie (Shad Amethyst) (Feb 21 2024 at 19:30):

I do prefer getting that additional information

Jovan Gerbscheid (Feb 21 2024 at 19:32):

I assume Patrick is talking about the ambiguity in the type information, and Emilie about showing the pattern at all?

Patrick Massot (Feb 21 2024 at 19:33):

Yes this is what is was talking about.

Emilie (Shad Amethyst) (Feb 21 2024 at 19:42):

You assumed right :)

Jovan Gerbscheid (Feb 21 2024 at 20:46):

image.png
Here's my updated version with the pattern information. I show the pattern as it appears in the first lemma.

Now each of the sections is its own drop-down, so that you can close it to more easily to look at other sections.

Patrick Massot (Feb 21 2024 at 20:57):

I still think it would be nicer to hide the lemma name and make the new expression being the link. This way users would click on what they want to see after rewriting.

Jovan Gerbscheid (Feb 21 2024 at 21:24):

image.png
Here's what that may look like:

Jovan Gerbscheid (Feb 21 2024 at 21:40):

Notice that one of the rewrites looks like it is an identity rewrite. This is the lemma MonCat.mul_of, and it rewrites the type of the multiplication. It feels a bit uncomfortable that such a lemma exists where the two sides look exactly the same.

Jovan Gerbscheid (Feb 21 2024 at 21:57):

The lemma is used only once in mathlib, and where it is used, simply removing it doesn't break the proof :rolling_on_the_floor_laughing:

https://github.com/leanprover-community/mathlib4/blob/master/Mathlib/RepresentationTheory/Action/Basic.lean#L381

Emilie (Shad Amethyst) (Feb 21 2024 at 22:14):

How hard would it be to support an option to choose between having a compressed view (which I guess would be the default), and having a longer view (with the name of the theorems suggested)?

Jovan Gerbscheid (Feb 21 2024 at 22:24):

It would be easy to do this with a config argument in the tactic.

But I imagine it is better if there is a button saying switch to detailed view, so that you can do the switch directly. I'm not sure how to implement that. But I can imagine such a design pattern being useful in general and maybe this could be a good addition to ProofWidgets.

𝚠𝚘𝚓𝚌𝚒𝚎𝚌𝚑 𝚗𝚊𝚠𝚛𝚘𝚌𝚔𝚒 (Feb 21 2024 at 22:43):

Jovan Gerbscheid said:

I should also mention that the widget crashes a lot. Whenever I go to a different file to make some edits and then come back to my test file to test the changes, shift-clicking on the goal makes lean stop working. I then have to press ctrl+shift+x to reload the file and then it magically works again. It seems to also be avoided by deleting the rw?? tactic manually and retyping it, but not always.

Hi, I tried the branch but couldn't reproduce this. What do you mean exactly by Lean stopping to work? Is it that in the editor panel, if you edit something, Lean no longer reports errors or messages? It would help to know the exact sequence of things you did to get into such a situation.

Jovan Gerbscheid (Feb 21 2024 at 22:47):

What happened was that I shift-click an expression, and then the expression gets highlighted, but the widget completely freezes. Even if I move the cursor, the widgets stays how it is.

But recently it seems to be happening less. And also for me it is hard to reproduce, since it seems to happen whenever I leave the file open while working on another file for a while and then I come back.

Jovan Gerbscheid (Feb 21 2024 at 22:49):

And yes, lean no longer reports errors or messages

𝚠𝚘𝚓𝚌𝚒𝚎𝚌𝚑 𝚗𝚊𝚠𝚛𝚘𝚌𝚔𝚒 (Feb 21 2024 at 22:55):

Jovan Gerbscheid said:

And yes, lean no longer reports errors or messages

In the editor panel? Whether they show up in the infoview is quite independent of whether they show up in the editor. I would be very surprised if the infoview crashing influenced the editor. The one exception to that is if it freezes up the entire VSCode window, which may actually be what's happening in your case. This can happen if you print out lots and lots of expressions (e.g. there are tons of applicable rewrites)

Jovan Gerbscheid (Feb 21 2024 at 22:58):

I believe that errors weren't showing up in the editor either, although I'd have to double check it when the error occurs.

𝚠𝚘𝚓𝚌𝚒𝚎𝚌𝚑 𝚗𝚊𝚠𝚛𝚘𝚌𝚔𝚒 (Feb 21 2024 at 23:01):

Sure, if it happens again feel free to ping me with what you did exactly and I can look into it :) A short recording would be even better

Jovan Gerbscheid (Feb 22 2024 at 00:33):

Widgetbug.mp4

I started recording this video after the bug had started. But to initiate the bug I didn't do anything in particular. I just added some set_option to the beginning and tried the tactic again. And after reloading this worked fine again. In the video you can see me typing in the editor, but the letters are appearing quite a bit more slowly than I'm typing them. By not reloading the window for a while in this state, my laptop fan start making noise.

𝚠𝚘𝚓𝚌𝚒𝚎𝚌𝚑 𝚗𝚊𝚠𝚛𝚘𝚌𝚔𝚒 (Feb 22 2024 at 17:10):

Okay, here is what I think is happening: while the function LibraryWrite.rpc is still processing, all the rest of UI is blocked (in particular, editor requests are not being responded to). If you waited for a while it would eventually update (it does on my machine), but at any rate the blocking behaviour is an issue.

𝚠𝚘𝚓𝚌𝚒𝚎𝚌𝚑 𝚗𝚊𝚠𝚛𝚘𝚌𝚔𝚒 (Feb 23 2024 at 00:30):

More precisely: the Lean server uses tasks to respond to requests from the editor and the infoview. Every time something changes (e.g. the cursor moves, or we shift-click select something), a new request is sent and a new task is started in the server to respond to it. This task occupies one thread in the thread pool; but there is a fixed amount of threads in the thread pool (depending on how many CPU cores you have iirc). So if the task is long-running (because, for instance, it searches the whole library for applicable rewrites), it might happen that a bunch of these long-running tasks will occupy the entire thread pool, thus leaving no room for tasks that would respond to editor requests. There are fundamentally two underlying issues: one is that the infoview sends too many requests, and the server ends up doing redundant/duplicate work; the other is that there is no way to cancel RPC requests. The latter one we could theoretically hack around in ProofWidgets (except for this issue), but a proper solution would involve changing the RPC protocol to support that.

𝚠𝚘𝚓𝚌𝚒𝚎𝚌𝚑 𝚗𝚊𝚠𝚛𝚘𝚌𝚔𝚒 (Feb 23 2024 at 02:23):

Ok, ProofWidgets4#46 should help with these performance issues. But you'll have to wait until it is all merged :)

Kim Morrison (Feb 23 2024 at 04:02):

Jovan Gerbscheid said:

Ok, I will exclude all lemmas in the Mathlib namespace.

I think you should only exclude lemmas in Mathlib.Tactic.

I am still hoping to convince the maintainers that everything in Mathlib needs to be in the Mathlib namespace! :-)

Johan Commelin (Feb 23 2024 at 04:19):

@Wojciech Nawrocki Does the infoview need its own task scheduler?

𝚠𝚘𝚓𝚌𝚒𝚎𝚌𝚑 𝚗𝚊𝚠𝚛𝚘𝚌𝚔𝚒 (Feb 23 2024 at 04:20):

I hope not! The point is just that there are sometimes tasks on the server that we launch due to a UI action and want to cancel early when the user decides to view something else instead.

Eric Rodriguez (Feb 23 2024 at 09:51):

Scott Morrison said:

Jovan Gerbscheid said:

Ok, I will exclude all lemmas in the Mathlib namespace.

I think you should only exclude lemmas in Mathlib.Tactic.

I am still hoping to convince the maintainers that everything in Mathlib needs to be in the Mathlib namespace! :-)

There is absolutely no internal difference between the current status quo vs this + open Mathlib at the start of every file right? And then externally the benefits are huge

Jovan Gerbscheid (Feb 23 2024 at 10:28):

Wojciech Nawrocki said:

So if the task is long-running (because, for instance, it searches the whole library for applicable rewrites)

Under normal circumstances (i.e. when this bug doesn't happen) rw?? runs in less than a second.

𝚠𝚘𝚓𝚌𝚒𝚎𝚌𝚑 𝚗𝚊𝚠𝚛𝚘𝚌𝚔𝚒 (Feb 23 2024 at 18:03):

Hm, when trying it out it would consistently take a few seconds between clicking a subexpression and seeing the list of rewrites.

Jovan Gerbscheid (Feb 24 2024 at 10:15):

Ah, good to know, this might be because I have a new laptop.

Jovan Gerbscheid (Feb 27 2024 at 16:47):

widgetStartFinish.mp4

I've got another clue on what is going wrong. I added a dbg_trace "Starting"; to the start of my server_rpc_method, and another dbg_trace to the end which tells the amount of miliseconds since starting. I would have expected the function to be called at most once whenever the cursor is moved or an expression is selected in the goal. However, the function is being called repeatedly, and computing the library result is done on many different threads at the same time, and whenever a thread finishes, it starts again with computing the library results. My hypothesis is that this only sometimes leads to a crash but that it is the underlying problem.

Jovan Gerbscheid (Feb 27 2024 at 16:47):

@Wojciech Nawrocki

𝚠𝚘𝚓𝚌𝚒𝚎𝚌𝚑 𝚗𝚊𝚠𝚛𝚘𝚌𝚔𝚒 (Feb 27 2024 at 18:29):

This is exactly right, thanks for the report! I fixed it in the PR linked above, specifically this commit.

Jovan Gerbscheid (Feb 27 2024 at 19:33):

When a rewrite lemma adds an extra goal, such as in

IsUnit.mul_inv_cancel {α : Type u} [DivisionMonoid α] {a : α} (a✝ : IsUnit a) : a * a⁻¹ = 1

then the rewrite call that I paste into the editor is rw [IsUnit.mul_inv_cancel ?_]. However, this doesn't specify the value of a, because it is an implicit argument. But for the purpose of this tactic, the argument a really should be explicit. This causes a problem when the pattern a * a⁻¹ appears in multiple places with different values of a. This is because in the code for finding the occurrence, I assume that the a is already instantiated (since the argument ?_ has a known type). Ideally I would want the rewrite call to be something like rw [IsUnit.mul_inv_cancel (?_ : IsUnit g)].

I could also set the option pp.explicit to true, but that would also mean that + would be written as HAdd.hAdd which is not desirable. Maybe another way would be to print it such that only the lemma itself gets all its arguments explicit. It could also be possible to not provide any arguments to the lemma by default, and deal with the occurrences like that, but I think that would be a bit less efficient.

What do you think about this?

Anand Rao Tadipatri (Feb 28 2024 at 14:41):

I experimented with hacking together a delaborator that pretty prints the function application with all implicits included, without the downsides of using pp.explicit.

Jovan Gerbscheid (Feb 28 2024 at 22:38):

Yeah that is one of the options I was thinking of. For implementation though I think it is easier to just turn the lemma arguments into String first, and then combine them with spaces in between, adding brackets where necessary. And writing the instance arguments with an underscore. (Because the function converting Syntax to String seems to add spaces incorrectly)

Jovan Gerbscheid (Feb 28 2024 at 22:41):

What I could do is to only use this alternative way of writing when there are uninstantiates metavariables.

Patrick Massot (Feb 28 2024 at 22:41):

I find it suspicious to read that Syntax to String doesn’t work. Are you sure your syntax is correct?

Jovan Gerbscheid (Feb 28 2024 at 22:42):

Maybe Syntax.reprint is a bad function? It just adds spaces everywhere unnecessarily

Patrick Massot (Feb 28 2024 at 22:44):

Why not using docs#Lean.PrettyPrinter.ppTerm ?

Jovan Gerbscheid (Feb 28 2024 at 22:48):

How do we get the PPContext though?

Patrick Massot (Feb 28 2024 at 22:51):

Why do you want one?

Jovan Gerbscheid (Feb 28 2024 at 22:53):

Oh sorry I was looking at the wrong ppTerm

Siddhartha Gadgil (Feb 29 2024 at 06:48):

Jovan Gerbscheid said:

Maybe Syntax.reprint is a bad function? It just adds spaces everywhere unnecessarily

Indeed it also gives errors sometimes as it skips the "parenthesizer" step.

Jovan Gerbscheid (Mar 01 2024 at 18:17):

I've now made a new version that prints the lemma application explicitly when there are new goals, based on @Anand Rao Tadipatri's approach.
I also made it so that the pasted string is not affected by any set_options that the user might set.

Jovan Gerbscheid (Mar 03 2024 at 16:36):

(deleted)

Jovan Gerbscheid (Mar 04 2024 at 17:49):

What is the opinion on having a view with lemma names as an alternative to the one where you just see the new expressions? I think not seeing the lemma names is generally nicer, but for users interested in learning the lemma names it could be useful to be able see them.

Jovan Gerbscheid (Mar 04 2024 at 17:49):

I'm thinking of adding some filtering to the rewrites: if two rewrites result in the same new expression then to only display one of them, especially if one has no side goals and another one does. I'm already filtering out rewrites that don't change the expression. But in the version with lemma names, I don't want to filter because in that view you want to see the names of all lemmas that apply.

Jovan Gerbscheid (Mar 04 2024 at 17:49):

Another question about filtering is whether to only filter within each section, or also between sections. For example there is commutativity of Nat and commutativity of general addition, which do the same rewrite, but they are in different sections. In that case it seems more intuitive to display both results.

Jovan Gerbscheid (Mar 04 2024 at 17:52):

image.png

You can see here that there are some duplicates

Patrick Massot (Mar 04 2024 at 17:54):

I think that filtering out duplicate results is essential. But indeed having a variant which display the lemma name without filtering could be useful in case some stupid lemma applies and would look weird in the proof script.

Jovan Gerbscheid (Mar 04 2024 at 17:59):

I just noticed that a later result in this example is Linarith.eq_of_eq_of_eq. Is there a way to tell whether a lemma is defined inside Mathlib.Tactic, so that I can filter out all of those?

Jovan Gerbscheid (Mar 04 2024 at 18:03):

Do you know of a good way to switch the view (to the version with lemma names) while being in the widget? Or do you think it would be preferable to have to write something like (config := {showLemmaNames := true})?

Patrick Massot (Mar 04 2024 at 20:15):

You could easily have a check-box somewhere near the top of the widget controlling what is shown.

Jovan Gerbscheid (Mar 04 2024 at 21:30):

I tried to do the filtering of duplicate lemmata using equality on the Exprs, but I ran into the issue that sometimes the instances are defeq, but not equal, for example the 1s given byinv_mul_self and IsUnit.inv_mul_cancel are not equal.
Doing pairwise isDefEq seems like it could be quite expensive.
Since the user can only differentiate between the expressions based on the printed string, I think it might be better to do the filtering using these Strings. This then also solves the problem of MonCat.mul_of.{u_1} {A : Type u_1} [inst✝ : Monoid A] (a b : A) : a * b = a * b, where the two sides are defeq, but not equal. It is super confusing when this lemma shows up in the library search.

Jovan Gerbscheid (Mar 04 2024 at 21:31):

image.png

This is what the above example looks like with the filtering.

Jovan Gerbscheid (Mar 04 2024 at 21:32):

I don't really like the solution of filtering on the string, because it doesn't seem principled, but I think it is the best solution.

𝚠𝚘𝚓𝚌𝚒𝚎𝚌𝚑 𝚗𝚊𝚠𝚛𝚘𝚌𝚔𝚒 (Mar 05 2024 at 07:24):

Wojciech Nawrocki said:

This is exactly right, thanks for the report! I fixed it in the PR linked above, specifically this commit.

This has now landed in the most recent mathlib. The RPC function will not be called excessively many times, and you can add some IO.checkCanceled calls in long-running loops for extra-quick cancellation of redundant tasks to reduce CPU usage.

Jovan Gerbscheid (Mar 05 2024 at 11:10):

Would it be possible to show the library results to the user while they are still being computed? The computation checks the results one by one, so it would be natural if the results could appear one by one. This might help speed up the interaction.

Jovan Gerbscheid (Mar 05 2024 at 13:14):

I've updated to the most recent mathlib. The tactic runs a lot faster now: the example of g⁻¹ * g now runs in 200ms. This is partially because I let isDefEq run in the reducible transparency instead of default. I can't tell whether the rest of the speedup is because of the update to ProofWidgets, or because of other updates in Lean. Because of this speedup, I removed the bound on the amount of heartbeats that the tactic can use.

Thanks to the ProofWidgets update, the function now only gets called one time after a shift click, which is great.

I also played a bit with IO.checkCancelled, but it seemed to always return false.

Jovan Gerbscheid (Mar 05 2024 at 13:19):

(It seems reasonable to expect the ProofWidgets update to cause the speedup, but I seem to remember having run the function directly, and that not being faster, meaning that ProofWidgets wasn't the reason)

Jovan Gerbscheid (Mar 05 2024 at 14:30):

The current version of rw?? does not check whether the selected expression is in the main goal. So if you have multiple goals, and you select an expression not in the first goal, then the pasted rewrite will be applied to the wrong goal. I don't know how to check that the selected expression is in the main goal. I get the MVarId of this goal, but I cannot compare it to Lean.Elab.Tactic.getMainGoal because it isn't running inside a TacticM monad. Any ideas?

Patrick Massot (Mar 05 2024 at 14:53):

I don’t remember how I did it in https://github.com/leanprover-community/mathlib4/blob/master/Mathlib/Tactic/Widget/SelectPanelUtils.lean but it was clearly possible.

Jovan Gerbscheid (Mar 05 2024 at 17:13):

Ok, I managed to do it.

Jovan Gerbscheid (Mar 05 2024 at 17:17):

Another problem is with shadowed variables. When the user selects an expression in an unreachable hypothesis, or an expression containing an unreachable variable, the pasted tactic string doesn't work because it contains a ✝ symbol.

Jovan Gerbscheid (Mar 05 2024 at 17:18):

I suppose that this would be a reason to not provide arguments to the rewrite lemma.

Patrick Massot (Mar 05 2024 at 17:20):

If this really bothers you then you could output a rename_i call first.

Jovan Gerbscheid (Mar 05 2024 at 17:25):

On the topic of shadowed variables, I use

    let lctx := md.lctx |>.sanitizeNames.run' {options := (← getOptions)}

which I copied since similar functions also use this. But when I removed it it seemed to make no difference.

Patrick Massot (Mar 05 2024 at 18:03):

I can’t help here. Every time I write this line it is cargo cult coding.

𝚠𝚘𝚓𝚌𝚒𝚎𝚌𝚑 𝚗𝚊𝚠𝚛𝚘𝚌𝚔𝚒 (Mar 05 2024 at 21:08):

I also played a bit with IO.checkCancelled, but it seemed to always return false.

Ah sorry, you need to also use @[server_rpc_method_cancellable] instead of @[server_rpc_method].

Jovan Gerbscheid (Mar 05 2024 at 23:20):

Ah thanks, I tried it again but I'm still confused. If I don't change the implementation, and just use @[server_rpc_method_cancellable] instead of @[server_rpc_method], then it is able to cancel the computation. When I quickly select and unselect an expression, the runtime of the function (measured by the profiler and profileitM) is shorter. This wouldn't happen with @[server_rpc_method].

I also tried using IO.checkCancelled: in the function computing the results, I put this before checking each lemma:

      if ← IO.checkCanceled then
        dbg_trace "cancelled"; pure ()
        return #[]

When I quickly select and unselect an expression, what usually happens is that the computation is cancelled, but the trace message doesn't appear. But if I try over and over, the trace message does appear in 10%-ish of the cases.

𝚠𝚘𝚓𝚌𝚒𝚎𝚌𝚑 𝚗𝚊𝚠𝚛𝚘𝚌𝚔𝚒 (Mar 05 2024 at 23:56):

what usually happens is that the computation is cancelled, but the trace message doesn't appear

How do you know it was cancelled if you don't see the dbg_trace output? It does happen quite often that the computation is quick enough to finish before the cancellation gets through to the server.

Jovan Gerbscheid (Mar 05 2024 at 23:58):

The time given by the profiler is consistently a bit above 200ms, and when I quickly select and deselect, it is around 100ms. If I do it a bit more slowly it gets closer to 200ms.

Jovan Gerbscheid (Mar 06 2024 at 00:01):

And with @[server_rpc_method], the time doesn't depend on how quickly I deselect.

𝚠𝚘𝚓𝚌𝚒𝚎𝚌𝚑 𝚗𝚊𝚠𝚛𝚘𝚌𝚔𝚒 (Mar 06 2024 at 00:18):

I see! This is because your message gets printed only if the IO.checkCanceled in your code is hit. However, Lean core has a bunch of places (look for check_system) which also check this; they do not print any message but rather raise an "elaboration interrupted" exception. You can confirm this is happening by wrapping your code in a try .. catch and printing out the exception, which will show up every time (unless your check hits, in which case "cancelled" will be printed).

Jovan Gerbscheid (Mar 06 2024 at 00:47):

Ah, that makes sense. IsDefEq and whnf already do this check, so there is no need to manually add it.

Jovan Gerbscheid (Mar 06 2024 at 11:55):

I figured out that Environment.header has the information about all imported modules, including the module names and the constants defined in each module. So by getting the constants from there, I can skip over everything from modules starting with `Mathlib.Tactic

Jovan Gerbscheid (Mar 06 2024 at 12:44):

image.png

The tactic tries 2 kinds of lemmas, lemmas that are in the cache, and lemmas that are defined in the current file. This is the same as for the other library search tactics in mathlib. However, this approach is only complete if you're working in a file that imports Mathlib and nothing else. I feel like it would be better if it is also able to find lemmas that are imported but are not in Mathlib. Do you think this is useful? It might have the drawback that indexing all these extra lemmas could take a long time. This can be partially solved by caching it for the duration that the file is open.

Patrick Massot (Mar 06 2024 at 14:52):

It is crucial for projects depending on Mathlib. There are quite a few of them currently running or about to start running.

Patrick Massot (Mar 06 2024 at 14:53):

For instance the analytic number theory project will certainly want to use rw?? using its own lemmas.

Jovan Gerbscheid (Mar 06 2024 at 14:58):

Do these other projects have access to a CI that could build the cache? Or are they small enough where locally building the DiscrTree for non-mathlib lemmas is feasible?

Patrick Massot (Mar 06 2024 at 15:02):

We should definitely write documentation about setting up CI. All those projects are hosted on GitHub and can run CI there since they don’t have the huge CI needs of Mathlib. @Mac Malone will also probably want to say something about the future of this in reservoir.

Jovan Gerbscheid (Mar 06 2024 at 15:27):

Ok, in that case I could leave the implementation the same, and expect those projects to extend the mathlib cache.

Kim Morrison (Mar 07 2024 at 00:47):

I would strongly recommend looking at Joe Hendrix's lazy DiscrTree implementation that powers the new exact?. It is sufficient fast that no cache is required anymore.

Kim Morrison (Mar 07 2024 at 00:48):

In particular, Mathlib's CI step building the exact? cache is now completely pointless: no one looks at that cache!

Kim Morrison (Mar 07 2024 at 00:48):

However the cache for rw? is still used, because it hasn't been upgrade to use a lazy discr tree.

Jovan Gerbscheid (Mar 07 2024 at 02:08):

That's a good point, although I think it will take quite some work to get my DiscrTree to work lazily. Both because I will have to understand exactly how it is implemented in LazyDiscrTree, and because my current implementation is in two steps: first getting the expression in the discrimination tree form, and then inserting it, which doesn't lend itself for laziness.

Jovan Gerbscheid (Mar 07 2024 at 11:19):

I seem to remember reading that initializing the LazyDiscrTree takes a non-trivial amount of time. Has that been solved somehow?

Jovan Gerbscheid (Mar 08 2024 at 14:05):

I played around with IncDiscrTreeFinder.mkImportFinder, the function that initiates the LazyDiscrTree. It takes around 1.5s on my machine, but if I copy the definition and run that instead, then it takes over 30s. I tried replicating this on the lean playground, and there the difference is less severe, with an increase from 2s to 6s:

import Mathlib

open Lean Meta LazyDiscrTree LibrarySearch

private def constantsPerTask : Nat := 6500

private def addImport (name : Name) (constInfo : ConstantInfo) :
    MetaM (Array (InitEntry (Name × DeclMod))) :=
  forallTelescope constInfo.type fun _ type => do
    let e ← InitEntry.fromExpr type (name, DeclMod.none)
    let a := #[e]
    if e.key == .const ``Iff 2 then
      let a := a.push (←e.mkSubEntry 0 (name, DeclMod.mp))
      let a := a.push (←e.mkSubEntry 1 (name, DeclMod.mpr))
      pure a
    else
      pure a
def mkImportFinder : IO CandidateFinder := do
  let ref ← IO.mkRef none
  pure fun ty => do
    let ngen ← getNGen
    let (childNGen, ngen) := ngen.mkChild
    setNGen ngen
    let importTree ← (←ref.get).getDM $ do
      profileitM Exception  "librarySearch launch" (←getOptions) $
        createImportedEnvironment childNGen (←getEnv) (constantsPerTask := constantsPerTask) addImport
    let (imports, importTree) ← importTree.getMatch ty
    ref.set importTree
    pure imports

elab "search" e:term : tactic => do
  let e ← Elab.Term.elabTerm e none
  let a ← mkImportFinder -- IncDiscrTreeFinder.mkImportFinder
  let b ← a e
  logInfo m! "{b.size}"

set_option profiler true

example : True := by
  search Nat.zero -- 5299

Jovan Gerbscheid (Mar 08 2024 at 14:23):

What I also don't understand is why the profiler claims that this search tactic takes only about 1s. It clearly takes at least as long as running mkImportFinder which takes more time.

Jovan Gerbscheid (Mar 08 2024 at 14:42):

~~While looking at the new LibrarySearch code, I found this function that I think is defined incorrectly:~~

/--
Return an action that returns true when  the remaining heartbeats is less
than the currently remaining heartbeats * leavePercent / 100.
-/
def mkHeartbeatCheck (leavePercent : Nat) : MetaM (MetaM Bool) := do
  let maxHB ← getMaxHeartbeats
  let hbThreshold := (← getRemainingHeartbeats) * leavePercent / 100
  -- Return true if we should stop
  pure $
    if maxHB = 0 then
      pure false
    else do
      return (← getRemainingHeartbeats) < hbThreshold

~~because the inequality that it checks is basically h < h * leavePercent / 100 where h is the remaining heartbeats.~~

NeverMind, I didn't spot that it was a nested MetaM inside a MetaM

Jovan Gerbscheid (Mar 08 2024 at 14:43):

(deleted)

Joachim Breitner (Mar 08 2024 at 14:44):

It takes around 1.5s on my machine, but if I copy the definition and run that instead, then it takes over 30s.

Maybe that’s becuase as long as it is part of core, it is compiled to native code, but when the code is part of a library, it is run by lean’s interpreter?

Jovan Gerbscheid (Mar 08 2024 at 14:46):

Is there any way around that? i.e. can I run my own code as native code?

Joachim Breitner (Mar 08 2024 at 14:47):

I am not firm on the details; but I think the precompiledModules option plays a role

Jovan Gerbscheid (Mar 08 2024 at 15:01):

Why is there a difference between the interpreter, and native code? And does this mean that if LazyDiscrTree was defined in Mathlib, then it would be this slow? If so, then I have little hope of making my RefinedDiscrTree lazy in Mathlib.

Jovan Gerbscheid (Mar 08 2024 at 15:03):

I believe IncDiscrTreeFinder.mkImportFinder uses parallel computation to speed things up, so that might explain why there is the difference in speed, if the interpreter doesn't do the parallel computation.

Jovan Gerbscheid (Mar 08 2024 at 15:11):

Although that happens in createImportedEnvironment, of which I didn't copy the definition, so it doesn't really make sense

Sebastian Ullrich (Mar 08 2024 at 15:21):

The interpreter certainly supports parallelism, it's just slower

Jovan Gerbscheid (Mar 08 2024 at 15:47):

Here's a minimal example:

import Mathlib

open Lean

set_option profiler true

#eval show MetaM Unit from do
  _ ← Meta.LazyDiscrTree.createImportedEnvironment (α := Unit) default (← getEnv) (constantsPerTask := 6500) (fun _ _ => pure #[])

this takes around 6s on my laptop.

Jovan Gerbscheid (Mar 10 2024 at 21:38):

The above example directly calls a function from core, so then shouldn't it be compiled to native code? I don't understand why it is still so slow.

Joachim Breitner (Mar 10 2024 at 21:41):

Maybe the closure (fun _ _ => pure #[]) is expensive to call when interpreted and called often?

Jovan Gerbscheid (Mar 10 2024 at 21:43):

Ah, that would make sense

Jovan Gerbscheid (Mar 10 2024 at 21:48):

On the topic of the interface of rw??, I expect it is usual to want to do multiple library rewrites in a row. This is slightly inconvenient because you have to write rw?? again every time, and every rewrite gets its own rw call instead of one combined rw call.

I thought one way to solve this would be to allow rw?? to take arguments just like rw, and that instead of producing a rw tactic call, it can produce a rw?? tactic call, so that you can keep library rewriting. But, in that case the final proofs would have rw?? in them instead of rw, so you might have to manually remove the ?? everywhere.

Jovan Gerbscheid (Mar 11 2024 at 00:55):

Joachim Breitner said:

I think the precompileModules option plays a role

I tried setting the option precompileModules := true in the lakefile for Mathlib. Pre-compiling all of Mathlib takes long, so I work on a small subset of Mathlib. Before setting the option, the example

set_option profiler true
def foo : MetaM (LazyDiscrTree Unit) := do
  Meta.LazyDiscrTree.createImportedEnvironment (α := Unit) default (← getEnv) (constantsPerTask := 6500) (fun _ _ => pure #[])
#eval do
  _ ← foo

took about 1.2s. Importing it to another file and running foo there gives the same time. With the option set to true, the same example takes about 50s, while running foo in another file that imports it takes less than 0.1s. So the compilation does indeed work, but somehow evaluating functions defined in the file itself becomes painfully slow.

My understanding is that when running a tactic from mathlib, this is not pre-compiled, and therefore slower than it should be. My question then is, should we pre-compile everything in Mathlib.Tactic, so that tactics are much faster?

Jovan Gerbscheid (Mar 12 2024 at 19:14):

import Lean

open Lean Meta
set_option profiler true
set_option profiler.threshold 40
def foo : MetaM (LazyDiscrTree Unit) := do
  LazyDiscrTree.createImportedEnvironment default (← getEnv) (constantsPerTask := 6500) (fun _ _ => return #[])
#eval do
  _ ← foo

produces the following messages:

interpretation of foo._rarg._lambda_1._boxed took 58ms
interpretation of foo._rarg._lambda_1._boxed took 59.4ms
interpretation of foo._rarg._lambda_1._boxed took 45.7ms
interpretation of foo._rarg._lambda_1._boxed took 67.1ms
interpretation of foo._rarg._lambda_1._boxed took 47.5ms
interpretation of foo._rarg._lambda_1._boxed took 43.1ms
interpretation of foo._rarg._lambda_1._boxed took 40.7ms
interpretation of foo._rarg._lambda_1._boxed took 44.5ms
interpretation of foo._rarg._lambda_1._boxed took 77.8ms
interpretation of foo._rarg._lambda_1._boxed took 78.6ms
interpretation of foo._rarg._lambda_1._boxed took 85.6ms
interpretation of foo._rarg._lambda_1._boxed took 54.8ms
interpretation of foo._rarg._lambda_1._boxed took 104ms
interpretation of foo._rarg._lambda_1._boxed took 46.6ms
interpretation of foo._rarg._lambda_1._boxed took 40.1ms
interpretation of foo._rarg._lambda_1._boxed took 58.5ms

The exact values vary, but each time these times account for most of the calculation time. So @Joachim Breitner is right, but I don't understand how lean can spend 100ms calculating fun _ _ => #[].

Jovan Gerbscheid (Mar 12 2024 at 19:18):

I was going to complain about this being a problem when precompileModules := true, because then under certain circumstances the same calculation takes around 40s. But looking at the profiler messages, it turns out that even the usual 1s is problematic.

Jovan Gerbscheid (Mar 12 2024 at 23:54):

It seems my laptop is in a different mood now, and the profiler now gives much smaller values in the messages, closer to 10ms. But the total time of foo is still a bit less than 1s.

Jovan Gerbscheid (Mar 14 2024 at 16:55):

I tried to set precompileModules := true in all of mathlib (https://github.com/leanprover-community/mathlib4/pull/11362), but it takes too long to build. Is there a way to set precompileModules := true only for a few files? I would like to try to precompile only a few files, and then use the functions defined there on all of Mathlib.

Joachim Breitner (Mar 14 2024 at 17:00):

I believe I once read that you can define two libraries, the first one with that flag, and set the modules of interest as roots of that first module, it works.

But I fear this will not be acceptable for mathlib, because the compiled modules cannot be shared in the cache (they are platform dependent), and mathlib without a full cache isn’t fun either.

Jovan Gerbscheid (Mar 14 2024 at 17:08):

Ah, so core Lean being platform dependent is ok, but for Mathlib this is not acceptable?

Henrik Böving (Mar 14 2024 at 18:22):

Jovan Gerbscheid said:

Ah, so core Lean being platform dependent is ok, but for Mathlib this is not acceptable?

core Lean has the capability to distribute platform dependent binaries through elan, all other parts of the Lean eco system currently do not have such a capability

Jovan Gerbscheid (Mar 14 2024 at 20:03):

Patrick Massot said:

You could easily have a check-box somewhere near the top of the widget controlling what is shown.

Could someone help me with this? I have no prior JavaScript or Html experience, and so far I've avoided JavaScript by using a widget_module that was already made in Lean. But I'm not sure If one can make a check-box controlling what is shown using just Html.

𝚠𝚘𝚓𝚌𝚒𝚎𝚌𝚑 𝚗𝚊𝚠𝚛𝚘𝚌𝚔𝚒 (Mar 14 2024 at 20:09):

Jovan Gerbscheid said:

Patrick Massot said:

You could easily have a check-box somewhere near the top of the widget controlling what is shown.

Could someone help me with this? I have no prior JavaScript or Html experience, and so far I've avoided JavaScript by using a widget_module that was already made in Lean. But I'm not sure If one can make a check-box controlling what is shown using just Html.

You cannot :( One of the major limitations of mk_rpc_widget% (see the docstring) is that the component it outputs cannot have state

𝚠𝚘𝚓𝚌𝚒𝚎𝚌𝚑 𝚗𝚊𝚠𝚛𝚘𝚌𝚔𝚒 (Mar 14 2024 at 20:09):

I can help with this next week, though!

Jovan Gerbscheid (Mar 14 2024 at 20:12):

Another option that I imagine might be useful is whether it shows lemmas that introduce new goals. Because often you want to just rewrite and not get any extra goals from it.

Jovan Gerbscheid (Mar 15 2024 at 14:12):

image.png
image.png

I've implemented the two different views, and for now you can switch between them with a config argument. When showing the lemma names, it shows all lemmas that apply instead of filtering them. I also fixed the font of the replacement button to be the code font.

Jovan Gerbscheid (Mar 15 2024 at 14:14):

image.png

And the lemma names are hoverable

Patrick Massot (Mar 15 2024 at 15:10):

Great!

Jovan Gerbscheid (Mar 15 2024 at 20:58):

When the tactic creates new goals, it takes the usernames from the forall binders in the lemma, but this sometimes causes name clashes when there already is a metavariable with that user name. I fixed it by making a function similar to LocalContext.getUnusedName, which creates an unused name from the suggested name.

However, it turns out that Lean is not careful at all with usernames of metavariables. In this example, there just happen to exist (instantiated) metavariables with usernames n and α, for no apparent reason:

example : 1+3 = 4 := by
  let a := ?n
  let b := ?α
  /-
  a : ℕ := 1
  b : Type := ℕ
  ⊢ 1 + 3 = 4
  -/

similarly, rw has some dubious name management for the metavariables it makes. Also curious is that there are MVarId.setTag and MVarId.setUserName which are identical functions for setting the user name, but for getting it there is only MVarId.getTag.

Jovan Gerbscheid (Mar 19 2024 at 11:38):

@Anand Rao Tadipatri wrote some typescript for having a checkbox at the top that controls which view is shown. Is it ok in Mathlib to have Typescript/Javascript, or should this be added to Proofwidgets instead?

Jovan Gerbscheid (Mar 19 2024 at 12:07):

image.png

This is what it looks like. Let me know what you think. I'm slightly worried that the checkbox can get in the way of your view when you don't want to use it.

Patrick Massot (Mar 19 2024 at 14:51):

I agree the current look of the checkbox is pretty distracting.

Patrick Massot (Mar 19 2024 at 14:53):

Jovan Gerbscheid said:

Is it ok in Mathlib to have Typescript/Javascript, or should this be added to Proofwidgets instead?

Does it mean it requires users of Mathlib to have a working node toolchain? If yes then it’s definitely not ok. But if you simply mean shipping some unreadable minified JS file sitting in a corner then I think it’s ok.

Patrick Massot (Mar 19 2024 at 14:54):

Also note you should probably think beyong Mathlib. Unless you have a surprising giant Mathlib dependency then I think Std will want this.

Jovan Gerbscheid (Mar 19 2024 at 17:53):

yeah, it would be an unreadable JS file

Jovan Gerbscheid (Mar 19 2024 at 17:57):

There is one Mathlib dependency, which is that the RefinedDiscrTree code contains some names like Pi.instAdd from Mathlib.Algebra.Group.Pi.Basic. This is because I unfold this specific instance of addition, in order to treat fun x => f x + g x the same as f + g.

Jovan Gerbscheid (Mar 19 2024 at 18:35):

I could imagine reimplementing this so that you can instead dynamically add a reduction procedure by tagging the function. And then the RefinedDiscrTree would always try all of these custom reduction procedures when looking up or inserting. Then this bit of the code could sit in Mathlib, while the main part is in Std.

Kim Morrison (Mar 19 2024 at 21:58):

I mean, Pi.instAdd can just move up. The operations on function types are something everyone needs, while the algebraic typeclasses carrying axioms can stay in Mathlib.

Jovan Gerbscheid (Mar 19 2024 at 22:05):

My code also contains Pi.instInv, so then Inv would also have to move up

Jovan Gerbscheid (Mar 20 2024 at 22:17):

Patrick Massot said:

I agree the current look of the checkbox is pretty distracting.

Does anyone have ideas on how best to deal with this?

𝚠𝚘𝚓𝚌𝚒𝚎𝚌𝚑 𝚗𝚊𝚠𝚛𝚘𝚌𝚔𝚒 (Mar 20 2024 at 22:27):

If you'd like to match the aesthetics of the rest of the infoview, the thing to do is to replace the checkbox by an icon at the same height as the 'Rewrite suggestions' header, similarly to how there are icons to the right of 'Expected type' and 'Tactic state'

Jovan Gerbscheid (Mar 20 2024 at 22:29):

What sort of icon would work here?

𝚠𝚘𝚓𝚌𝚒𝚎𝚌𝚑 𝚗𝚊𝚠𝚛𝚘𝚌𝚔𝚒 (Mar 21 2024 at 00:33):

There is a big list here, or you can just use some clickable blue text like the 'try this' links.

Jovan Gerbscheid (Mar 21 2024 at 00:45):

I just spotted how the tactic state allows you to filter what is shown by hovering over the filter icon. Copying this behaviour is probably the best way to do it.
image.png

Jovan Gerbscheid (Mar 21 2024 at 00:49):

That would also allow for more options/filters to be added in a way that doesn't disturb the original look.

Thomas Murrills (Mar 21 2024 at 01:07):

Just want to mention that if you were in a situation that needed a checkbox, the canonical way to match the theme might be to use the official VS code webview UI toolkit checkbox. (I’m not sure if the infoview actually incorporates and exposes these elements yet, though—this is just “in principle”.)

𝚠𝚘𝚓𝚌𝚒𝚎𝚌𝚑 𝚗𝚊𝚠𝚛𝚘𝚌𝚔𝚒 (Mar 21 2024 at 05:08):

We do not expose that toolkit, we instead just use CSS to style things consistently; for instance to get codicons, you can do something like this

𝚠𝚘𝚓𝚌𝚒𝚎𝚌𝚑 𝚗𝚊𝚠𝚛𝚘𝚌𝚔𝚒 (Mar 21 2024 at 05:10):

Jovan Gerbscheid said:

I just spotted how the tactic state allows you to filter what is shown by hovering over the filter icon. Copying this behaviour is probably the best way to do it.
image.png

Here is the implementation of that

Jovan Gerbscheid (Mar 23 2024 at 21:06):

image.png

I've put a filter button to the right of "Rewrite suggestions:", in the same style as the tactic state. It acts as a button for switching between filtered and non-filtered view. It defaults to the filtered view. I also experimented with using the filled filter icon for the filtered case, but since the filtered view is the default I found this more distracting than useful.

Jovan Gerbscheid (Mar 23 2024 at 21:09):

Something I find a bit awkward is that I only added the minified JS file and not the original TypeScript, meaning that nobody else can see the TypeScript. I feel like this could be a problem for both maintainability and transparency.

𝚠𝚘𝚓𝚌𝚒𝚎𝚌𝚑 𝚗𝚊𝚠𝚛𝚘𝚌𝚔𝚒 (Mar 23 2024 at 21:28):

Jovan Gerbscheid said:

Something I find a bit awkward is that I only added the minified JS file and not the original TypeScript, meaning that nobody else can see the TypeScript. I feel like this could be a problem for both maintainability and transparency.

Yep, it is pretty awkward and as of now we don't have any good solution. What we have been doing so far is to put the TypeScript sources in a package foo that mathlib depends on which is such that foo can build TypeScript, but mathlib can just pull in a foo build archive that contains the JS outputs. Currently this specification uniquely determines foo = ProofWidgets4; if the widget doesn't depend on anything mathlib-specific, you could PR it there and I would review the PR.

𝚠𝚘𝚓𝚌𝚒𝚎𝚌𝚑 𝚗𝚊𝚠𝚛𝚘𝚌𝚔𝚒 (Mar 23 2024 at 21:28):

The other option is to write it in a readable .js file (i.e., without types) instead and just include_str that file.

Jovan Gerbscheid (Mar 23 2024 at 22:43):

I see. The TypeScript uses HtmlDisplay which is defined in but not exported by ProofWidgets, so it would be natural to add it to ProofWidgets. But I feel like it may be better to wait with PRing it there until we agree that this is indeed the right interface for rw??.

Jovan Gerbscheid (Mar 24 2024 at 11:52):

I discovered that there is a no_index annotation used to influence the DiscrTree indexing, so I now implemented this in the RefinedDiscrTree.

However this can cause the pattern to be printed incorrectly, printing also the part that is inside a no_index annotation. Is there a way to tell the pretty printer to use an ellipsis for expressions annotated with no_index?

Jovan Gerbscheid (Mar 24 2024 at 11:54):

I also noticed that the comment

-- See note [no_index around OfNat.ofNat]

appears all over Mathlib, but I can't seem to find this note.

Timo Carlin-Burns (Mar 24 2024 at 16:00):

We have to use no_index around OfNat.ofNat n where n is a variable due to lean4#2867

Timo Carlin-Burns (Mar 24 2024 at 16:01):

Here is the library note: https://github.com/leanprover-community/mathlib4/blob/a746c523bc0f831c9d1f39d77c382ac798748e02/Mathlib/Data/Nat/Cast/Defs.lean#L67

Jovan Gerbscheid (Mar 24 2024 at 16:35):

I think you meant lean4#2867

Timo Carlin-Burns (Mar 24 2024 at 16:36):

Thanks, edited

Jovan Gerbscheid (Mar 24 2024 at 23:13):

As far as I can tell, you've mostly fixed the issue in your draft PR that changes the behaviour of DiscrTree, which is to keep the OfNat.ofNat around number literals, instead of removing it as is the current behaviour. This would remove the need for the no_index annotations and thus would improve performance.

There are still some minor issues with OfNat.ofNat appearing twice or not appearing at all around number literals. But this could also be handled by the DiscrTree.

Is that right?

Jovan Gerbscheid (Mar 24 2024 at 23:13):

Once this lands in Lean, I'll copy the behaviour over to RefinedDiscrTree

Timo Carlin-Burns (Mar 24 2024 at 23:40):

Yep, although my current thinking is that it should not be the DiscrTree's job to fold nested OfNat.ofNat. Provided that it doesn't fold them, you could write a simp lemma OfNat.ofNat (OfNat.ofNat n) = OfNat.ofNat n

Jovan Gerbscheid (Mar 25 2024 at 20:09):

image.png

I think that when you want to rewrite something, one natural thing to do is to replace the expression by its definition. However, not every definition has a corresponding rewrite lemma in Mathlib, for example Function.Surjective doesn't have this. This means that rw?? doesn't suggest replacing this by its definition.

I still have an open PR in std, with a tactic unfold' which allows you to unfold a targeted expression, where unfolding can be replacing a constant or free variable by its definition, or unfolding a projection, or doing beta, eta or zeta reduction. I think it would be great if the rw?? suggestions would contain one section with the unfolded versions of the expression. When selecting one of these, we could paste the unfold' tactic. But it is also possible to do any definitional rewrite with rw [show current = replacement from rfl]. I could imagine this actually being better because it shows in the tactic call what the replacement will be. And then there is no need to wait for the other PR.

Often, replacing by definition isn't strictly necessary, but it can still be very useful to see the replaced definition in the goal.

Jovan Gerbscheid (Mar 25 2024 at 20:25):

These unfolding suggestions could also be implemented in some other tactic, e.g. unfold??, so that these results can be seen without doing the more expensive library search. Then the question would be whether we also want to see them when using rw??.

Patrick Massot (Mar 25 2024 at 21:05):

I think it’s nice to see them. Maybe there could be some visual indication that this is the definition.k

Jovan Gerbscheid (Mar 25 2024 at 23:27):

Here's what I have now:

image.png
image.png

Let me know any suggestions for the interface.

Jovan Gerbscheid (Mar 25 2024 at 23:29):

And I've defined unfold? to show just the definitional rewrites

Jovan Gerbscheid (Mar 26 2024 at 00:29):

I'm not so sure how to feel about this mess though

image.png

For these complicated expressions, the pasting into the editor tends to not be clean; for example I had to turn of match notation, and then still it isn't able to type check it unless I manually add the type for the literals 1 and 5

Patrick Massot (Mar 26 2024 at 01:54):

Why is the Function.Surjective pattern duplicated in your first screen-shot?

Patrick Massot (Mar 26 2024 at 01:55):

The ordinal stuff in the last screen-shot is indeed a nightmare but I don’t know what to do.

Jovan Gerbscheid (Mar 26 2024 at 07:47):

Patrick Massot said:

Why is the Function.Surjective pattern duplicated in your first screen-shot?

You can see the difference by hovering over the f. The first one is f : α → α and the second one is f : α → β.

Jovan Gerbscheid (Mar 26 2024 at 08:13):

image.png

Here's some more funny business, which seems to be a symptom of non-structurally recursive definitions.

Jovan Gerbscheid (Mar 26 2024 at 09:52):

Is there a way to get the naïve definition of a non-structurally recursive definition, instead of this funny _unary stuff? For example in the case of Nat.div that would be if 0 < y ∧ y ≤ x then Nat.div (x - y) y + 1 else 0

Jannis Limperg (Mar 26 2024 at 10:30):

You can get the unfolding equation for a function defined by wfrec (or any other recursive function):

import Lean

open Lean Lean.Meta

def foo (n : Nat) : Nat :=
  if n < 10 then foo (n + 1) else n
termination_by 10 - n

-- _private.Test.0.foo._unfold
-- ∀ (n : Nat), foo n = if n < 10 then foo (n + 1) else n
run_meta do
  let some eqn ← getUnfoldEqnFor? ``foo
    | throwError "no unfold equation"
  logInfo m!"{eqn}"
  let type := (← getConstInfo eqn).type
  logInfo m!"{type}"

Jovan Gerbscheid (Mar 26 2024 at 11:14):

Is there also a way to see whether the definition is defined by wf recursion? So that I can case split on that.

Jannis Limperg (Mar 26 2024 at 11:51):

@Joachim Breitner would know this.

Jovan Gerbscheid (Mar 26 2024 at 11:52):

I figured it out, I need the function Elab.WF.eqnInfoExt.find?, which also provides the definition.

Jovan Gerbscheid (Mar 26 2024 at 11:57):

image.png

This is what it does now for division. I guess the next thing to do would be figuring out how to unfold decidable if-else statement.

Joachim Breitner (Mar 26 2024 at 12:01):

Would it make sense to only show the first unfolding? If the user wants to go further, they can click on it and then see the next one, can’t they? (Maybe the produced tactic syntax could combine consecutive unfolding steps?)

Jovan Gerbscheid (Mar 26 2024 at 12:03):

For something like division, I think the result Div.div is the least useful one, which would be the first unfolding. So I think it makes sense to show more in that case.

Joachim Breitner (Mar 26 2024 at 12:03):

Also, at least for wf recursion, it’s probably better to treat them as opaque and only use the foo.eq_def and (better, if applicable) foo.eq_1… lemmas that lean produces (these are the names on master, they have different names before that.)

Jovan Gerbscheid (Mar 26 2024 at 12:06):

  if let .const n lvls := e.getAppFn then
    if let some info := Elab.WF.eqnInfoExt.find? (← getEnv) n then
      return (info.value.instantiateLevelParams info.levelParams lvls).beta e.getAppArgs

This is the code for unfolding the well founded recursive expression e. I think it is pretty clean

Jovan Gerbscheid (Mar 26 2024 at 12:09):

I presume you mean the lemmas foo._unfold and foo._eq_1, foo._eq_2 ..., which are generated by getUnfoldEqnFor? and getEqnsFor? respectively

Joachim Breitner (Mar 26 2024 at 12:29):

Right! These lemmas on master have public names (.eq_def, .eq_1) and are generated on demand, so rw [foo.eq_def] would be the canonical way to rewrite a recursive definition. (And using rw [foo.eq_1] instead will lead to smaller terms without lots of case on the right hand side)

Joachim Breitner (Mar 26 2024 at 12:30):

(BTW, you might want to test your code with mutually recursive functions as well, not sure if info.value will be the right one then)

Jovan Gerbscheid (Mar 26 2024 at 12:58):

There seems to be nothing wrong with mutually recursive functions.

Joachim Breitner (Mar 26 2024 at 12:58):

Ok, good. I think I confused something in the back of my mind here, nevermind then :-)

Jovan Gerbscheid (Mar 26 2024 at 13:02):

What is interesting, is that for structural recursive definitions, there is this thing called smart unfolding, which automatically unfolds the match expressions. The result if that for

def f : Nat → Nat
  | 0 => 0
  | n + 1 => f n
termination_by i => i

My code always finds the same definition, match n with | 0 => 0 | Nat.succ n => f n, but when I drop the termination_by from the definition, then because I use unfoldDefinition?, it only succeeds when it can successfully execute the match with 0 or Nat.succ.

Jovan Gerbscheid (Mar 26 2024 at 13:04):

image.png

This is in the case without the termination_by.

Jovan Gerbscheid (Mar 26 2024 at 13:09):

This is actually really weird, it also doesn't do smart unfolding on non-recursive definitions:

image.png

Jannis Limperg (Mar 26 2024 at 13:16):

That's intended; smart unfolding equations are not generated by default for non-recursive defs.

Jovan Gerbscheid (Mar 26 2024 at 13:18):

Is it also intended that the presence of a smart unfolding equation causes some expressions to not unfold even though they would unfold with normal unfolding?

Jannis Limperg (Mar 26 2024 at 13:52):

That's a question for Leo I'm afraid.

Patrick Massot (Mar 26 2024 at 14:36):

Jovan, I don’t remember if I already wrote this, but it would be nice to make sure that you keep a clean separation between the interface and all the heavy lifting code in your work. There may be other uses of the back-end work.

Jovan Gerbscheid (Mar 28 2024 at 13:54):

Thanks Patrick, I have now rewritten the code, so that there is a function getRewrites which returns all of the applicable rewrites (the heavy lifting part). And then this is used by the front-end functions.

I also added a filter to the unfolding section, forbidding it from showing unfolds that contain a constant satisfying Name.isInternalDetail. This removes most of the messy unfolds that we don't want.

I also realized that I should apply whnfCore after each unfold. So now I think there is no visible difference anymore between smart unfolding and normal unfolding. This also for example automatically unfolds if-else statements.

Jovan Gerbscheid (Mar 29 2024 at 13:50):

It is a bit tricky to make a test file for the rw?? tactic, because the tests depend on which expression is selected in the infoview. Is it required/important to have a test file, and if so, how would I do it?

Jovan Gerbscheid (Mar 29 2024 at 13:52):

Also, would it be good to have a command like #rw?? that allows you to see the rewrite suggestions for an arbitrary expression?

Patrick Massot (Mar 29 2024 at 15:39):

Last time I asked @Wojciech Nawrocki there was no way to write tests for the widget part. But you can definitely write tests for the heavy-lifting part. And having a #rw?? is one way of doing it easily.

Patrick Massot (Mar 29 2024 at 15:40):

By the way, do we need to keep rw? if we get rw??. It would be more consistent to only get the powerful one.

Jovan Gerbscheid (Mar 30 2024 at 00:23):

Yeah, replacing rw? seems like a good idea.

Jovan Gerbscheid (Mar 30 2024 at 00:26):

I made another change to the unfolding, so that it doesn't show unfoldings of default_instance instances projections. In particular this means that for functions like HAdd.hAdd that unfold to Add.add, I don't show the Add.add unfolding. I don't want to show these because those type classes are purely notational and don't have significant meaning.

Jovan Gerbscheid (Mar 30 2024 at 00:33):

I noticed this very slow instance synthesis that slows down rw?? at the lemma HasSum.mul_eq:

failed to synthesize
  T3Space ℂ
(deterministic) timeout at 'typeclass', maximum number of heartbeats (20000) has been reached (use 'set_option synthInstance.maxHeartbeats <num>' to set the limit)
-/

But setting the heartbeat limit to 21000 makes it succeed, after 600ms.

Jovan Gerbscheid (Mar 30 2024 at 15:53):

This lemma HasSum.mul_eq does the rewrite s * t = u (with a bunch of hypotheses). This doesn't seem to me like something anyone would ever want to rewrite with, so I now made it so that lemmas of the form a = b where b is a variable that doesn't appear in a, are not indexed in the discrimination tree. This saves a lot of time when rewriting with the pattern x + y : Complex.

Jovan Gerbscheid (Mar 30 2024 at 15:55):

I've also implemented the #rw?? command. This view is rendered as a MessageData, so that the output can be put in a logInfo. I think it is less clear than the original view.

image.png

Patrick Massot (Mar 30 2024 at 16:54):

This is mostly a debugging and testing command, right? So having a less clear about is not too bad.

Jovan Gerbscheid (Mar 31 2024 at 20:07):

I mentioned earlier that I didn't like that detecting duplicate rewrites is based on the printed string (in fact, setting pp.explicit to true could change the behaviour). I figured out now how to do this in a more principled way: I define isExplicitEq which gives that two expressions are equal if the explicit parts are equal, and the implicit parts are isDefEq. So if two expressions are only different in an implicit or an instance argument and that part is definitionally the same, we still count them as equal.

Jovan Gerbscheid (Mar 31 2024 at 23:21):

A funny consequence of this is that when rewriting 42 : Real, there are 5 lemmas that turn this into ↑42, and it's quite tricky to tell the difference because the replacement expression doesn't have hover information (because it is a button). 4 of these correspond to the inner 42 being a Nat, Int, Rat or Real, and the last one has the inner 42 being nat_lit 42 (so it is not an application of OfNat.ofNat). That last one is also suggested by the unfolding section, but rw [show 42 = ↑42 from rfl] doesn't work there; you would need to write rw [show (42:Real) = Nat.cast (nat_lit 42) from rfl].

Jovan Gerbscheid (Mar 31 2024 at 23:36):

I'm now also removing lemmas from Init.Omega in addition to Mathlib.Tactic.

𝚠𝚘𝚓𝚌𝚒𝚎𝚌𝚑 𝚗𝚊𝚠𝚛𝚘𝚌𝚔𝚒 (Apr 01 2024 at 03:31):

My suggestion would be to instead provide some kind of extensible mechanism for users to mark which kinds of lemmas they do not want used; it is not very scalable to hardcode it in the tactic implementation. Attributes, environment extensions, and options (as in set_option) are some ways to provide configurability.

Jovan Gerbscheid (Apr 01 2024 at 11:15):

Do you imagine this marking to happen in the library, saying which lemmas should be avoided by rw??, or by the user that is using rw?? with a set_option? Or both?

Jovan Gerbscheid (Apr 01 2024 at 11:41):

The configuration is in principle controlled by a function of type (moduleName declName : Name) -> Bool (possibly in the MetaM monad). I suppose one way to have this configurable would be to have two tags, add and remove, which allow you to specify which lemmas you want and which you don't want with a function of the above type. And then the more recently tagged functions overwrite the less recent ones. So in this case there would be a tagged function excluding stuff in Mathlib.Tactic and one excluding stuff in Init.Omega.

Jovan Gerbscheid (Apr 01 2024 at 21:13):

The easiest for now would be to just have an extensible list of prefixes for modules from which lemmas are ignored, for now being Init.Omega and Mathlib.Tactic. This is also nice for building the discrimination tree, as we can just skip all declarations in these modules. I also think it is fair to expect a good library to separate lemmas that are part of the API from lemmas that aren't supposed to be seen by users.

𝚠𝚘𝚓𝚌𝚒𝚎𝚌𝚑 𝚗𝚊𝚠𝚛𝚘𝚌𝚔𝚒 (Apr 02 2024 at 01:05):

As far as implementing something goes, an option would be easiest for you, I think. It is just one call to register_option. Though there is no Array datatype for options, so it'd have to be a string with a list of modules that should be excluded. An option is 'global' in some sense since one has to list all the modules they want excluded rather than appending to some ambient list, but it'd work.

I also think it is fair to expect a good library to separate lemmas that are part of the API from lemmas that aren't supposed to be seen by users.

This is true, but it is a problem with Lean ~~(and dependent type theory)~~ in general, not with this widget: there is no proper abstraction/encapsulation mechanism. Certainly if there were a list of namespaces whose contents are exported (because they have to, due to the abstraction issues), but which shouldn't be shown by default in rw?, apply?, exact? and so on (e.g. because they are implementation details of something), all those tactics might want to use it.

Mario Carneiro (Apr 02 2024 at 01:07):

Lean (and dependent type theory) in general, not with this widget: there is no proper abstraction/encapsulation mechanism

I think Coq users would beg to differ (c.f. "modules" from ML / Coq)

Jovan Gerbscheid (Apr 02 2024 at 01:41):

What do you mean with abstraction/encapsulation?

𝚠𝚘𝚓𝚌𝚒𝚎𝚌𝚑 𝚗𝚊𝚠𝚛𝚘𝚌𝚔𝚒 (Apr 02 2024 at 02:31):

Mario Carneiro said:

Lean (and dependent type theory) in general, not with this widget: there is no proper abstraction/encapsulation mechanism

I think Coq users would beg to differ (c.f. "modules" from ML / Coq)

You're right, I shouldn't make claims about languages I am not very familiar with; edited. Do you know how/if Coq modules account for the fact that despite signature-based encapsulation, users might still want to peek inside the definition of a function in order to prove something about it? Is it just forbidden? (FWIW, I think the right choice is to forbid that and provide a type signature sufficiently strong to imply any sensible extensional property (so in particular, not contingent properties of a specific implementation)).

𝚠𝚘𝚓𝚌𝚒𝚎𝚌𝚑 𝚗𝚊𝚠𝚛𝚘𝚌𝚔𝚒 (Apr 02 2024 at 02:34):

Furthermore do you know how/if Coq modules account for definitions that must be exposed to have dependent type-checking of an implementation go through? The canonical example is that with zeros n : vec n, zeros n : vec (n+0) needs to look inside (+), and so that particular definition cannot remain fully abstract (c.f. controlling unfolding))?

𝚠𝚘𝚓𝚌𝚒𝚎𝚌𝚑 𝚗𝚊𝚠𝚛𝚘𝚌𝚔𝚒 (Apr 02 2024 at 02:45):

I think the right choice is to forbid that and provide a type signature sufficiently strong to imply any sensible extensional property

Expanding a bit, I only suggest this for CS-y algorithm implementations; e.g., if you have an implementation of a queue, you probably want to prove that it is, in fact, a queue, and leave the rest behind an abstract signature. On the other hand, a mathematical function one may wish to leave unfoldable donwstream (assuming one wants access to defeqs). I acknowledge the distinction is not always well-drawn.

𝚠𝚘𝚓𝚌𝚒𝚎𝚌𝚑 𝚗𝚊𝚠𝚛𝚘𝚌𝚔𝚒 (Apr 02 2024 at 03:06):

Jovan Gerbscheid said:

What do you mean with abstraction/encapsulation?

The introduction to the 'unfolding' paper linked above covers some of the modularity issues plaguing DTT. There were once rumors of work on a module system for Lean; I am not sure what the status of that is. Such features are desirable to provide statically checkable abstraction boundaries. Among other things, abstraction can be used to inform downstream consumers of a package which definitions they can freely use, and which ones are implementation details that may break at any time.

But in the case of Init.Omega we just discussed, it's a simpler thing. The issue is that as soon as you import omega, every definition in the implementation (e.g. this stuff) is imported wholesale into your environment, and the implementation details are not marked as such in any way. This is exactly why you end up needing an option/attribute/whatever to figure that out. Namespaces that are not open by default make this a non-issue for human-written proofs (because why would you open Lean.Omega?), but automation (i.e., tactics, like rw??) will go through the entire environment including Lean.Omega. There technically is a mechanism to prevent that: you can annotate definitions with private. It used to be the case that private definitions could not be made un-private downstream (e.g. if you do actually need to use some internal function from an imported module, and are okay with it breaking under you on version bumps) and people would complain, but now there is open private so that it's a non-issue. Nevertheless, the private mechanism is not really a first-class feature. For instance, afaict there is no private section or private namespace. So the omega implementers would have to mark every definition private, which is rather annoying.

𝚠𝚘𝚓𝚌𝚒𝚎𝚌𝚑 𝚗𝚊𝚠𝚛𝚘𝚌𝚔𝚒 (Apr 02 2024 at 03:10):

And to balance out the complaining somewhat, the usual disclaimer: this stuff is difficult, and I acknowledge the hard work the Lean design team is doing :)

Jovan Gerbscheid (Apr 03 2024 at 14:21):

I've implemented the option for excluding modules. The default value is "Init.Omega Mathlib.Tactic".
For example this works:

set_option librarySearch.excludedModules "Mathlib"
variable (n : Nat)
/-
info: Pattern n + 1
· Nat.succ n
  Nat.add_one
· Int.toNat (↑n + 1)
  Int.toNat_ofNat_add_one

Pattern n + m
· 1 + n
  Nat.add_comm
· Nat.add n 1
  Nat.add_eq
-/
#rw?? n+1

Jovan Gerbscheid (Apr 03 2024 at 14:23):

I've also added some trace messages, so now, using trace.profiler, you can see which lemmas were slow to check.

Jovan Gerbscheid (Apr 03 2024 at 14:29):

Currently whenever an error (such as a max heartbeat error) happens when checking a lemma, it is caught and not propagated anywhere. Would it be useful to show such errors somehow? For example in the place where the lemma would have been, it could say

An error occurred while checking the lemma <lemma name>:
<the error>

Jovan Gerbscheid (Apr 03 2024 at 14:34):

I also wonder if there is a nice way to display a lemma name with hover information. The problem is that when printing it as an expression, it gets the @ symbol prepended if it has implicit arguments, which I don't want. The way I currently circumvent this is very ugly: I first compute it with the @ symbol, and then I try to replace the text with just the lemma name.

Jovan Gerbscheid (Apr 04 2024 at 11:36):

A while ago I switched to doing unification with rewrite lemmas in the reducible transparency setting, which is quite a bit faster than default. However, this misses some lemmas that could otherwise unify, because of the fact that RefinedDiscrTree indexes fun x => f x + g x the same as f + g (this only unifies in at least the instances transparency) and indexes fun x => x the same as id (this only unifies in at least the default transparency), whenever these appear as strict subexpressions. This only comes up when you have a higher order function/predicate such as Continuous, but that is quite rare when rewriting, so I think this is no big deal.

At some point I'll write a version of isDefEq that uses the RefinedDiscrTree normal form. This would correctly handle the above examples and be at least as efficient as isDefEq with reducible transparency.

Jovan Gerbscheid (Apr 05 2024 at 13:02):

Some things that could be added are:

rewriting with local hypotheses. I'm not so sure how useful this is because I think in that case people would prefer to use rw directly.
trying to close new goals using a discharger, like solve_by_elim. I'm not sure if this is worth it as it will probably make the tactic a bit slower, and in most cases we rewrite with lemmas that don't generate new goals anyways.

Jovan Gerbscheid (Apr 05 2024 at 13:03):

If there is nothing more to add, I'll mark the PR as awaiting-review.

Patrick Massot (Apr 05 2024 at 16:44):

I think the first point would be to have since it still helps with cases where the pattern appears in several places and you want to rewrite only once, right?

Jovan Gerbscheid (Apr 05 2024 at 22:40):

That's true, although the occurrence syntax for rewriting isn't that hard to use directly.

If I do add it, I think the most natural way is to have separate sections at the start for the local hypothesis rewrites, instead of putting these in between the other library rewrites. I also do this with the lemmas from the current file, putting them in separate sections before the rest.

Jovan Gerbscheid (Apr 06 2024 at 00:53):

While implementing this, I found out that the lemma that is currently being proved is also available as a local hypothesis (so it can be used recursively). I presume we do not want to suggest this as a rewrite?

Jovan Gerbscheid (Apr 06 2024 at 13:54):

image.png

I've implemented this now.

When searching for lemmas in the current file, or in the local hypotheses, I first compute the respective discrimination trees, and then lookup as usual. This is a bit of a funny use of a discrimination tree as I'm building it for one time use, but I think it is efficient enough that this is no problem.

Jovan Gerbscheid (Apr 06 2024 at 14:15):

Something else I was wondering is whether it is desired to have rewrite results that have new metavariables in the replacement expression. Currently I only exclude rewrites where the replacement is a new metavariable.

Jovan Gerbscheid (Apr 06 2024 at 14:24):

Another thing is instantiating metavariables through rewriting. I haven't implemented this so far since it is an uncommon situation, but the discrimination tree allows for this kind of lookup. Do you think this is desirable? One disadvantage could be that when you click on an expression with a metavariable in it and you don't want to instantiate it, then the result gets cluttered with rewrites that do instantiations that you don't want. Although this could be solved by sorting it so that the instantiating rewrites are shown last.

Richard Osborn (Apr 06 2024 at 14:30):

It seems like the tactic is basically ready for use? Is it possible to start reviewing the code for merging into mathlib? I figure people will have a better idea of features they would like added once they've had some time using the tactic.

Jovan Gerbscheid (Apr 06 2024 at 14:31):

Yes, I'll put it on awaiting-review

Patrick Massot (Apr 06 2024 at 14:50):

I agree that it is probably in a state where getting actual user feedback could be useful to make the final decisions.

Patrick Massot (Apr 06 2024 at 14:55):

But you will probably need to split the PR to get reviews. There are at least three pieces: the refined discrimination tree, the unfold widget and the rewriting stuff. Maybe you could even split the rewriting stuff into the backend and the widget because they probably require different people reviewing the code.

Jovan Gerbscheid (Apr 06 2024 at 18:07):

How does splitting a PR work?

Jovan Gerbscheid (Apr 06 2024 at 21:45):

Wojciech Nawrocki said:

If the widget doesn't depend on anything mathlib-specific, you could PR it there and I would review the PR.

I've PR-ed the TypeScript now in Proofwidgets.

Patrick Massot (Apr 06 2024 at 22:31):

Jovan Gerbscheid said:

How does splitting a PR work?

You simply open new PRs with part of the code.

Jovan Gerbscheid (Apr 06 2024 at 22:32):

But the rewrite code depends on the discrimination tree code, how do I use the code from one PR in the other?

Kim Morrison (Apr 06 2024 at 22:33):

The PRs will be stacked; i.e. one will contain a superset of the changes of the other.

Kim Morrison (Apr 06 2024 at 22:33):

Alternatively, just make the discrtree PR first, and keep the rest in a private branch.

Jovan Gerbscheid (Apr 06 2024 at 22:50):

Is it really a good idea to split into a bunch of different PR's? Because then it will get much harder to keep track of changes and also to test them. If someone wants to review the discrimination tree part, they can just look in that file in the current PR, right?

Patrick Massot (Apr 06 2024 at 23:11):

Experience proves that splitting works a lot better. You will be very frustrated otherwise because most likely nothing will happen if you stick to your giant PR that requires many different reviewer skills. Since I think this is really important work, I think you should split.

Jovan Gerbscheid (Apr 06 2024 at 23:55):

Ok, here is the separate PR for RefinedDiscrTree: #11968

Kevin Buzzard (Apr 07 2024 at 00:17):

(in case you didn't spot it, CI has failed)

Jovan Gerbscheid (Apr 07 2024 at 00:29):

It warns about unused imports, and suggests some other imports. But these suggestions are nonsense

warning: import #[Lean.Meta.Tactic.Simp.SimpCongrTheorems, Lean.Meta.Tactic.IndependentOf] instead

I've replaced it with the actual minimal imports, hopefully it succeeds now.

Mario Carneiro (Apr 07 2024 at 00:33):

can you be more specific?

Jovan Gerbscheid (Apr 07 2024 at 00:36):

The full two warnings:

Warning: ./././Mathlib/Lean/Meta/RefinedDiscrTree.lean:6:1: warning: unused import (use `lake exe shake --fix` to fix this, or `lake exe shake --update` to ignore)
Warning: ./././Mathlib/Lean/Meta/RefinedDiscrTree.lean:8:1: warning: import #[Lean.Meta.Tactic.Simp.SimpCongrTheorems, Lean.Meta.Tactic.IndependentOf] instead

My code doesn't contain anything related to simp or the independentOf tactic, so it is weird that it suggests importing these files.

Jovan Gerbscheid (Apr 07 2024 at 00:36):

What I did to minimize the import is replace import Lean.Meta with import Lean.Meta.DiscrTree

Mario Carneiro (Apr 07 2024 at 01:24):

I added an --explain mode to shake to explain where the references are coming from. For that commit, it says:

  note: Mathlib.Lean.Meta.RefinedDiscrTree requires Lean.Meta.Tactic.Simp.SimpCongrTheorems
    because Std.Range.forIn.loop._at.Lean.Meta.RefinedDiscrTree.MkDTExpr.getIgnores._spec_1._lambda_1._cstage2
    refers to Lean.throwError._at.Lean.Meta.mkSimpCongrTheorem._spec_4
  note: Mathlib.Lean.Meta.RefinedDiscrTree requires Lean.Meta.Tactic.IndependentOf
    because Lean.Meta.RefinedDiscrTree.DTExpr.eqv.go._cstage2
    refers to Lean.HashSetImp.contains._at.Lean.MVarId.isIndependentOf._spec_1

Mario Carneiro (Apr 07 2024 at 01:25):

In other words, lean is adding spurious dependencies because certain specializations of some functions have been precompiled upstream somewhere

Jovan Gerbscheid (Apr 07 2024 at 01:39):

That's interesting. I presume we don't want such imports? I think they would be confusing. And they aren't suggested anymore now that I've minimized the import.

Mario Carneiro (Apr 07 2024 at 06:52):

#11981

Mario Carneiro (Apr 07 2024 at 06:55):

Which won't compile because it (predictably) uncovers some new unnecessary imports. Can I interest someone in taking over the PR and fixing the imports? (@Yaël Dillies perhaps?)

Yaël Dillies (Apr 07 2024 at 06:58):

Currently in an airplane but I can take a look in a few hours

Ruben Van de Velde (Apr 07 2024 at 08:42):

I pushed the suggestions, hope they work :)

Ruben Van de Velde (Apr 07 2024 at 08:48):

They didn't

Jovan Gerbscheid (Apr 07 2024 at 09:28):

By the way, in the new update to vscode, they've added a feature where section titles are shown in the minimap (https://code.visualstudio.com/updates/v1_88#_minimap-section-headers). I think it would be amazing to have something like this in Lean. I have some titles in my code using the suggested /-! ### My title -/ form, but they don't really help because they don't stand out from the rest of the file in any way on the minimap.

Jovan Gerbscheid (Apr 08 2024 at 17:01):

I've now also made a separate PR for the unfold? tactic: #12016
This one is quite a bit smaller, so should be easier to review.

Jovan Gerbscheid (Apr 16 2024 at 20:25):

An issue I hadn't realized before is that sometimes rw fails with an error saying "motive is not type correct". This is currently not detected by rw??. So whenever the motive is not type correct it still shows the library rewrites and these can still be pasted into the editor, but then give this error. I could detect this, so that instead of showing the library rewrites, it explains why this particular subexpression cannot be rewritten. Is this desired, or do we still want to see the library rewrites, even if rw doesn't work there? Or a warning message, followed by the library rewrites?

Here's an example:

image.png

Kim Morrison (Apr 17 2024 at 03:05):

I think we just don't want to see those results at all.

Jovan Gerbscheid (Apr 17 2024 at 09:46):

I've fixed it now. Instead of showing the results, it says
The selected expression cannot be rewritten, because the motive is not type correct. This usually occurs when trying to rewrite a term that appears as a dependent argument.

Jovan Gerbscheid (Apr 17 2024 at 09:52):

I've also done this for the unfolding part of the tactic, as this uses rw [show current = replacement from rfl] to do the replacement. However, even when the motive is not type correct, replacing something by something definitionally equal still is valid. If this is enough of a problem, we could make a rw_defEq tactic that takes as arguments current and replacement, and does the rewrite assuming current and replacement are defEq

Jovan Gerbscheid (Apr 18 2024 at 18:19):

I've made another little fix: If you try to rewrite Nat.Coprime 2 3, there used to be a section of results labelled n = 1, because Nat.Coprime is a reducible constant reducing to Nat.gcd m n = 1. But these lemma's don't actually work with rw, because in addition to definitional equality with reducible transparency, rw requires the HeadIndex and the number of arguments of the head to match exactly. So now I also check that these match.

Jovan Gerbscheid (Apr 29 2024 at 22:58):

I recently reimplemented how the RefinedDiscrTree handles Pi type instances. Now it does this for the following operations: +, *, -, /, ⁻¹, +ᵥ, • ,^. I thought that I had done all the relevant pi type instances. However I found that there is also the alternative instance Pi.smul'. And there is Pi.sdiff for set difference.

Is there a complete list of these Pi type instances somewhere, and are they common enough that they will appear in someone's library search?

This is not so relevant for rewriting, but more relevant to something like apply?, which I will want to make in the future using this discrimination tree, and for funprop, which uses this discrimination tree.

Jovan Gerbscheid (Apr 30 2024 at 01:27):

There are also types such as PiLp that are a copy of a Pi type and that get a copy of the standard Pi type instance on some of these operations. These things are not handled separately by my discrimination tree, but I think that's fine.

Jovan Gerbscheid (Apr 30 2024 at 01:32):

There's also Pi.instNatCast which I deal with by stripping away any lambdas that appear in front of number literals. I don't reduce over-applications of number literals, since it seems very unlikely to me that anyone would leave those in an expression anyways.

Jon Eugster (Jun 25 2024 at 13:36):

@Jovan Gerbscheid I started looking at your PR's surrounding #11768. As far as I can tell the only things your tactic uses from Mathlib are some widget utilities in Mathlib.Tactic.Widget.SelectSidePanelUtil.

So I was wondering, what's the motivation to add it to mathlib instead of providing it as a standalone package which is imported in mathlib, similar to e.g. aesop? Maybe the standalone approach would be easier to maintain?

Jovan Gerbscheid (Jun 25 2024 at 13:55):

I simply hadn't considered the standalone package idea, so I also don't know if that's easier.

Another way in which it depends on Mathlib is via the discrimination tree, where the code deals with some pi instances, and the Inv.inv constant. But it doesn't import these constants, because it just contains the constants as a Name.

If you think the stand alone package approach is better, I could try that.

Jovan Gerbscheid (Jun 25 2024 at 14:00):

Something else I was considering was to do what the LazyDiscrTree does in my own discrimination tree. The original reason I rejected this was because building the lazy discrimination tree from scratch takes too long. But this starting tree could be cached via the same caching mechanism, making the cached data much smaller, and easier to extend with further imports.

Jon Eugster (Jun 25 2024 at 14:19):

I don't know if it's necessarily better. I just know that it's a lot of code and I understand why it hasn't been reviewed in quite some time. And I was just wondering if there would be repeatedly long review times whenever there are additions/changes. Having it separate might increase the maintainability, but you'd know better about the technical details.

Simply wanted to ask first before looking at it in more detail :)

Jon Eugster (Jun 25 2024 at 14:34):

Oh, I just noticed that the main content of #11968 (or an older version thereof?) already made it into mathlib as Mathlib.Tactic.FunProp.RefinedDiscrTree

Jovan Gerbscheid (Jun 25 2024 at 22:36):

That sounds reasonable.

Something that would have to be figured out is how to set up mathlib CI to build the cache (I don't really know how it works I just copied what rw? did)

Jovan Gerbscheid (Jun 25 2024 at 22:36):

Yes fun_prop uses an old version of my refined discrimination tree

Kim Morrison (Jun 25 2024 at 23:54):

@Jovan Gerbscheid, maybe I'm misunderstanding here, but Mathlib no longer builds a cache for exact? or rw?, and we do not want to go back to doing this sort of thing.

Kim Morrison (Jun 25 2024 at 23:54):

(so if you copied what Mathlib's rw? did in the past, please don't!)

Jovan Gerbscheid (Jun 26 2024 at 06:55):

Ah, I see. I indeed copied the old behaviour of rw?.

Kim Morrison (Jun 26 2024 at 06:58):

The new behaviours (up in the lean4) repo is very different, and doesn't require any caching, but is quite complicated! (i.e. I hope I don't have to touch it :-)

Jovan Gerbscheid (Jul 02 2024 at 17:41):

I'm wondering - would it be beneficial to use LazyDiscrTree instead of DiscrTree for things like type class search, and simp? I would guess that this could speed up imports. Or is the amount of instances and simp lemmas sufficiently small that this is negligible?

Jovan Gerbscheid (Sep 12 2024 at 11:49):

I have now rewritten the RefinedDiscrTree implementation based on LazyDiscrTree. With this, rw?? can be used without any pre built cache, and works with any import you want. I carefully optimized the initialization of the cache, so even though the code isn't precompiled, the once per file cost of building the cache is about 1.5 seconds when importing Mathlib.

So #11968 and #11768 are ready for review again. (cc @Jon Eugster)

Jovan Gerbscheid (Sep 12 2024 at 11:53):

Here is an example of what the now lazy RefinedDiscrTree looks like with the following lemmas:

axiom foo1 (f : Rat → Rat) : Continuous (fun x : Rat => (f x)⁻¹) = True
axiom foo2                 : Continuous (fun x : Rat => x⁻¹) = True
axiom foo3 (c : Rat)       : Continuous (fun _ : Rat => c) = True
axiom foo4                 : Continuous (fun x y : Rat => x + y) = True

after looking up #rw?? Continuous (Inv.inv : Rat → Rat):

Discrimination tree flowchart:
⟨Continuous, 5⟩ =>
  ⟨Rat, 0⟩ =>
    ⟨Rat, 0⟩ =>
      *0 =>
        *1 =>
          *2 =>
            entries: #[foo3]
          ⟨Inv.inv, 3⟩ =>
            ∀ =>
              ⟨Rat, 0⟩ =>
                ⟨Rat, 0⟩ =>
                  *2 =>
                    *3 =>
                      entries: #[foo1]
                    ⟨id, 1⟩ =>
                      ⟨Rat, 0⟩ =>
                        entries: #[foo2]
    ∀ =>
      pending entries: #[foo4]
⟨True, 0⟩ =>
  pending entries: #[← foo4, ← foo3, ← foo1, ← foo2]

The nodes marked as entries (and not pending entries) are the ones that have been evaluated and returned during the lookup.

Jon Eugster (Sep 12 2024 at 12:00):

cool! I might only be able to review next week again (if nobody else has time before that) but I have your PR on my TODO list :)

Jovan Gerbscheid (Oct 24 2024 at 14:14):

@Jon Eugster, Have you managed to take a look at it yet?

Jon Eugster (Oct 27 2024 at 18:54):

I'm sorry I never finished due to the size of the PR and now I've been on a month of leave, but I'll try to finish reading the PR soon

Jovan Gerbscheid (Dec 11 2024 at 20:15):

@Jon Eugster, I've now updated the PR for the new version of Lean, and did some little improvements

Kim Morrison (Jan 21 2025 at 02:44):

Just :ping_pong: @Jovan Gerbscheid for status on this one. I know we have been extremely slow reviewing this, but it's very nice functionality! Currently there are open review comments from @Eric Wieser, and merge conflicts.

Jovan Gerbscheid (Jan 21 2025 at 17:36):

Thanks for the reminder, I've now implemented/reacted to the comments

Last updated: May 02 2025 at 03:31 UTC