Zulip Chat Archive

Stream: general

Topic: Tracking hypotheses across tactics

Jannis Limperg (Jun 01 2020 at 20:44):

I'm writing a tactic that iterates over a set of hypotheses hs. At each step, the tactic takes the first hypothesis out of hs, processes it and returns a modified hs' as input for the next step.

This all works fine except when a processing step changes the unique names of hypotheses in the context, as all revert-based tactics do. This invalidates hs, which now contains local constants that don't exist any more. The revert-based tactics also don't report which unique names were changed, so I can't easily update hs. Is there any established way to deal with this?

Until now, I've muddled through by using pretty names to identify hypotheses, making sure that they are unique and that I update hs whenever any of the unique names may have changed. But that's very error-prone and would be quite painful for my current application.

Jason Rute (Jun 01 2020 at 22:38):

I think I'm still unclear on your algorithm. You take a hypothesis, h, perform some tactic with it, and then relabel the hypothesis as "handled", by changing the name, correct? The issue is that some of the tactics you perform could change the hypothesis in the goal. Without knowing more (and without being an expert on this sort of thing), I could see a few approaches:

  • Gather all the hypothesis into a list, and then looping over that list, applying each to the goal using your method. If the hypotheses in the goal change that is fine (as long as you don't need to "handle" any new hypotheses that arise in that loop).
  • Using the hash of the local variable (or maybe the hash of it's type?) to keep track of the hypotheses you have processes. Keep them in some data structure (I think Lean has red-black trees which work like immutable sets). Then you can after each tactic application update that set to include the hypothesis you processed.

Jason Rute (Jun 01 2020 at 22:39):

I'm not a tactic writing expert, so I hope others chime in.

Jason Rute (Jun 01 2020 at 22:41):

For my second suggestion, I'd look into Lean's expr_set, although I've never used it personally.

Mario Carneiro (Jun 01 2020 at 22:49):

There is no very good solution to this in existing tactics. rcases has to deal with this because it has to sometimes do cases on an earlier argument that changes a later argument that must itself be cased, and lookup fails in that case. Currently I'm doing some heuristic matching using the names of the hypotheses but I would like to have something more robust here.

Mario Carneiro (Jun 01 2020 at 22:52):

Probably the most robust approach would be to modify revert to return a map of changed unique names that can be applied to any later data structures used in tactics (like rcases's todo list)

Mario Carneiro (Jun 01 2020 at 22:55):

well, to be more precise revert would return a list of "stolen" unique names (that have now become de Bruijn variables), and the un-revert tactic (intron), used after whatever hypothesis tactic is done doing its thing, would take this list and zip it with the newly created local constants.

Mario Carneiro (Jun 01 2020 at 22:56):

However, this requires some changes to C++ because cases_core doesn't provide this level of detail IIRC

Jannis Limperg (Jun 01 2020 at 23:02):

Jason, there are solid ideas, but I don't think they'll work out:

  • Even if we just loop through a list of hypotheses, any iteration of the loop can change the unique names of hypotheses we have yet to process (unless we specifically design the loop so that this can't happen). Example:
example (x y : ) (h₁ : x = y) (h₂ : y = y) : true :=
  (do h₁  get_local `h₁,
      h₂  get_local `h₂,
      [h₁, h₂].mmap subst

This will fail with "unknown hypothesis h₂" because after subst h₁, h₂ is a different local constant with, in particular, a different unique name.

  • Tracking hyps by type + pretty name would be a better heuristic than tracking them by pretty name only. Still, there's nothing stopping anyone from having two hypotheses with the same name and type in the context.

Mario, thank you even though you're confirming my fears. ;) I was really hoping you'd come along with a workaround. I'll look into what would need to change to track this information, but that sounds like a major operation.

Mario Carneiro (Jun 01 2020 at 23:14):

There is another problem with the revert/intro approach to hypothesis tracking: It is possible that a hypothesis tactic can change the goal in more radical ways, for example if you want to simp at h and so you revert h : true |- false to |- true -> false, and then simp is clever and simplifies the new goal to false, and then your intron 1 will fail. I'm not clear on what all the tactics are doing to avoid this problem

Mario Carneiro (Jun 01 2020 at 23:16):

We are sort of working on the assumption that the hypothesis tactic doesn't mangle the pis that have been introduced in the goal. If they get reordered or simplified to something other than a pi then it will be very hard to maintain identity of hypotheses across this point

Jannis Limperg (Jun 01 2020 at 23:29):

Seems like simp at sidesteps this problem by not doing the revert/intro dance. Instead, the primitive simplify takes an expression and returns a simplified expression along with a proof that they're equal. Unfortunately, these are then fed into replace_hyp, which again messes up the unique names.

In general, I feel like it's okay to demand that the tactic between a revert and an intron doesn't change the number/order of binders -- this seems like it could usually be arranged.

Jannis Limperg (Jun 01 2020 at 23:35):

(Note to self: The sketched design where revert returns a list of unique names and intron takes that list again is unsafe in that it allows us to get different hyps with the same unique name. I imagine Lean wouldn't be too happy about that. However, we could have a safer primitive, reverting, which acts as a bracketing operator. That's also less powerful, though.)

Mario Carneiro (Jun 01 2020 at 23:44):

Note that it is impossible to not mess up unique names here. A hypothesis with a given unique name can only have one type. If you are replacing it with a different hypothesis, it has to have a different unique name, and similarly for any dependent hypotheses. For unaffected (non-dependent) hypotheses coming after the one being rewritten, it is possible to avoid changing their unique names by not reverting them, and I think usually revert only reverts the minimum number of hypotheses necessary. But this does result in reordering of hypotheses.

Mario Carneiro (Jun 01 2020 at 23:47):

The sketched design where revert returns a list of unique names and intron takes that list again is unsafe in that it allows us to get different hyps with the same unique name.

My suggestion was that revert returns a list like [h1, h2, h3] with unique names of now reverted hypotheses, and intron would take this list as input and produce the list/map [(h1, h1'), (h2, h2'), (h3, h3')] containing the newly allocated unique names for the introduced variables. It would not reuse the old unique name as that could things to break (I'm not exactly sure what, though)

Jannis Limperg (Jun 01 2020 at 23:55):

I see. If the unique names are tied to types, we obviously can't hope to keep them stable. Carrying around a map of renamings then does seem like the next best thing. For revert/intro, I believe this could already be implemented in user space. Any primitives that operate on hypotheses would have to be modified to return such a map, as you say. And then this map would have to be passed around and updated by everything. (That sounds like it might be a job for a wrapper monad.)

Mario Carneiro (Jun 02 2020 at 00:56):

I was rather thinking that the map would be applied to any continuation data used by the tactic (like the remainder of hypotheses being processed) and discarded, rather than building up a collection of renaming substitutions that itself requires maintenance and filters all future tactic calls to things like get_local

Mario Carneiro (Jun 02 2020 at 01:01):

if the continuation data is buried in the call stack then this is more complicated, but I think you can still return a name update map that is the composition of all name updates that have happened so far so that you can apply it to any local data. (That said this is basically a wrapper monad.) If you have an explicit continuation then you don't need to do this.

Jannis Limperg (Jun 02 2020 at 02:00):

Fair. I meant that any tactics built on revert/intro or similar tricks would have to pass on the renaming map. This would require a fairly large number of 'plumbing' changes.

@Sebastian Ullrich Does Lean 4 have any facility along these lines that we could copy to minimise divergence?

Last updated: Dec 20 2023 at 11:08 UTC