Zulip Chat Archive

Stream: general

Topic: Case tags


Jannis Limperg (Apr 21 2020 at 17:08):

I'm currently trying to understand how case tags work, and they are rather more complicated than expected. There seem to be two ways to tag a goal as a 'case', _case.n (where n is a natural number; what's this for?) and _case_simple. Both seem to be in use. When the case tactic tries to find a case that matches the name given by the user, it inspects the term constructed so far in some unholy fashion -- but only for _case_simple tags. Does anyone know why all this is necessary? I would have expected case to just look for a goal with a matching name.

Sebastian Ullrich (Apr 21 2020 at 20:32):

The "interesting" code is from my original implementation of case, before Leo added any tag management. As the comment at the use site of find_case explains, it is not being used for actually locating the goal anymore. I don't know/remember much about the tags themselves.

Jannis Limperg (Apr 22 2020 at 14:18):

Thanks, that helps a lot! Perhaps a refactoring to remove _case_simple is in order.

Jannis Limperg (Apr 22 2020 at 15:40):

I think I broadly understand what's happening now. Documenting it here (for myself if noone else):

  • A tag is a list of names.
  • induction and cases (and variants) tag each goal they generate with the tag ['_case_simple, cname] where cname is the name of the constructor (fully qualified).
  • with_cases t runs the tactic t. Then, for any new goal g produced by t, it reverts any hypotheses of g that did not occur in the original goal. Finally, g's tag gets prefixed with '_case.n where n is the number of hypotheses that were reverted.
  • case name : x, y, ... first finds a goal whose tag matches name. It then proceeds according to whether the goal's tag starts with '_case.n or '_case_simple.
    • For '_case.n, it introduces n arguments, using the names x, y, ... (if available).
    • For '_case_simple, it determines the number of arguments of the constructor which generated this case. (This is what the linked find_case does, as far as I can tell.) It then renames these arguments to x, y, ... (if available).

Bryan Gin-ge Chen (Apr 22 2020 at 15:41):

A PR to add this info in the form of doc strings to leanprover-community/lean would be greatly appreciated!

Jannis Limperg (Apr 22 2020 at 15:49):

Can do, but at the moment I want to redesign the whole thing anyway. Besides being ugly, the find_case hack leads to such tight coupling between induction/cases and case that I can't use case tags with an alternative induction tactic that I'm working on.

Jannis Limperg (Apr 23 2020 at 00:31):

New design:

  • A case tag has the form ['_case, cname_0, ... cname_m, '_arg.x_0, ..., '_arg.x_n, '_end_case] where the cname_i are names associated with the case (usually just one constructor name) and the x_i are the unique names of those local hypotheses which were added for the case.
  • cases, induction and friends, when called on a goal with tag t, generate one goal per constructor and prepend an appropriate case tag to t.
  • case c : x y z... { t } searches all goals for one whose tag includes a case tag for c. (The search remains fuzzy, so case zero finds a goal tagged nat.zero.) If it finds exactly one such goal, it renames the _arg hypotheses to x, y, z, ... (if present) and runs t on the goal.
  • Some other tactics can also set tags if tags are enabled. As far as I can tell, these are always 'simple' tags, meaning the tag is just one or more names which are associated with the case. For example, apply can set tags for generated subgoals if the arguments that generate these subgoals are named in the applied lemma.
  • with_cases t runs t with tags enabled. For each new goal g, it then generates a case tag ['_case, cname_i, ..., '_arg.x_i, ..., '_end_case] where
    • the cname_i are the associated names of any case tags or simple tags of g;
    • the _arg.x_i are those hypotheses of g that did not appear in the original goal.
      This ensures that with_cases can be used both with cases/induction (though there isn't really any reason to do so any more) and with tactics like apply that generate simple tags. The latter is used occasionally in core, to apply custom induction principles.

This design should cover all current uses of case tags. It doesn't require any C++ changes. It is backwards-incompatible for people setting tags manually, but as far as I can see, there is only direct use of set_tag in all of Github (the wlog tactic in mathlib, which I can fix).

Any concerns?

Mario Carneiro (Apr 23 2020 at 00:42):

I don't like all this layered information in a list name. Why not just use an expr?

Jannis Limperg (Apr 23 2020 at 02:19):

I'd also prefer more structured data. However, that would require changes to the C++ (tags are part of the tactic state), which I'd rather avoid.

Mario Carneiro (Apr 23 2020 at 03:48):

I don't think the C++ needs to do much? Just associate an expr to the goal instead of a list name. I don't think these tags are used much on the C++ side

Jannis Limperg (Apr 24 2020 at 00:47):

An expr would be doable, yes. However, I don't understand what this would buy us. We'd still need to define which exprs are valid tags and what their semantics are, right? Could you maybe elaborate on the design you were thinking of?

Mario Carneiro (Apr 24 2020 at 00:58):

It leaves the door open for other uses of tags, like in refine_struct, without constraining the form of the metadata

Mario Carneiro (Apr 24 2020 at 01:00):

Yes, for case tags we would still need roughly the same structure, although I think the way it is currently done is a bit idiosyncratic. What should case tags for rcases look like, for instance?

Mario Carneiro (Apr 24 2020 at 01:02):

Also, the hypotheses pointing to unique names need to be organized in a way so that they can be updated without regard for the surrounding data structure, for things like revert

Jannis Limperg (Apr 24 2020 at 01:53):

I agree that it would be nice to have more flexibility in the metadata, to support other uses of tags. However, there needs to be some structure to the data because Lean needs to know how to print a tag. How about this: a tag is a list (name x list name x expr). The first component identifies what kind of tag this is, e.g. _case or _field. The second component gives names that should be displayed in the goal state (['c, 'd] becomes "case c, d"). The third component is arbitrary metadata. This certainly covers all current uses and imposes only minimal structure. The printing is still a bit restrictive, but I don't immediately see a way around that.

To support rcases, one would need to identify cases by a sequence of names (rather than one name): sum.inj/sum.inr could be one case if we split a nested sum. However, I'm not sure that this is worth the extra complexity.

I'm afraid I don't understand your remark about the unique names. How should revert interact with them?

Mario Carneiro (Apr 24 2020 at 02:07):

There are lots of tactics that use revert/intro to "rewrite in hypotheses", such as rw at h. This changes the unique name of the hypothesis, and the case tag needs to be updated to match. Even cases itself can cause this behavior, which causes some problems in rcases when it loses track of which hypotheses need to be case split. The information about names needs to be stored in such a way that it can be updated by tactics that don't care if it's a case tag or otherwise

Mario Carneiro (Apr 24 2020 at 02:08):

Why are there multiple constructor tags on a case tag if not for this?

Mario Carneiro (Apr 24 2020 at 02:09):

I think you can store the _case/_field marker in the head of the expr

Mario Carneiro (Apr 24 2020 at 02:10):

I don't know what the top level list is for in your design

Jannis Limperg (Apr 24 2020 at 16:54):

Mario Carneiro said:

There are lots of tactics that use revert/intro to "rewrite in hypotheses", such as rw at h. This changes the unique name of the hypothesis, and the case tag needs to be updated to match. Even cases itself can cause this behavior, which causes some problems in rcases when it loses track of which hypotheses need to be case split. The information about names needs to be stored in such a way that it can be updated by tactics that don't care if it's a case tag or otherwise

Okay, this throws quite a spanner into the works. Some ways we could deal with this:

  • Add an additional list of names to each tag containing unique names that the tag refers to. Any tactic that uses the revert/intro trick would then have to update these unique names accordingly. This would be the correct thing to do, but it sounds impractical to me. Afaik, most third-party tactics don't even propagate the current tags correctly, so we'd get a bunch of issues where case gets mysteriously broken by tactics that forgot to update the unique names.
  • Keep the current design where the case tag only says "the last N hypotheses were introduced by me". This works just as well in the common case where the induction/cases/... is immediately followed by a sequence of case .... It breaks more easily -- and in a way that's hard for users to understand -- if anything happens between the induction and the case, e.g. an intro (though situations like this can be handled by with_cases). That leads me to consider...
  • Drop the renaming functionality of case entirely. If we can't implement it in a robust way, perhaps it would be better to direct users to induction with ... instead. This provides the same functionality, but doesn't suffer from the issue that we have to pass information along arbitrary chains of tactics. Tags could then return to being just a list of names and case would just look for a tag with matching names. However, this design would break a bunch of stuff and is somewhat less convenient in the common case.

Mario Carneiro said:

Why are there multiple constructor tags on a case tag if not for this?

Fair. case could be extended such that case c_1 ... c_n looks for a tag which includes the c_i in the same order.

Mario Carneiro said:

I think you can store the _case/_field marker in the head of the expr

Sure. I find it a little cleaner to have it outside so that the expr really is arbitrary data that will only ever be inspected by tactics that deal with that particular sort of tag. But this is more aesthetic preference than anything.

Mario Carneiro said:

I don't know what the top level list is for in your design

This is so that goals can have multiple tags, e.g. a goal could be tagged with a field tag and a case tag if we do a case split while defining a field. I'm also thinking of potential future uses of tags (e.g. grouping related goals or hiding goals) where it would make sense for a goal to have multiple tags.

Jannis Limperg (May 11 2020 at 16:36):

The PR is finally here.

I've gone for a middle of the road design which leaves the internal representation of goal tags alone and is mostly compatible with the current behaviour of cases/induction/apply/with_cases/case (so users shouldn't need to care in most cases [pun not intended]). I didn't want to do a more invasive redesign because (a) it's not clear what exactly that redesign should be and (b) all this is going to get killed by Lean 4 anyway.


Last updated: Dec 20 2023 at 11:08 UTC