Zulip Chat Archive

Stream: lean4

Topic: Bug in `refine`


Thomas Murrills (Aug 16 2023 at 20:51):

It seems that refine will remove pre-existing goals from the goal list even if they're unsolved:

example : True  True := by -- (kernel) declaration has metavariables '_example'
  constructor
  · refine ?left
  · refine ?right

refine ?left involves running elabTermWithHoles on ?left, and returns (?left, []). I'm not sure if elabTermWithHoles is intended to return all unsolved metavariables in the expression, or merely all new metavariables created during elaboration.

I suspect that some inconsistency in elabTermWithHoles might be the source of this one way or another, since using refine' (which just flips allowNaturalHoles to true in the arguments of elabTermWithHoles) solves the issue, and returns (?left, [?left]) internally:

example : True  True := by
  constructor
  · refine' ?left -- unsolved goals ...
  · refine' ?right -- unsolved goals ...

Intuitively, I'd expect the list of returned mvars to be consistent here whether we allow natural holes in stx or not. Is there a reason it should differ?

Thomas Murrills (Aug 16 2023 at 21:20):

Oh, I see what's happening: in withCollectingNewGoalsFrom, we don't distinguish between natural metavariables created before elaboration and those created during when creating the list of new metavariables, even though we do when logging errors. constructor creates natural metavariables, which then get ignored in the allowNaturalHoles := false case (which only returns synthetic mvars).

Thomas Murrills (Aug 16 2023 at 21:21):

The relevant code:

    ...
    let newMVarIds  if allowNaturalHoles then
      pure newMVarIds.toList
    else
      let naturalMVarIds  newMVarIds.filterM fun mvarId => return ( mvarId.getKind).isNatural
      let syntheticMVarIds  newMVarIds.filterM fun mvarId => return !( mvarId.getKind).isNatural
      let naturalMVarIds  filterOldMVars naturalMVarIds mvarCounterSaved
      logUnassignedAndAbort naturalMVarIds
      pure syntheticMVarIds.toList
    ...

Thomas Murrills (Aug 16 2023 at 21:44):

I suppose the else block should look something like this instead:

      ...
      let naturalMVarIds  newMVarIds.filterM fun mvarId => return ( mvarId.getKind).isNatural
      let syntheticMVarIds  newMVarIds.filterM fun mvarId => return !( mvarId.getKind).isNatural
      let (newNaturalMVarIds, oldNaturalMVarIds) 
        partitionNewAndOldMVars naturalMVarIds mvarCounterSaved
      logUnassignedAndAbort newNaturalMVarIds
      pure (oldNaturalMVarIds ++ syntheticMVarIds).toList
      ...

where, in parallel to filterOldMVars,

def partitionNewAndOldMVars (mvarIds : Array MVarId) (mvarCounterSaved : Nat)
    : MetaM (Array MVarId × Array MVarId) := do
  let mctx  getMCtx
  return mvarIds.partition fun mvarId => (mctx.getDecl mvarId |>.index) >= mvarCounterSaved

This seems to work when testing locally.

If it'd be appreciated, I can open an issue and/or PR on the lean4 repo; just let me know. :)

Scott Morrison (Aug 16 2023 at 22:56):

This does look like a bug. Opening an issue would be great.

I haven't looked at the code yet, but something seems fishy in your diagnosis. At very least newMVarIds is mis-named if a moment later you are extracting a subset of it called oldNaturalMVarIds, and I think getting to the bottom of that might be essential.

If you'd like to open a PR as well:

  • open the issue first
  • write as many tests as you can think of!
  • make sure there is a comment at the call site of partitionNewAndOldMVars explaining why it is necessary there
  • have a doc-string for partitionNewAndOldMVars.

Thomas Murrills (Aug 19 2023 at 07:05):

Ok, lean4#2434 and lean4#2435 (draft PR, but please feel free to upgrade if it passes CI) :)

Thomas Murrills (Aug 19 2023 at 07:07):

(Note: I hope the issue text isn't too long; I tried to summarize it succinctly up front, and relegate the "proof"/argumentation that this is actually what's going on to the Additional Information section. If this is discouraged, I can shorten it.)

Thomas Murrills (Aug 31 2023 at 08:34):

So, update/context: fixing this bug meant not only changing how refine behaves, but changing how elabTermWithHoles behaves. This has—perhaps unsurprisingly—broken some things in mathlib, notably change. As such, lean4#2435 was reverted while we fix change (#6888).

However, some proofs (~10-20, judging by the build output on a custom branch) using just refine will also break, because refine now includes some mvars in its output that it didn't before. Looking at some examples, either the change in goal ordering was what made the proofs break, or the proofs were working around the goal-discarding behavior by jumping back out to a prior focus, e.g.:

  · refine foo -- erroneously discards existing goal `?a`, producing `case b`, `case c`
    · tac1 -- solve `?b`
    · tac2 -- solve `?c`
  tac3 -- solve `?a` out here

Thomas Murrills (Aug 31 2023 at 08:35):

Separately, this also happened to reveal that refine (unmodified by any bugfix) can duplicate goals in the infoview. (This is a cosmetic bug only; they have the same MVarId.) #mwe:

example : True := by
  have : True := ?a
  refine ?a -- two case `a`s in the infoview

The reason I bring up this other bug is that fixing it could break all the same proofs (depending on how we fix it), and I'd like to make the least noise possible, i.e. only fix the proofs once. With lean4#2435 reverted, it might make sense to do both at once, especially since fixing them might involve acting on the same part of the code.

Thomas Murrills (Aug 31 2023 at 08:44):

However, the following example shows how figuring out the "right" behavior might be a bit subtle.

axiom g {α} : α  True -- takes an implicit type argument

example : True := by
  have : ?a := ?b -- such that ?b : ?a
  refine g ?b -- elaborates to include both an explicit and implicit pre-existing mvar (@g ?a ?b)

What should the user expect to see in the infoview in this situation, and why? (Note: currently, it produces case a, case b, case a, case b.)

A. On first glance I'd think they expect to see case b, case a, because ?b is explicit in the refined syntax. Even this can be achieved (and thought of) in two ways, though:

  1. don't return ?a from refine at all, because it existed previously and was not explicitly mentioned; remove ?b from the remaining goal list because it appears in the output of refine
  2. or: re-sort the goals resulting from refine (putting ?b at the top because it appears in the syntax), and remove them from the remaining goal list.

B. But there's a good argument for case a, case b too: ?a and ?b already exist as goals after the have, so refine shouldn't even return them, as they're not new goals created during elaboration. (This fix is the most straightforward to implement.)

C. You can say that ?a and ?b appear in the elaborated expression and therefore should be returned by refine, but to my mind this sets up users of refine for a gotcha and makes the behavior less predictable and transparent.

If I were choosing, I'd go with option B: if someone is explicitly referencing an existing goal, they probably don't need refine to, in effect, simply move it up higher on the goal list. Plus, it's easy to reason about as a user, and lends itself to a clear implementation.

I realize this might be a lot of thought about a minor issue—but refine is pretty well-used in tactic proofs, so maybe it's worth it for it to be solid.

Thomas Murrills (Aug 31 2023 at 08:51):

Issue: lean4#2495

Scott Morrison (Aug 31 2023 at 09:22):

I worry that you are setting yourself up for never getting anything changes but wanting to fix both these issues at once. I'd suggest work out a way to resolve one, do that, then the other. How about assume for now that the behaviour of refine in core is not going to be changed quickly. What is best in that scenario?

Thomas Murrills (Aug 31 2023 at 11:17):

Good point, thanks for the perspective. I was too focused on the fact that the second issue could potentially render the first moot, and probably underestimated how much more discussion and reviewing would have to go into it before we got there. :)

I suppose the path with the least friction and quickest resulting changes here is:

  1. I confirm that all breakages of refine from lean4#2435 are indeed simple "rearrangement issues", and prepare an anticipatory bump PR for Mathlib (on top of #6888, and using the PR release temporarily for CI);
  2. we merge #6888 (after some tests/docs are added)
  3. we restore lean4#2435 (via lean4#2493)
  4. we bump Mathlib's Lean version

(Then we deal with the second issue separately.) Does that sound more realistic?

Kyle Miller (Aug 31 2023 at 12:15):

With your very first example, it seems to me that the main bug is that refine's occurs checks aren't working properly. In refine ?left, I believe this should either be an error or specifically be coded to be a no-op. Note that refine id ?left does correctly trigger an error.

Let's take a look at refineCore:

def refineCore (stx : Syntax) (tagSuffix : Name) (allowNaturalHoles : Bool) : TacticM Unit := do
  withMainContext do
    let (val, mvarIds')  elabTermWithHoles stx ( getMainTarget) tagSuffix allowNaturalHoles
    let mvarId  getMainGoal
    let val  instantiateMVars val
    unless val == mkMVar mvarId do
      if val.findMVar? (· == mvarId) matches some _ then
        throwError "'refine' tactic failed, value{indentExpr val}\ndepends on the main goal metavariable '{mkMVar mvarId}'"
      mvarId.assign val
    replaceMainGoal mvarIds'

Notice that there is special handling for whether the val is mvarid itself, which is to skip assigning mvarId. But, it still does replaceMainGoal with the wrong mvarIds'!

I would suggest this simple change to fix refine:

def refineCore (stx : Syntax) (tagSuffix : Name) (allowNaturalHoles : Bool) : TacticM Unit := do
  withMainContext do
    let (val, mvarIds')  elabTermWithHoles stx ( getMainTarget) tagSuffix allowNaturalHoles
    let mvarId  getMainGoal
    let val  instantiateMVars val
    unless val == mkMVar mvarId do
      if val.findMVar? (· == mvarId) matches some _ then
        throwError "'refine' tactic failed, value{indentExpr val}\ndepends on the main goal metavariable '{mkMVar mvarId}'"
      mvarId.assign val
      replaceMainGoal mvarIds'

This causes refine ?left to be a no-op.

Kyle Miller (Aug 31 2023 at 12:21):

There's the separate issue of duplicating goals in the example : True := by have : True := ?a; refine ?a example.

For this, could you explain why the problem in withCollectingNewGoalsFrom isn't fixed by just changing

    let newMVarIds  if allowNaturalHoles then
      pure newMVarIds.toList
    else
      let naturalMVarIds  newMVarIds.filterM fun mvarId => return ( mvarId.getKind).isNatural
      let syntheticMVarIds  newMVarIds.filterM fun mvarId => return !( mvarId.getKind).isNatural
      let naturalMVarIds  filterOldMVars naturalMVarIds mvarCounterSaved
      logUnassignedAndAbort naturalMVarIds
      pure syntheticMVarIds.toList

to

    let newMVarIds  if allowNaturalHoles then
      pure newMVarIds.toList
    else
      let naturalMVarIds  newMVarIds.filterM fun mvarId => return ( mvarId.getKind).isNatural
      let syntheticMVarIds  newMVarIds.filterM fun mvarId => return !( mvarId.getKind).isNatural
      let naturalMVarIds  filterOldMVars naturalMVarIds mvarCounterSaved
      logUnassignedAndAbort naturalMVarIds
      pure syntheticMVarIds.toList
    let newMVarIds  filterOldMVars naturalMVarIds mvarCounterSaved

?

It seems odd calling it newMVarIds if they aren't new, so maybe it's an oversight that they're not being filtered?

Thomas Murrills (Aug 31 2023 at 22:59):

Re: the occurs check, I think it's functioning as intended, even if I also wouldn't necessarily intend it that way! The test file refineOccursCheck.lean specifically ensures this:

theorem ex1 : True := by
  refine ?a
  refine ?a
  refine ?a
  exact True.intro

(This is intended to succeed, and shows that it's not the source of the bug in general.)

EDIT: I guess this behavior would be achieved by by a no-op too. But if you expect elabTermWithHoles to return all holes in the expression, then you expect it to also include the main goal in mvarIds', so we don't replace the main goal with the "wrong" list. It's internally consistent, at least.

Thomas Murrills (Aug 31 2023 at 22:59):

The main bug (re: refine ?left) is definitely about losing track of existing natural mvars, I believe; see the examples in the additional information section of lean4#2434, as well as the test file refinePreservesNaturalMVars.lean in lean4#2435, which test with natural mvars that are not the main goal. (Note that newMVarIds (which is just what withCollectingNewGoalsFrom calls all of the mvars it collects, for some reason) becomes simply syntheticMVarIds.toList in the allowNaturalHoles := false case. The natural MVarIds are lost.)

Thomas Murrills (Aug 31 2023 at 23:10):

Kyle Miller said:

There's the separate issue of duplicating goals in the example : True := by have : True := ?a; refine ?a example.

For this, could you explain why the problem in withCollectingNewGoalsFrom isn't fixed by just ...

Indeed, simply adding the line let newMVarIds ← filterOldMVars newMVarIds mvarCounterSaved fixes this! This is option B. that I mentioned, and that's what I was talking about when mentioning it was easier to implement. :) It would probably break several things, though; I noticed it already breaks one test.

I started exploring this on the (very WIP) draft PR lean4#2496 last night. Note that I take the "cautious" route of adding an optional toggle for this behavior to elabTermWithHoles and withCollectingNewGoalsFrom (onlyNewGoals := false) so as to not break existing other uses of that function in mathlib. Plus, sometimes you might want to collect all the holes that appear in an expression when using elabTermWithHoles, not just the new ones. (That would also mean we should probably rename withCollectingNewGoalsFrom to withCollectingGoalsFrom.)

I'd need to look at the existing uses in mathlib to see if we ever actually do want to collect all goals, though. If they all should use only new goals, we could just change the behavior (and that would probably make sense for defs named elabTermWithHoles and withCollectingNewGoalsFrom).

Kyle Miller (Aug 31 2023 at 23:46):

My understanding of docs#Lean.Elab.Tactic.withCollectingNewGoalsFrom is that it's meant for collecting new goals from an expression (hence its name). That's what's useful if you write a tactic that elaborates an expression and you need to create some side-goals for any remaining metavariables that haven't already been accounted for. The name of elabTermWithHoles at least isn't promising that it returns all holes -- and its definition is a one-liner calling withCollectingNewGoalsFrom, so the intent seems clear. It could also have been called elabTermCollectingNewGoals, and that would be what I'd want if I were to implement refine from scratch.

I find it very hard to support the argument that elaborating an expression should be able to capture pre-existing goals to create new goals. It seems really hard to reason about goal order if you allow this. (I think that's what you're saying in C.) It's also breaking what I've assumed is the invariant that each goal in the goal list has multiplicity one.

Re the occurs check: allowing refine ?foo with ?foo being the current goal is sort of an absurd thing to do, so it might be worth looking into the git blame to see why this was special cased.

Kyle Miller (Aug 31 2023 at 23:49):

Re mathlib, I know that every time I've used elabTermWithHoles I had thought it was only returning metavariables that otherwise weren't going to be handled by anyone (i.e., things that should be filled in by the user as a goal).

Thomas Murrills (Sep 01 2023 at 00:00):

That sounds entirely reasonable to me! It's what I expected to happen until I read through the code too; I'd be happy for it to change. It would solve both original issues in one fell swoop (lean4#2434 and lean4#2495) (and might mean that change doesn't need pre-fixing via #6888).

Note that we could then replace the whole block

   let newMVarIds  if allowNaturalHoles then
      pure newMVarIds.toList
    else
      let naturalMVarIds  newMVarIds.filterM fun mvarId => return ( mvarId.getKind).isNatural
      let syntheticMVarIds  newMVarIds.filterM fun mvarId => return !( mvarId.getKind).isNatural
      let naturalMVarIds  filterOldMVars naturalMVarIds mvarCounterSaved
      logUnassignedAndAbort naturalMVarIds
      pure syntheticMVarIds.toList

with just

   let newMVarIds  filterOldMVars newMVarIds mvarCounterSaved
   unless allowNaturalHoles do
      let naturalMVarIds  newMVarIds.filterM fun mvarId => return ( mvarId.getKind).isNatural
      logUnassignedAndAbort naturalMVarIds

which is rather nice. :) I'll try this out and see how this affects mathlib on a draft lean PR.

Mario Carneiro (Sep 01 2023 at 00:45):

Kyle Miller said:

Re the occurs check: allowing refine ?foo with ?foo being the current goal is sort of an absurd thing to do, so it might be worth looking into the git blame to see why this was special cased.

I specifically recall getting some nasty stack overflows when playing around with refine applied to existing metavariables, so this was likely a bug fix

Mario Carneiro (Sep 01 2023 at 00:46):

The model of named metavariables is actually kind of broken, because they are not hygienic and you can get really surprising results because of this

Mario Carneiro (Sep 01 2023 at 00:49):

The stack overflow is still present:

example : True := by
  refine ?a
  exact id ?a

Kyle Miller (Sep 01 2023 at 01:09):

Interesting -- looks like exact doesn't actually have an occurs check (it only checks that there are no new metavariables remaining in the assigned expression). I've found the lack of occurs checks in meta code to be a bit worrisome, but I wouldn't have expected exact to have an issue.

Kyle Miller (Sep 01 2023 at 12:23):

I filed an issue for this: lean4#2504

Thomas Murrills (Sep 02 2023 at 20:11):

Ok, after exploring the "single fix" option (just filter all old mvars out) and fixing the resulting breakages in mathlib, my thoughts are that this is actually the smoother and least-friction option, despite being the bigger change! In particular

  • this streamlines the lean4 code
  • this resolves both lean4#2495 and lean4#2434 at once
  • the breakages to mathlib are minimal, and all come from refine or convert via the same phenomenon (goals "tunneling through" to appear in the goal list after a refine or convert, when in fact they were created and unsolved far earlier)
  • notably, the fix to change, #6888, is now wholly unnecessary (in fact, all #6888 did was filter out the old mvars in change)

Thomas Murrills (Sep 02 2023 at 20:12):

The mathlib PR #6928 is the anticipatory bump PR, and fixes all those breakages. There were 8 breakages arising from this change in total.

Thomas Murrills (Sep 02 2023 at 20:56):

The draft lean4 PR for this change is lean4#2502.

Thomas Murrills (Sep 02 2023 at 21:15):

(In writing the description, I noticed that specialize, which uses elabTermWithHoles, is also missing an occurs check and produces a stack overflow: lean4#2506)

Scott Morrison (Sep 03 2023 at 23:23):

@Thomas Murrills, would you mind closing the non-preferred PRs on lean4? At present one needs to read lots of context to work out what to look at and not look at.

Thomas Murrills (Sep 03 2023 at 23:24):

No problem, sorry, I assumed draft PRs were irrelevant unless noted! lean4#2502 should be self-contained (please let me know if I left some context out).

Scott Morrison (Sep 03 2023 at 23:29):

Draft PRs are certainly easy to avoid looking at. The problem is that you had 3 draft PRs, and I wanted to look at just one. :-)

Scott Morrison (Sep 03 2023 at 23:30):

@Thomas Murrills, is lean4#2434 meant to be closed?

Thomas Murrills (Sep 03 2023 at 23:32):

No, it was automatically closed when the original PR in this saga was merged, and was not reopened when that PR was reverted. (I wasn’t able to reopen it.)

Thomas Murrills (Sep 03 2023 at 23:32):

(I also figured it would be closed again soon in any case, and might not be worth the noise to reopen and re-close. :) )

Scott Morrison (Sep 03 2023 at 23:33):

I've reopened. :-) Your PR is confusing if it is referring to fixing already closed issues.

Thomas Murrills (Sep 03 2023 at 23:34):

Ah, right—that was the context I left out! I knew there was something. :)

Thomas Murrills (Sep 04 2023 at 00:02):

I’ve changed it to awaiting-review to indicate that I’m done with my work on it; just checking, is this a correct use of the label on the lean4 repo?

Thomas Murrills (Sep 04 2023 at 00:02):

Also, should I make it a full-fledged PR at this point, or should I wait for a sort of “pre-review” to finish here before doing so?

EDIT: I’ll wait at least until I do what’s suggested below. :)

Scott Morrison (Sep 04 2023 at 00:02):

@Thomas Murrills, re: the mathlib PR #6928. If you merge all of that over to lean-pr-testing-2502 (which is an automatically generated branch), then hopefully it will report back to the Lean 4 repository automatically that it builds-mathlib, rather than breaks-mathlib.

Scott Morrison (Sep 04 2023 at 00:02):

(This CI integration is still a work in progress, sorting out kinks, sorry.)

Scott Morrison (Sep 04 2023 at 00:02):

Thomas Murrills said:

I’ve changed it to awaiting-review to indicate that I’m done with my work on it; just checking, is this a correct use of the label on the lean4 repo?

Yes!

Scott Morrison (Sep 04 2023 at 00:04):

(You might even change #6928 so it is uses that branch.)

Thomas Murrills (Sep 04 2023 at 00:55):

Scott Morrison said:

(This CI integration is still a work in progress, sorting out kinks, sorry.)

Don’t worry at all—after all, progress can only be made by putting things in practice! :) Btw, I really appreciate the work you’ve done/are doing on CI and automation—I certainly wouldn’t have been able to make these PRs without the lean4 pr release framework. Or, at least, it would have taken a lot longer, and been a lot more work. :)

Thomas Murrills (Sep 04 2023 at 00:58):

I’ve merged the changes into lean4-pr-testing-2502 (but have to run and can’t change the mathlib4 PR branch right now); it’s currently building. :)

Scott Morrison (Sep 04 2023 at 02:33):

Hooray, the builds-mathlib label was bestowed automatically. :-)

Thomas Murrills (Sep 05 2023 at 23:03):

Shall I convert lean4#2502 from a draft PR to a normal PR now that the reviews have been addressed?

Scott Morrison (Sep 06 2023 at 00:05):

Yes, please.

Thomas Murrills (Sep 06 2023 at 00:23):

Done. :)

Thomas Murrills (Sep 20 2023 at 22:47):

I'm trying to keep the anticipatory bump PR #6928 up to date, but I'm hitting a weird error during the "check the cache" stage after build proper:

Run lake exe cache get
  lake exe cache get
  # We pipe the output of `lake build` to a file,
  # and if we find " Building Mathlib" in that file we kill `lake build`, and error.
  lake build > tmp & tail --pid=$! -n +1 -F tmp | (! (grep -m 1 " Building Mathlib" && kill $! ))
  shell: /usr/bin/bash -e {0}

No files to download
Decompressing 3776 file(s)
unpacked in 1736 ms
[1/5] Building Mathlib.Mathport.Rename
Error: Process completed with exit code 1.

What's going on here?


Last updated: Dec 20 2023 at 11:08 UTC