Zulip Chat Archive

Stream: lean4

Topic: Running out of heartbeats


Terence Tao (Oct 25 2023 at 05:07):

I'm encountering a new problem in my formalization project in that Lean is beginning to complain about running out of heartbeats and returning weird errors. I can temporarily get around this by setting the max_heartbeat parameter to something large like 400000, and also splitting my project into many files each of which are compiled separately seems to help, but I am wondering if I am somehow coding extremely inefficiently (certainly Lean is now taking quite some time to compile each file). Is there a way to detect which portions of a Lean proof are particularly resource intensive?

Kyle Miller (Oct 25 2023 at 05:10):

Are you familiar with locally setting parameters? For example, set_option maxHeartbeats 400000 in theorem myResult ... lets you set it theorem-by-theorem. You can also put set_option inside of a section ... end block to localize it.

Kyle Miller (Oct 25 2023 at 05:12):

(I'll mention also that there's a #where command you can use to see the current values of all set options, in case that's helpful.)

Terence Tao (Oct 25 2023 at 05:14):

Okay... but is the standard thing to do just to increase the maxHeartbeats parameter for any theorem that Lean starts barfing at? I'm wondering if its a sign that I'm relying too much on overpowered tactics or something.

Kyle Miller (Oct 25 2023 at 05:16):

You can also turn on profiler messages with set_option trace.profiler true to see what might be taking the most time

Kyle Miller (Oct 25 2023 at 05:17):

maxHeartbeats is not ideal... maybe @Matthew Ballard has tips, since he seems to have been leading the charge to eliminate them from mathlib4

Kyle Miller (Oct 25 2023 at 05:20):

If you're able to localize the slow theorem and you're able to share it here, maybe it's possible to give more precise advice, even if it's not a #mwe

Terence Tao (Oct 25 2023 at 05:22):

Okay... but it's a mess, and it relies on other results in my project, so it won't compile standalone

lemma iterated_rev {n k : } {s :   } (h2 : attainable n s) (h3 : k+3  n) (h4 : 0 < k): |s n|^((n:)⁻¹)  max (((2:) * n)^((Real.log (n-1) - Real.log (n-k-1))*(n-k)/(2*k)) * |s k|^((k:)⁻¹)) (( (2:)*n)^((Real.log (n-1) - Real.log (n-(k+1:)-1))*(n-(k+1))/(2*(k+1))) * |s (k+1)|^(((k+1:):)⁻¹)) := by
  have : n > (0:) := by norm_cast; linarith
  by_cases hsn: s n = 0
  . simp [hsn]
    left
    rw [zero_rpow]
    all_goals {positivity}
  let s' := fun k  s (n - k) / s n
  have bound := iterated (attainable_reflect h2 hsn) h3
  simp [attainable_zero_eq_one h2] at bound 
  rw [abs_inv, abs_div, abs_div, div_rpow, div_rpow, inv_rpow, mul_div, mul_div] at bound
  . rcases bound with bound | bound
    . left
      have h0 : n - (n-k) = k := by
        suffices : k + (n-k) = n
        . nth_rewrite 1 [<-this]
          apply Nat.add_sub_cancel
        apply Nat.add_sub_of_le
        linarith
      have h5 : 0 < (n:)-k := by
        simp; norm_cast; linarith
      have : 0 < ((n:)-k)/k := div_pos h5 (show (0:) < k by norm_cast)
      rw [le_div_iff, inv_mul_eq_div, <-rpow_sub, <- rpow_le_rpow_iff _ _ this, <-rpow_mul, mul_rpow, <-rpow_mul, <-rpow_mul, h0] at bound
      . convert bound using 2
        . field_simp; ring
        . congr 1; field_simp
        congr 1; field_simp
      all_goals {positivity}
    right -- could possibly shorten this proof by judicious use of swap, let, all_goals, and try
    have h0 : n - (n-(k+1)) = k+1 := by
      suffices : k+1 + (n-(k+1)) = n
      . nth_rewrite 1 [<-this]
        apply Nat.add_sub_cancel
      apply Nat.add_sub_of_le
      linarith
    have h5 : 0 < (n:)-(k+1) := by
      simp; norm_cast; linarith
    have : 0 < ((n:)-(k+1))/(k+1) := div_pos h5 (show (0:) < k+1 by norm_cast; linarith)
    rw [le_div_iff, inv_mul_eq_div, <-rpow_sub, <- rpow_le_rpow_iff _ _ this, <-rpow_mul, mul_rpow, <-rpow_mul, <-rpow_mul, h0] at bound
    . convert bound using 2
      . field_simp; ring
      . congr 1; field_simp
      congr 1; field_simp
    all_goals {positivity}
  all_goals {positivity}

It was the second convert that triggered the error, but I don't know whether that's because convert is particularly expensive. I'm also using an experimental extension to positivity that could possibly be the culprit.

Mario Carneiro (Oct 25 2023 at 05:27):

protip, the {} after all_goals is not necessary

Mario Carneiro (Oct 25 2023 at 05:28):

Unfortunately it most likely does need to be a MWE (i.e. fully working with all imports) to debug this, meaning you may need to extract or stub out all the definitions from your project (or link to your project and get someone else to do the hatchet job)

Kyle Miller (Oct 25 2023 at 05:29):

If you do set_option trace.profiler true and then increase the maxHeartbeats until the proof works, it might be interesting to see which tactics are taking the most time. Maybe the second convert is merely unlucky to be the last straw, or maybe it really is expensive. (There's nothing that looks like it ought to be particularly slow from just a cursory read-through.)

Mario Carneiro (Oct 25 2023 at 05:29):

It's also probably not convert itself that is expensive, but rather some definitional equality goal or typeclass inference problem it is kicking up

Terence Tao (Oct 25 2023 at 05:31):

Do heartbeats reset to zero at the end of each theorem? This particular result is compiling with heartbeats = 400000, so one option I have is simply to set it equal to this high value for this theorem alone and not try to figure out what is going wrong

Kyle Miller (Oct 25 2023 at 05:35):

Yes, it does. Mathlib also has a count_heartbeats in command to generate a reasonable maxHeartbeats count automatically -- just put it right before your theorem. (The reason for this tool, rather than just not having a maxHeartbeats, is that for a large library like mathlib, these act as high-water marks to detect when certain particularly slow lemmas gave gotten unacceptably slower due to unrelated changes.)

Terence Tao (Oct 25 2023 at 05:35):

The trace profiler is reporting how many seconds it takes to prove the lemma (about 40 seconds right now), as well as the linter (0.3 s), but not any finer detail than that. Would one have to try to break up the lemma into sublemmas to isolate the source of the high compute cost?

Kevin Buzzard (Oct 25 2023 at 05:38):

You can set_option trace.profiler true in before the statement and then you'll get a detailed breakdown of times taken by many things including all tactics. You can unfold the trace further by clicking on the little triangles.

Kevin Buzzard (Oct 25 2023 at 05:38):

Just chase the big number.

Terence Tao (Oct 25 2023 at 05:46):

Is it normal for the "count_heartbeats" command to throw up an "unexpected token" error while also reporting the number of heartbeats used?

Kyle Miller (Oct 25 2023 at 05:47):

Is there an in after count_heartbeats? It sounds like it might be missing -- if it is, then it will complain about a syntax error but likely continue to operate.

Terence Tao (Oct 25 2023 at 05:49):

Either set_option trace.profiler true in or count_heartbeats in in the line before my lemma will report an unexpected token error while also reporting data. But I don't see how the triangles Kevin says should be there to unfold the trace further. This is what I see:

image.png

Terence Tao (Oct 25 2023 at 05:50):

Oh wait, I see the triangles now.

Kyle Miller (Oct 25 2023 at 05:52):

Oh, that syntax error is because docstrings are part of the lemma command syntax (I am aware of this, but I often forget and re-surprise myself). The count_heartbeats in should come before the /-- (same for set_option ... in). The results should still be valid in this case despite the syntax error.

Mauricio Collares (Oct 25 2023 at 05:52):

convert bound using 2 is indeed the bottleneck, in particular these two defeq checks:

 (2 * n) ^ ((Real.log (n - 1) - Real.log (n - k - 1)) * (n - k) / (2 * k))
    =?= (2 * n) ^ ((Real.log (n - 1) - Real.log (n - k - 1)) / 2 * ((n - k) / k))

 |s k| ^ (k)⁻¹ =?= |s k| ^ ((n - k)⁻¹ * ((n - k) / k))

Terence Tao (Oct 25 2023 at 05:53):

Each of the convert's is taking 23 seconds each, with defeq being the problem

Terence Tao (Oct 25 2023 at 05:54):

Hmm. What would you recommend to deal with this? I could avoid using convert and write out my subgoals manually.

Mario Carneiro (Oct 25 2023 at 05:58):

if you are lucky, this is due to a bug in lean which can be fixed once the problem is isolated

Terence Tao (Oct 25 2023 at 06:00):

Should I try to create a MWE then? (Or has Mauricio already managed to make a version of my code run already?)

Mario Carneiro (Oct 25 2023 at 06:00):

Actually, based on Mauricio's diagnosis I think the fix will be to add irreducible more aggressively in the library

Kyle Miller (Oct 25 2023 at 06:00):

Or if you want to try two random things first, it seems like this might be the stage of convert where it tries to close goals using rfl. You might be able to speed it up by using convert bound using 3 (or with a higher number still), which will involve a defeq check with smaller terms. You might also have success using with_reducible convert bound using 2, which causes it to use less-expensive defeq checks here (though this has a chance of failing spectacularly).

Mario Carneiro (Oct 25 2023 at 06:01):

maybe we should just do that by default

Mauricio Collares (Oct 25 2023 at 06:02):

Terence Tao said:

Should I try to create a MWE then? (Or has Mauricio already managed to make a version of my code run already?)

I replaced the previous definition of iterated_rev at prev_bound.lean from http://github.com/teorth/symmetric_project by Tao's code above, if anyone wants to run it locally too.

Terence Tao (Oct 25 2023 at 06:03):

changing convert bound with 2 to convert bound with 3 reduces the run time from 43 secs to 8 secs but the proof breaks down now (it is trying to match things at so granular a level that they don't actually match any more).

Mario Carneiro (Oct 25 2023 at 06:06):

does with_reducible convert bound using 2 help?

Kyle Miller (Oct 25 2023 at 06:06):

I'll also mention convert (config := {closePost := false}) bound with 2 if you want to just turn off this slow feature of convert, at the cost of perhaps ending up with more goals than expected.

Terence Tao (Oct 25 2023 at 06:06):

Similarly for with_reducible convert bound using 2... run time is about 9 seconds, but it is matching the wrong things

Terence Tao (Oct 25 2023 at 06:07):

e.g., trying to match Preorder.toLE = instLEReal

Mario Carneiro (Oct 25 2023 at 06:07):

oh, if using 2 is also giving false goals then you will need to reduce it

Terence Tao (Oct 25 2023 at 06:07):

convert bound with 2 works, it's just extremely expensive

Mario Carneiro (Oct 25 2023 at 06:07):

the general idea is to set it to the highest value for which the subgoals are actually true

Mario Carneiro (Oct 25 2023 at 06:08):

setting aside runtime concerns

Terence Tao (Oct 25 2023 at 06:09):

Yeah, which in my case is 2

Mauricio Collares (Oct 25 2023 at 06:09):

By the way, using mathlib master (and therefore Lean 4.2.0-rc4), count_heartbeats reports "Used 75262 heartbeats, which is less than the current maximum of 200000."

Mauricio Collares (Oct 25 2023 at 06:11):

The mathlib bump exposes a possible linarith bug in roots_deriv.lean:77:4, though:

application type mismatch
  neg_nonpos_of_nonneg (Linarith.nat_cast_nonneg (WithBot ) m)
argument
  Linarith.nat_cast_nonneg (WithBot ) m
has type
  @OfNat.ofNat (WithBot ) 0 (@Zero.toOfNat0 (WithBot ) MonoidWithZero.toZero)  m : Prop
but is expected to have type
  @OfNat.ofNat (WithBot ) 0 (@Zero.toOfNat0 (WithBot ) NegZeroClass.toZero)  ?a : Prop

In this particular case I just replaced linarith by exact mne.symm, but it might be worth looking at. Other than this and replacing powersetLen by powersetCard everywhere, the update works fine.

Terence Tao (Oct 25 2023 at 06:11):

convert (config := {closePost := false}) bound with 2 runs blindingly fast, but creates werid goals like HMul.hMul = HDiv.hDiv

Terence Tao (Oct 25 2023 at 06:14):

Mauricio Collares said:

The mathlib bump exposes a possible linarith bug in roots_deriv.lean:77:4, though:

Is this something I should fix on my end? That particular line of code runs fine with the version of Mathlib that I have.

Mario Carneiro (Oct 25 2023 at 06:16):

@Mauricio Collares can you make an MWE for the linarith issue?

Terence Tao (Oct 25 2023 at 06:22):

Mauricio Collares said:

By the way, using mathlib master (and therefore Lean 4.2.0-rc4), count_heartbeats reports "Used 75262 heartbeats, which is less than the current maximum of 200000."

So does this mean that whatever weird issue I am having with convert has already become significantly less of an issue with the latest version of mathlib? With my build it is 331174 heartbeats.

Mauricio Collares (Oct 25 2023 at 06:26):

Mario Carneiro said:

Mauricio Collares can you make an MWE for the linarith issue?

Hopefully this is not too minimized:

import Mathlib.Data.Real.Basic
import Mathlib.Data.Polynomial.Derivative

open Polynomial

example bla := by
  have m :  := 0
  let P : [X] := X

  have degP : P.degree = m+1 := by sorry
  have P'ne0 : P  0 := by
    linarith

Mauricio Collares (Oct 25 2023 at 06:28):

Terence Tao said:

Mauricio Collares said:

By the way, using mathlib master (and therefore Lean 4.2.0-rc4), count_heartbeats reports "Used 75262 heartbeats, which is less than the current maximum of 200000."

So does this mean that whatever weird issue I am having with convert has already become significantly less of an issue with the latest version of mathlib? With my build it is 331174 heartbeats.

Yes, I believe so. There have been some caching improvements in defeq checks lately. With regards to the linarith issue, if Mario confirms that it really is a linarith bug, then there's nothing you need to do on your end to fix it; you can choose to either work around it to upgrade Mathlib sooner, though (but I bet Mario will fix this soon if it is a bug).

Terence Tao (Oct 25 2023 at 06:29):

OK. Then I will just raise the heartbeats limit for this particular theorem and continue on with my project. Thanks for the help!

Kyle Miller (Oct 25 2023 at 06:30):

Terence Tao said:

convert (config := {closePost := false}) bound with 2 runs blindingly fast, but creates weird goals like HMul.hMul = HDiv.hDiv

This was too weird (not impossible, but unwanted) so I just pulled up your project to check. It looks like each convert creates a Preorder.toLE = instLEReal goal -- you can dispatch these with a rfl.

This takes 7 seconds overall for me:

code

Terence Tao (Oct 25 2023 at 06:37):

Hmm, that works here now. Not sure where all the HMul.hMul = HDiv.hDiv goals disappeared to, but I guess I won't question it. The code now works and is not excessively slow, so I guess I'm satisfied.

Mario Carneiro (Oct 25 2023 at 06:47):

Mauricio Collares said:

Mario Carneiro said:

Mauricio Collares can you make an MWE for the linarith issue?

Hopefully this is not too minimized:

It looks like it is not a bug, to first order: it is failing to find an instance of OrderedAddCommGroup (WithBot Nat) which makes sense since it is not true

Mario Carneiro (Oct 25 2023 at 06:51):

What's happening is that it has noticed the degP assumption and it has taken the fact 0 <= m (because m : Nat), lifted it to WithBot Nat to be useful in conjunction with degP, then tried to flip it around to the normal form -m <= 0 and ran into trouble because WithBot Nat doesn't have negation

Damiano Testa (Oct 25 2023 at 06:52):

In case anyone is interested, this is a "minimization up to mathlib" of the code pasted above (with Kyle's speed up).

Mauricio Collares (Oct 25 2023 at 06:53):

Mario Carneiro said:

What's happening is that it has noticed the degP assumption and it has taken the fact 0 <= m (because m : Nat), lifted it to WithBot Nat to be useful in conjunction with degP, then tried to flip it around to the normal form -m <= 0 and ran into trouble because WithBot Nat doesn't have negation

It might not be a bug, but it used to work in Tao's code (maybe I removed important parts while minimizing, though).

Mauricio Collares (Oct 25 2023 at 06:53):

I can bisect later if that helps

Mario Carneiro (Oct 25 2023 at 06:54):

No it's a bug for sure, but the proof is failing for a good reason

Mario Carneiro (Oct 25 2023 at 06:54):

oh man src#Lean.Exception.isFailedToSynthesize is a total hack

Mario Carneiro (Oct 25 2023 at 06:55):

I think some inference changes in lean changed what error is reported

Kevin Buzzard (Oct 25 2023 at 07:04):

Damiano Testa said:

In case anyone is interested, this is a "minimization up to mathlib" of the code pasted above (with Kyle's speed up).


I'm getting unknown identifier 'powersetLen' on line 20 with this.

Mauricio Collares (Oct 25 2023 at 07:10):

powersetLen was renamed to powersetCard in mathlib recently

Mario Carneiro (Oct 25 2023 at 07:14):

Just checking, but you aren't expecting linarith to prove your example, right @Mauricio Collares ?

Damiano Testa (Oct 25 2023 at 07:23):

Ah, I did not paste in a current version of Mathlib, just the one in Terry's project.

Damiano Testa (Oct 25 2023 at 07:24):

(and I'm on mobile and then teaching, so won't be able to fix for a while)

Mario Carneiro (Oct 25 2023 at 07:26):

the linarith issue is fixed in #7916

Damiano Testa (Oct 25 2023 at 07:55):

(Since I got to my office earlier than expected, I checked against the online Lean server: Mauricio caught the only issue, I believe and the code I pasted above should work with the current Mathlib.)

Mauricio Collares (Oct 25 2023 at 08:16):

Mario Carneiro said:

Just checking, but you aren't expecting linarith to prove your example, right Mauricio Collares ?

I wasn't expecting it to prove anything in the minimized example. I've checked and your fix makes linarith work again in Tao's project. Thanks!

Matthew Ballard (Oct 25 2023 at 12:39):

With #7916 and the bump to rc4, there are no issues left here correct?


Last updated: Dec 20 2023 at 11:08 UTC