Zulip Chat Archive

Stream: new members

Topic: Performance in Large Projects


Rick Love (Aug 18 2020 at 14:05):

I noticed that performance becomes a problem with long calc chains. Can someone give me advice on what patterns work well for performance (in vscode):

i.e. What is the scope of effort that the logic engine has to perform on each change to the text?

What does the engine re-process on each change?

  • the file and all imports (whether or not they are used)
  • the entire logic chain (all imports and dependencies that are used)
  • everything in the changed file
  • only the theorem that has changed in the changed file
  • only the statements at and after the change in the changed file

I suspect it is the 3rd or 4th case, but it would be helpful to know the best way to organize files.

Specifically, my question is:

  • Should I move complex theorems to their own files and make sure not to import them unless they are required?
  • Is it enough to keep complex logic in its own theorem (even in the current file)?

Mario Carneiro (Aug 18 2020 at 14:19):

It is the 5th option, everything at and below (and possibly one statement above) the point of change

Mario Carneiro (Aug 18 2020 at 14:20):

In particular, all changes to a proof or tactic script require re-running the whole thing, which means that long proofs can get slow to interact with

Mario Carneiro (Aug 18 2020 at 14:21):

I would suggest trying to keep proof size down to maybe 50 lines max by extracting lemmas. If those lemmas are independently useful, even better

Rick Love (Aug 18 2020 at 14:25):

Mario Carneiro said:

In particular, all changes to a proof or tactic script require re-running the whole thing, which means that long proofs can get slow to interact with

So that sounds like the 4th option (the entire current theorem - when working on a tactic script), which would include the statements above the current.

I guess I meant by statement was a single tactic being used (not a whole theorem or lemma), maybe I'm using the wrong terms.

Rick Love (Aug 18 2020 at 14:27):

Mario Carneiro said:

I would suggest trying to keep proof size down to maybe 50 lines max by extracting lemmas. If those lemmas are independently useful, even better

Yeah, that makes sense. The instance where it occurred was more a prototype and was not organized well.

Mario Carneiro (Aug 18 2020 at 14:27):

"Statement" here means something starting with a command keyword like theorem or def

Mario Carneiro (Aug 18 2020 at 14:27):

the top level objects in a lean file

Mario Carneiro (Aug 18 2020 at 14:28):

in particular I don't mean tactic calls in a begin end block

Rick Love (Aug 18 2020 at 14:29):

Is there a specific term to refer to a single "tactic call"?

Mario Carneiro (Aug 18 2020 at 14:29):

a tactic line, tactic call, tactic

Mario Carneiro (Aug 18 2020 at 14:30):

the structure of such things is not very line based in general so you have to be a bit more specific

Mario Carneiro (Aug 18 2020 at 14:34):

Also it's possible you misused the term "calc block", which is used to describe a sequence of equality proofs starting with the calc keyword

Rick Love (Aug 18 2020 at 14:34):

What about "expression", when talking about other code that usually refers to an atomic unit of code, but I don't know if that has another meaning.

Kevin Buzzard (Aug 18 2020 at 14:34):

a calc block

Rick Love (Aug 18 2020 at 14:36):

Mario Carneiro said:

Also it's possible you misused the term "calc block", which is used to describe a sequence of equality proofs starting with the calc keyword

No, I was talking about the "calc" keyword specifically in that case. It was around 20 lines with around 100 tactics and got slow.

Mario Carneiro (Aug 18 2020 at 14:36):

"Expression" has a specific meaning in lean. An expression is any term used for denoting an element of the type theory, for example x or 2+2 or have foo, from bar, baz or show foo, from bar or begin tactics... end

Mario Carneiro (Aug 18 2020 at 14:36):

it's pretty odd to have a calc proof that is significantly expensive unless you are calling omega on every line or something, or the block is massive

Mario Carneiro (Aug 18 2020 at 14:37):

how do you manage to get 100 tactics on 20 lines unless you have some unusual (or shall we say "compact") formatting

Mario Carneiro (Aug 18 2020 at 14:37):

how many of those lines start with ... =?

Mario Carneiro (Aug 18 2020 at 14:39):

anyway, a proof getting slow is lean's way of telling you that your proof is too large and should be split into smaller parts

Chris Wong (Aug 19 2020 at 08:17):

I think because of proof irrelevance, Lean can start checking a proof before it has validated its dependencies? So splitting a proof can result in more parallelism than you might think

Mario Carneiro (Aug 19 2020 at 08:19):

Well, technically you can inspect the value of a proof from another proof, and of course #print will also look at proof terms, but this is basically joining on a future and is relatively rare in the process

Mario Carneiro (Aug 19 2020 at 08:20):

so for the most part it is just running the definitions and theorem statements on one thread and then spawning asynchronous proof elaboration tasks


Last updated: Dec 20 2023 at 11:08 UTC