Zulip Chat Archive
Stream: new members
Topic: Performance in Large Projects
Rick Love (Aug 18 2020 at 14:05):
I noticed that performance becomes a problem with long calc chains. Can someone give me advice on what patterns work well for performance (in vscode):
i.e. What is the scope of effort that the logic engine has to perform on each change to the text?
What does the engine re-process on each change?
- the file and all imports (whether or not they are used)
- the entire logic chain (all imports and dependencies that are used)
- everything in the changed file
- only the theorem that has changed in the changed file
- only the statements at and after the change in the changed file
I suspect it is the 3rd or 4th case, but it would be helpful to know the best way to organize files.
Specifically, my question is:
- Should I move complex theorems to their own files and make sure not to import them unless they are required?
- Is it enough to keep complex logic in its own theorem (even in the current file)?
Mario Carneiro (Aug 18 2020 at 14:19):
It is the 5th option, everything at and below (and possibly one statement above) the point of change
Mario Carneiro (Aug 18 2020 at 14:20):
In particular, all changes to a proof or tactic script require re-running the whole thing, which means that long proofs can get slow to interact with
Mario Carneiro (Aug 18 2020 at 14:21):
I would suggest trying to keep proof size down to maybe 50 lines max by extracting lemmas. If those lemmas are independently useful, even better
Rick Love (Aug 18 2020 at 14:25):
Mario Carneiro said:
In particular, all changes to a proof or tactic script require re-running the whole thing, which means that long proofs can get slow to interact with
So that sounds like the 4th option (the entire current theorem - when working on a tactic script), which would include the statements above the current.
I guess I meant by statement was a single tactic being used (not a whole theorem or lemma), maybe I'm using the wrong terms.
Rick Love (Aug 18 2020 at 14:27):
Mario Carneiro said:
I would suggest trying to keep proof size down to maybe 50 lines max by extracting lemmas. If those lemmas are independently useful, even better
Yeah, that makes sense. The instance where it occurred was more a prototype and was not organized well.
Mario Carneiro (Aug 18 2020 at 14:27):
"Statement" here means something starting with a command keyword like theorem
or def
Mario Carneiro (Aug 18 2020 at 14:27):
the top level objects in a lean file
Mario Carneiro (Aug 18 2020 at 14:28):
in particular I don't mean tactic calls in a begin end block
Rick Love (Aug 18 2020 at 14:29):
Is there a specific term to refer to a single "tactic call"?
Mario Carneiro (Aug 18 2020 at 14:29):
a tactic line, tactic call, tactic
Mario Carneiro (Aug 18 2020 at 14:30):
the structure of such things is not very line based in general so you have to be a bit more specific
Mario Carneiro (Aug 18 2020 at 14:34):
Also it's possible you misused the term "calc block", which is used to describe a sequence of equality proofs starting with the calc
keyword
Rick Love (Aug 18 2020 at 14:34):
What about "expression", when talking about other code that usually refers to an atomic unit of code, but I don't know if that has another meaning.
Kevin Buzzard (Aug 18 2020 at 14:34):
Rick Love (Aug 18 2020 at 14:36):
Mario Carneiro said:
Also it's possible you misused the term "calc block", which is used to describe a sequence of equality proofs starting with the
calc
keyword
No, I was talking about the "calc" keyword specifically in that case. It was around 20 lines with around 100 tactics and got slow.
Mario Carneiro (Aug 18 2020 at 14:36):
"Expression" has a specific meaning in lean. An expression is any term used for denoting an element of the type theory, for example x
or 2+2
or have foo, from bar, baz
or show foo, from bar
or begin tactics... end
Mario Carneiro (Aug 18 2020 at 14:36):
it's pretty odd to have a calc proof that is significantly expensive unless you are calling omega
on every line or something, or the block is massive
Mario Carneiro (Aug 18 2020 at 14:37):
how do you manage to get 100 tactics on 20 lines unless you have some unusual (or shall we say "compact") formatting
Mario Carneiro (Aug 18 2020 at 14:37):
how many of those lines start with ... =
?
Mario Carneiro (Aug 18 2020 at 14:39):
anyway, a proof getting slow is lean's way of telling you that your proof is too large and should be split into smaller parts
Chris Wong (Aug 19 2020 at 08:17):
I think because of proof irrelevance, Lean can start checking a proof before it has validated its dependencies? So splitting a proof can result in more parallelism than you might think
Mario Carneiro (Aug 19 2020 at 08:19):
Well, technically you can inspect the value of a proof from another proof, and of course #print
will also look at proof terms, but this is basically joining on a future and is relatively rare in the process
Mario Carneiro (Aug 19 2020 at 08:20):
so for the most part it is just running the definitions and theorem statements on one thread and then spawning asynchronous proof elaboration tasks
Last updated: Dec 20 2023 at 11:08 UTC