Zulip Chat Archive
Stream: lean4
Topic: Introducing FlameTC
Henrik Böving (Mar 25 2023 at 21:00):
After debugging some typeclass performance issues with Jannis yesterday, I decided to write a little flamegraph visualization tool that might be helpful for people debugging performance problems related to typeclass synthesis. There are a couple of examples in the README on how you might use it. Sadly, it turned out that TC synthesis does not lend itself perfectly to the flamegraph visualization because queries usually don't go very deep, but thanks to the search feature you can still use it for things like:
- "Am I being slowed down by lots of failing queries?"
- "Do I keep synthesizing the same goals goals and might want to apply the specialization trick for them?"
And hopefully people more well versed in the TC system than me can find other uses for it ^^ Have fun!
https://github.com/hargoniX/FlameTC
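For context, the end-to-end workflow discussed in this thread looks roughly like this sketch. The compiler flags match the profileFile command shown later in the thread, but the flame binary path and the output handling are assumptions; check the README for the exact invocation.

```shell
# 1. Compile the file with profiler tracing enabled
#    (flags as used later in this thread).
lake env lean -D trace.profiler=true -D pp.oneline=true MyFile.lean > trace.txt

# 2. Convert the trace with the flame binary built from the repo
#    (the binary path below is an assumption), then view the result,
#    e.g. in speedscope.
./build/bin/flame < trace.txt > profile.json
speedscope profile.json
```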
Sebastian Ullrich (Mar 26 2023 at 15:19):
This is nice. It would be great to get a general "component-wise" view of a Lean execution instead of the function-wise view we get from generic profilers. We already have a concept of such components with --profile, just not yet with output of start/stop time pairs.
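For comparison, the existing component-wise data Sebastian mentions comes from running Lean with --profile, which prints a cumulative per-component summary (an example of such a report appears later in this thread):

```shell
# Built-in component-wise profiling: prints a "cumulative profiling times"
# summary (elaboration, type checking, simp, ...) rather than start/stop pairs.
lake env lean --profile MyFile.lean
```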
Sebastian Ullrich (Mar 26 2023 at 15:20):
TC synthesis, as you noticed, is a bit unique because tabled resolution is an iterative algorithm, in contrast to most other algorithms, which are recursive and thus give rise to a natural nesting visualization.
Henrik Böving (Apr 12 2023 at 09:17):
Thanks to https://github.com/leanprover/lean4/pull/2181 and some modifications to FlameTC, we can now properly profile not only type class synthesis but the elaboration procedure in general: https://github.com/hargoniX/Flame#example. Note that there are things that this trace does not show (yet), such as the time it takes for declarations to compile and how long they spend in each of the compiler passes.
Ruben Van de Velde (Apr 12 2023 at 09:51):
I wonder how hard it would be to integrate such a flamegraph in a vscode widget
Henrik Böving (Apr 12 2023 at 09:59):
I don't really know anything about vscode widgets, but in principle the flamegraph that flamegraph.pl produces is just an SVG, and the stuff from speedscope might even be embeddable via an iframe?
Tomas Skrivan (May 03 2023 at 23:51):
Finally got around to trying it out and it is amazing! Thanks a lot for it, now I have to figure out why my file is spending 90% of its time on NegZeroClass.toNeg =?= SubNegMonoid.toNeg :)
Tomas Skrivan (May 03 2023 at 23:59):
Btw, you can display an image in the info view simply as
import ProofWidgets.Component.HtmlDisplay
open ProofWidgets Jsx
#html <img src="https://hips.hearstapps.com/hmg-prod/images/dog-puppy-on-garden-royalty-free-image-1586966191.jpg"/>
but pointing to a file on disk does not seem to work for me
Tomas Skrivan (May 04 2023 at 00:02):
I would be much more interested in having a widget that opens up https://www.speedscope.app/ instead of displaying the SVG generated with FlameGraph. I found the SVG completely useless.
Tomas Skrivan (May 04 2023 at 18:29):
I wrote a short command that allows you to write
#profile_file SciLean.Core.RealFunctions
where you specify the file the same way as with import.
It runs the profiler and opens up speedscope for you:
Screenshot-from-2023-05-04-14-27-01.png
Tomas Skrivan (May 04 2023 at 18:30):
The code
import Lean
open Lean System IO
def profileFile (file : FilePath) (flame : FilePath := "/home/tskrivan/Documents/Flame/build/bin/flame") : IO Unit := do
  -- Compile the file with profiler tracing enabled.
  let compile_output ← IO.Process.output {
    cmd := "lake"
    args := #["env", "lean", "-D", "trace.profiler=true", "-D", "pp.oneline=true", file.toString]
  }
  if compile_output.exitCode != 0 then
    throw (IO.Error.userError s!"Error: Failed to compile {file}\n\n{compile_output.stderr}")
  -- Feed the trace output to flame via stdin.
  let flame_run ← Process.spawn {
    stdin := .piped
    stdout := .piped
    stderr := .piped
    cmd := flame.toString
  }
  let (stdin, child) ← flame_run.takeStdin
  stdin.putStr compile_output.stdout
  stdin.flush
  let stdout ← IO.asTask child.stdout.readToEnd Task.Priority.dedicated
  let stderr ← child.stderr.readToEnd
  let exitCode ← child.wait
  let stdout ← IO.ofExcept stdout.get
  if exitCode != 0 then
    throw (IO.Error.userError s!"Error: Failed to run flame\n\n{stderr}")
  -- Pipe flame's output into speedscope, which reads the profile from stdin.
  let speedscope_run ← Process.spawn {
    stdin := .piped
    stdout := .piped
    stderr := .piped
    cmd := "speedscope"
    args := #["-"]
  }
  let (stdin, child) ← speedscope_run.takeStdin
  stdin.putStr stdout
  stdin.flush
  let stdout ← IO.asTask child.stdout.readToEnd Task.Priority.dedicated
  let stderr ← child.stderr.readToEnd
  let exitCode ← child.wait
  let _ ← IO.ofExcept stdout.get
  if exitCode != 0 then
    throw (IO.Error.userError s!"Error: Failed to run speedscope\n\n{stderr}")

elab " #profile_file " path:ident : command => do
  let file := (← IO.currentDir) / (path.getId.toString.replace "." FilePath.pathSeparator.toString ++ ".lean")
  profileFile file
Tomas Skrivan (May 04 2023 at 18:33):
A few points:
- The path to flame is effectively hardcoded there. Does anyone have an idea what would be a good way to provide the path to flame?
- You need to have speedscope installed locally. You can do this with npm install -g speedscope.
- The command #profile_file opens up speedscope each time it is elaborated, effectively each time you make a change to the file. This is really annoying; what would be a good way to run it only once?
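One option for the first point, sketched here under the assumption that an environment variable is an acceptable configuration channel (the name FLAME_PATH is made up for this example), is to resolve the path at run time with IO.getEnv:

```lean
-- Sketch: look up the flame binary path from a hypothetical FLAME_PATH
-- environment variable, falling back to assuming `flame` is on PATH.
def getFlamePath : IO System.FilePath := do
  match (← IO.getEnv "FLAME_PATH") with
  | some p => return ⟨p⟩
  | none   => return ⟨"flame"⟩
```

profileFile could then call getFlamePath instead of baking in a default argument.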
Mario Carneiro (May 04 2023 at 19:05):
Tomas Skrivan said:
- The command #profile_file opens up speedscope each time it is elaborated, effectively each time you make a change to the file. This is really annoying, what would be a good way to run it only once?
I guess you could make it a widget, such that when you put your cursor over #profile_file there is a widget with an "open speedscope" button
Tomas Skrivan (May 04 2023 at 19:23):
Good point, I will probably turn it into a widget but I'm mostly using emacs ...
Mario Carneiro (May 04 2023 at 19:25):
A more low-tech approach is to have #profile_file do nothing and #profile_file! open the thing, and just add or remove the character to trigger it
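Concretely, that suggestion could be a pair of elaborators (a sketch reusing profileFile and the path translation from Tomas's snippet above, assuming the same open Lean System IO context):

```lean
-- Sketch: `#profile_file` is inert; only the `!` form actually profiles.
elab " #profile_file " _path:ident : command =>
  pure ()  -- do nothing; switch to `#profile_file!` to trigger a run

elab " #profile_file! " path:ident : command => do
  let file := (← IO.currentDir) /
    (path.getId.toString.replace "." FilePath.pathSeparator.toString ++ ".lean")
  profileFile file  -- `profileFile` as defined earlier in the thread
```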
Mario Carneiro (May 04 2023 at 19:25):
or equivalently, comment / uncomment the line if you have an easy key command for that
Tomas Skrivan (May 04 2023 at 19:27):
Good idea with !
Also I want to have #profile_this_file but I need to somehow break the recursion and I'm having issues using options.
Henrik Böving (May 04 2023 at 20:09):
Cool! Also, if you happen to find more things that you want us to trace, just say so here
Andrés Goens (May 11 2023 at 13:43):
@Henrik Böving does FlameTC usually take a very long time? It's been several minutes on a relatively small file, I have a hunch that it's in an infinite loop :sweat_smile:
Henrik Böving (May 11 2023 at 13:43):
Either that or the pretty printer is choking on some large terms.
Henrik Böving (May 11 2023 at 13:43):
What's the file?
Andrés Goens (May 11 2023 at 13:46):
Henrik Böving said:
What's the file?
Henrik Böving (May 11 2023 at 13:48):
Ah yes, it is choking on the fact that your lines start with [fooo] here most likely. Did you set the oneline option?
Andrés Goens (May 11 2023 at 13:50):
I did, but it didn't put things on one line, I also found that weird :sweat_smile:
Henrik Böving (May 11 2023 at 13:50):
Hm, is that intentional or a bug? @Sebastian Ullrich
Henrik Böving (May 11 2023 at 13:51):
Regardless based on the trace I would say that the profiler is not accounting for ~12 seconds of processing time here which means that we need to add additional tracing points in the compiler to figure out what's wrong with your Lean file.
Henrik Böving (May 11 2023 at 13:52):
Can you send the entire file or a #mwe? I'll try to minimize and figure out what the compiler is missing out on profiling wise.
Henrik Böving (May 11 2023 at 13:54):
This is especially interesting because it is a theorem, so unlike most other examples where we miss out on time, we cannot blame it on the old code generator.
Andrés Goens (May 11 2023 at 13:54):
Henrik Böving said:
Hm, is that intentional or a bug? Sebastian Ullrich
It definitely makes a big difference (it makes it much more compact), but it's not literally one line
Andrés Goens (May 11 2023 at 13:56):
Henrik Böving said:
Can you send the entire file or a #mwe? I'll try to minimize and figure out what the compiler is missing out on profiling wise.
cool, yeah glad to! it's this file
Andrés Goens (May 11 2023 at 13:58):
Henrik Böving said:
Regardless based on the trace I would say that the profiler is not accounting for ~12 seconds of processing time here which means that we need to add additional tracing points in the compiler to figure out what's wrong with your Lean file.
a hunch on those ~12s is that they might be meta code that's delaborating some custom syntax there. I tried with precompileModules but that didn't really fix it either
Andrés Goens (May 11 2023 at 13:58):
but it's just a hunch, I don't know how you know that there's 12s missing :grinning:
Andrés Goens (May 11 2023 at 13:59):
minimizing might be complicated because it's a performance question, I don't know exactly what to minimize
Andrés Goens (May 11 2023 at 13:59):
as in, what do we want to keep constant when minimizing
Henrik Böving (May 11 2023 at 14:22):
Okay on my laptop with the other profiler we see:
cumulative profiling times:
attribute application 1.08ms
compilation 22.8ms
compilation new 43.8ms
elaboration 195ms
import 422ms
initialization 28.9ms
interpretation 1.97s
linting 25.5ms
parsing 1.9ms
simp 4.5s
tactic execution 270ms
type checking 19.3s
typeclass inference 140ms
so the main issues here are your gigantic simp statement, but even worse the type checking, which I presume happens in the kernel? Since that is in C++ it would be rather hard to trace with the tracing facilities that we use to feed FlameTC. So I guess we have to put additional hooks into tactic elaboration, which we can actually trace nicely, and just duplicate the type checking tracer.
Henrik Böving (May 11 2023 at 14:31):
Oh and as to why it is slow, I would guess it is slow because the term you are trying to type check is probably absolutely gigantic :D
Andrés Goens (May 11 2023 at 14:45):
thanks for the insight! should give us something to think about our (life?) choices there :sweat_smile: I guess going back to the question about FlameTC, it's because of the oneline thing?
Henrik Böving (May 11 2023 at 14:51):
Yes I think the parser is getting confused by the things that start with [...]. I did not design it to operate properly against non valid input :D
Andrés Goens (May 11 2023 at 15:29):
Henrik Böving said:
Yes I think the parser is getting confused by the things that start with [...]. I did not design it to operate properly against non valid input :D
why is it non-valid? am I doing something wrong, besides trying to prove theorems about nasty big terms with nasty big simps of course :sweat_smile: ?
Andrés Goens (May 11 2023 at 15:32):
Henrik Böving said:
Okay on my laptop with the other profiler we see:
cumulative profiling times: attribute application 1.08ms compilation 22.8ms compilation new 43.8ms elaboration 195ms import 422ms initialization 28.9ms interpretation 1.97s linting 25.5ms parsing 1.9ms simp 4.5s tactic execution 270ms type checking 19.3s typeclass inference 140ms
so the main issues here are your gigantic simp statement but even worse the type checking which i presume happens in the kernel? Since that is in C++ it would be rather hard to trace with the tracing facilities that we use to feed flametc. So I guess we have to put additional hooks into tactic elaboration which we can actually trace nicely and just duplicate the type checking tracer.
An interesting thing is that if I set_option pp.all true and copy-paste what #print gives me for the theorem's term (i.e. avoid the delaboration and all that), then typechecking is much, much faster
cumulative profiling times:
attribute application 0.00187ms
compilation 3.33ms
compilation new 0.514ms
elaboration 4.08s
import 396ms
initialization 22.1ms
interpretation 1.17s
linting 0.133ms
parsing 14.1s
type checking 1.3ms
typeclass inference 3.19s
Of course we pay a big price with parsing the huge terms (expectedly), but it does mean that somewhere we're doing an unnecessarily large number of calls to the kernel
Andrés Goens (May 11 2023 at 15:37):
Oh, nevermind, I somehow missed 100s of error messages above that saying that Lean couldn't figure out what that was
Henrik Böving (May 11 2023 at 15:37):
Andrés Goens said:
Henrik Böving said:
Yes I think the parser is getting confused by the things that start with [...]. I did not design it to operate properly against non valid input :D
why is it non-valid? am I doing something wrong, besides trying to prove theorems about nasty big terms with nasty big simps of course :sweat_smile: ?
Well, basically it expects lines that start with [....] to be trace lines with a time attached and so on. But obviously that is not the case here.
Andrés Goens (May 11 2023 at 15:45):
oh, never mind, I was too happy about the numbers to notice the 100s of messages above them telling me it failed :see_no_evil:
Henrik Böving (May 11 2023 at 17:55):
https://github.com/leanprover/lean4/pull/2211
Sebastian Ullrich (May 11 2023 at 19:17):
pp.oneline only affects the direct pretty printer output, not the combined Message. When it is set, we might want a different Elab.step output in any case, as seeing only the expected type is not that informative
Sebastian Ullrich (May 11 2023 at 19:18):
But also FlameTC should probably ignore lines that are not of the format [trace.class] [1.2345s] ...
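A minimal version of that filter (a sketch, not Flame's actual parser) could drop every line whose second bracketed group does not look like a timing:

```lean
-- Sketch: accept only lines shaped like `[trace.class] [1.2345s] ...`
-- by checking that the second `] `-separated chunk ends in `s`.
def isTraceLine (line : String) : Bool :=
  line.trimLeft.startsWith "[" &&
    (((line.splitOn "] ").get? 1).map (·.endsWith "s") |>.getD false)

#eval isTraceLine "[Elab.step] [0.1234s] expected type"  -- true
#eval isTraceLine "[fooo] some pretty printed output"    -- false
```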
Henrik Böving (May 11 2023 at 20:19):
@Andrés Goens I fixed the parsing (again!) and it is now fully capable of processing your file, although as predicted you won't see much with a current Lean release^^
Last updated: Dec 20 2023 at 11:08 UTC