Zulip Chat Archive
Stream: lean4
Topic: Introducing FlameTC
Henrik Böving (Mar 25 2023 at 21:00):
After debugging some typeclass performance issues with Jannis yesterday, I decided to write a little flamegraph visualization tool that might be helpful for people debugging performance problems related to typeclass synthesis. There are a couple of examples in the README on how you might use it. Sadly, it turned out that TC synthesis does not lend itself perfectly to the flamegraph visualization because queries usually don't go very deep, but thanks to the search feature you can still use it for things like:
- "Am I being slowed down by lots of failing queries?"
- "Do I keep synthesizing the same goals goals and might want to apply the specialization trick for them?"
And hopefully people more well versed in the TC system than me can find other uses for it ^^ Have fun!
https://github.com/hargoniX/FlameTC
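For context, the end-to-end workflow discussed in this thread looks roughly like this sketch. The compiler flags match the profileFile command shown later in the thread, but the flame binary path and the output handling are assumptions; check the README for the exact invocation.

```shell
# 1. Compile the file with profiler tracing enabled
#    (flags as used later in this thread).
lake env lean -D trace.profiler=true -D pp.oneline=true MyFile.lean > trace.txt

# 2. Convert the trace with the flame binary built from the repo
#    (the binary path below is an assumption), then view the result,
#    e.g. in speedscope.
./build/bin/flame < trace.txt > profile.json
speedscope profile.json
```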
Sebastian Ullrich (Mar 26 2023 at 15:19):
This is nice. It would be great to get a general "component-wise" view of a Lean execution instead of the function-wise view we get from generic profilers. We already have a concept of such components with --profile, just not yet with output of start/stop time pairs.
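For comparison, the existing component-wise data Sebastian mentions comes from running Lean with --profile, which prints a cumulative per-component summary (an example of such a report appears later in this thread):

```shell
# Built-in component-wise profiling: prints a "cumulative profiling times"
# summary (elaboration, type checking, simp, ...) rather than start/stop pairs.
lake env lean --profile MyFile.lean
```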
Sebastian Ullrich (Mar 26 2023 at 15:20):
TC synthesis, as you noticed, is a bit unique because tabled resolution is an iterative algorithm, in contrast to most other algorithms, which are recursive and thus give rise to a natural nesting visualization.
Henrik Böving (Apr 12 2023 at 09:17):
Thanks to https://github.com/leanprover/lean4/pull/2181 and some modifications to FlameTC, we can now properly profile not only type class synthesis but the elaboration procedure in general: https://github.com/hargoniX/Flame#example. Note that there are things that this trace does not show (yet), such as the time it takes for declarations to compile and how long they spend in each of the compiler passes.
Ruben Van de Velde (Apr 12 2023 at 09:51):
I wonder how hard it would be to integrate such a flamegraph in a vscode widget
Henrik Böving (Apr 12 2023 at 09:59):
I don't really know anything about vscode widgets, but in principle the flamegraph that flamegraph.pl produces is just an SVG, and the stuff from speedscope might even be embeddable via an iframe?
Tomas Skrivan (May 03 2023 at 23:51):
Finally got around to trying it out and it is amazing! Thanks a lot for it, now I have to figure out why my file is spending 90% of its time on NegZeroClass.toNeg =?= SubNegMonoid.toNeg :)
Tomas Skrivan (May 03 2023 at 23:59):
Btw, you can display an image in the info view simply as
import ProofWidgets.Component.HtmlDisplay
open ProofWidgets Jsx
#html <img src="https://hips.hearstapps.com/hmg-prod/images/dog-puppy-on-garden-royalty-free-image-1586966191.jpg"/>
but pointing to a file on disk does not seem to work for me
Tomas Skrivan (May 04 2023 at 00:02):
I would be much more interested in having a widget that opens up https://www.speedscope.app/ instead of displaying the SVG generated with FlameGraph. I found the SVG completely useless.
Tomas Skrivan (May 04 2023 at 18:29):
I wrote a short command that allows you to write
#profile_file SciLean.Core.RealFunctions
where you specify the file the same way as with import.
It runs the profiler and opens up speedscope for you:
Screenshot-from-2023-05-04-14-27-01.png
Tomas Skrivan (May 04 2023 at 18:30):
The code
import Lean
open Lean System IO
def profileFile (file : FilePath) (flame : FilePath := "/home/tskrivan/Documents/Flame/build/bin/flame") : IO Unit := do
  -- Compile the file with profiler tracing enabled.
  let compile_output ← IO.Process.output {
    cmd := "lake"
    args := #["env", "lean", "-D", "trace.profiler=true", "-D", "pp.oneline=true", file.toString]
  }
  if compile_output.exitCode != 0 then
    throw (IO.Error.userError s!"Error: Failed to compile {file}\n\n{compile_output.stderr}")
  -- Feed the trace output to flame via stdin.
  let flame_run ← Process.spawn {
    stdin := .piped
    stdout := .piped
    stderr := .piped
    cmd := flame.toString
  }
  let (stdin, child) ← flame_run.takeStdin
  stdin.putStr compile_output.stdout
  stdin.flush
  let stdout ← IO.asTask child.stdout.readToEnd Task.Priority.dedicated
  let stderr ← child.stderr.readToEnd
  let exitCode ← child.wait
  let stdout ← IO.ofExcept stdout.get
  if exitCode != 0 then
    throw (IO.Error.userError s!"Error: Failed to run flame\n\n{stderr}")
  -- Pipe flame's output into speedscope, which reads the profile from stdin.
  let speedscope_run ← Process.spawn {
    stdin := .piped
    stdout := .piped
    stderr := .piped
    cmd := "speedscope"
    args := #["-"]
  }
  let (stdin, child) ← speedscope_run.takeStdin
  stdin.putStr stdout
  stdin.flush
  let stdout ← IO.asTask child.stdout.readToEnd Task.Priority.dedicated
  let stderr ← child.stderr.readToEnd
  let exitCode ← child.wait
  let _ ← IO.ofExcept stdout.get
  if exitCode != 0 then
    throw (IO.Error.userError s!"Error: Failed to run speedscope\n\n{stderr}")

elab " #profile_file " path:ident : command => do
  let file := (← IO.currentDir) / (path.getId.toString.replace "." FilePath.pathSeparator.toString ++ ".lean")
  profileFile file
Tomas Skrivan (May 04 2023 at 18:33):
A few points:
- The path to flame is effectively hardcoded there. Does anyone have an idea what would be a good way to provide the path to flame?
- You need to have speedscope installed locally. You can do this with npm install -g speedscope.
- The command #profile_file opens up speedscope each time it is elaborated, effectively each time you make a change to the file. This is really annoying; what would be a good way to run it only once?
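One option for the first point, sketched here under the assumption that an environment variable is an acceptable configuration channel (the name FLAME_PATH is made up for this example), is to resolve the path at run time with IO.getEnv:

```lean
-- Sketch: look up the flame binary path from a hypothetical FLAME_PATH
-- environment variable, falling back to assuming `flame` is on PATH.
def getFlamePath : IO System.FilePath := do
  match (← IO.getEnv "FLAME_PATH") with
  | some p => return ⟨p⟩
  | none   => return ⟨"flame"⟩
```

profileFile could then call getFlamePath instead of baking in a default argument.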
Mario Carneiro (May 04 2023 at 19:05):
Tomas Skrivan said:
- The command #profile_file opens up speedscope each time it is elaborated, effectively each time you make a change to the file. This is really annoying, what would be a good way to run it only once?
I guess you could make it a widget, such that when you put your cursor over #profile_file there is a widget with an "open speedscope" button
Tomas Skrivan (May 04 2023 at 19:23):
Good point, I will probably turn it into a widget but I'm mostly using emacs ...
Mario Carneiro (May 04 2023 at 19:25):
A more low-tech approach is to have #profile_file do nothing and #profile_file! open the thing, and just add or remove the character to trigger it
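Concretely, that suggestion could be a pair of elaborators (a sketch reusing profileFile and the path translation from Tomas's snippet above, assuming the same open Lean System IO context):

```lean
-- Sketch: `#profile_file` is inert; only the `!` form actually profiles.
elab " #profile_file " _path:ident : command =>
  pure ()  -- do nothing; switch to `#profile_file!` to trigger a run

elab " #profile_file! " path:ident : command => do
  let file := (← IO.currentDir) /
    (path.getId.toString.replace "." FilePath.pathSeparator.toString ++ ".lean")
  profileFile file  -- `profileFile` as defined earlier in the thread
```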
Mario Carneiro (May 04 2023 at 19:25):
or equivalently, comment / uncomment the line if you have an easy key command for that
Tomas Skrivan (May 04 2023 at 19:27):
Good idea with !
Also I want to have #profile_this_file but I need to somehow break the recursion and I'm having issues using options.
Henrik Böving (May 04 2023 at 20:09):
Cool! Also, if you happen to find more things that you want us to trace, just say so here
Andrés Goens (May 11 2023 at 13:43):
@Henrik Böving does FlameTC usually take a very long time? It's been several minutes on a relatively small file, I have a hunch that it's in an infinite loop :sweat_smile:
Henrik Böving (May 11 2023 at 13:43):
Either that or the pretty printer is choking on some large terms.
Henrik Böving (May 11 2023 at 13:43):
What's the file?
Andrés Goens (May 11 2023 at 13:46):
Henrik Böving said:
What's the file?
Henrik Böving (May 11 2023 at 13:48):
Ah yes, it is choking on the fact that your lines start with [fooo] here most likely. Did you set the oneline option?
Andrés Goens (May 11 2023 at 13:50):
I did, but it didn't put things on one line, I also found that weird :sweat_smile:
Henrik Böving (May 11 2023 at 13:50):
Hm, is that intentional or a bug? @Sebastian Ullrich
Henrik Böving (May 11 2023 at 13:51):
Regardless based on the trace I would say that the profiler is not accounting for ~12 seconds of processing time here which means that we need to add additional tracing points in the compiler to figure out what's wrong with your Lean file.
Henrik Böving (May 11 2023 at 13:52):
Can you send the entire file or a #mwe? I'll try to minimize and figure out what the compiler is missing out on profiling wise.
Henrik Böving (May 11 2023 at 13:54):
This is especially interesting because it is a theorem, so unlike most other examples where we miss out on time, we cannot blame it on the old code generator.
Andrés Goens (May 11 2023 at 13:54):
Henrik Böving said:
Hm, is that intentional or a bug? Sebastian Ullrich
It definitely makes a big difference (it makes it much more compact), but it's not literally one line
Andrés Goens (May 11 2023 at 13:56):
Henrik Böving said:
Can you send the entire file or a #mwe? I'll try to minimize and figure out what the compiler is missing out on profiling wise.
cool, yeah glad to! it's this file
Andrés Goens (May 11 2023 at 13:58):
Henrik Böving said:
Regardless based on the trace I would say that the profiler is not accounting for ~12 seconds of processing time here which means that we need to add additional tracing points in the compiler to figure out what's wrong with your Lean file.
a hunch on those ~12s is that they might be meta code that's delaborating some custom syntax there. I tried with precompileModules but that didn't really fix it either
Andrés Goens (May 11 2023 at 13:58):
but it's just a hunch, I don't know how you know that there's 12s missing :grinning:
Andrés Goens (May 11 2023 at 13:59):
minimizing might be complicated because it's a performance question, I don't know exactly what to minimize
Andrés Goens (May 11 2023 at 13:59):
as in, what do we want to keep constant when minimizing
Henrik Böving (May 11 2023 at 14:22):
Okay on my laptop with the other profiler we see:
cumulative profiling times:
attribute application 1.08ms
compilation 22.8ms
compilation new 43.8ms
elaboration 195ms
import 422ms
initialization 28.9ms
interpretation 1.97s
linting 25.5ms
parsing 1.9ms
simp 4.5s
tactic execution 270ms
type checking 19.3s
typeclass inference 140ms
so the main issues here are your gigantic simp statement, but even worse the type checking, which I presume happens in the kernel? Since that is in C++ it would be rather hard to trace with the tracing facilities that we use to feed FlameTC. So I guess we have to put additional hooks into tactic elaboration, which we can actually trace nicely, and just duplicate the type checking tracer.
Henrik Böving (May 11 2023 at 14:31):
Oh and as to why it is slow, I would guess it is slow because the term you are trying to type check is probably absolutely gigantic :D
Andrés Goens (May 11 2023 at 14:45):
thanks for the insight! should give us something to think about our (life?) choices there :sweat_smile: I guess going back to the question about FlameTC, it's because of the oneline thing?
Henrik Böving (May 11 2023 at 14:51):
Yes I think the parser is getting confused by the things that start with [...]. I did not design it to operate properly against non valid input :D
Andrés Goens (May 11 2023 at 15:29):
Henrik Böving said:
Yes I think the parser is getting confused by the things that start with [...]. I did not design it to operate properly against non valid input :D
why is it non-valid? am I doing something wrong, besides trying to prove theorems about nasty big terms with nasty big simps of course :sweat_smile: ?
Andrés Goens (May 11 2023 at 15:32):
Henrik Böving said:
Okay on my laptop with the other profiler we see:
cumulative profiling times: attribute application 1.08ms compilation 22.8ms compilation new 43.8ms elaboration 195ms import 422ms initialization 28.9ms interpretation 1.97s linting 25.5ms parsing 1.9ms simp 4.5s tactic execution 270ms type checking 19.3s typeclass inference 140ms
so the main issues here are your gigantic simp statement but even worse the type checking which i presume happens in the kernel? Since that is in C++ it would be rather hard to trace with the tracing facilities that we use to feed flametc. So I guess we have to put additional hooks into tactic elaboration which we can actually trace nicely and just duplicate the type checking tracer.
An interesting thing is that if I set_option pp.all true and copy-paste what #print gives me for the theorem's term (i.e. avoid the delaboration and all that), then typechecking is much, much faster
cumulative profiling times:
attribute application 0.00187ms
compilation 3.33ms
compilation new 0.514ms
elaboration 4.08s
import 396ms
initialization 22.1ms
interpretation 1.17s
linting 0.133ms
parsing 14.1s
type checking 1.3ms
typeclass inference 3.19s
Of course we pay a big price with parsing the huge terms (expectedly), but it does mean that somewhere we're doing an unnecessarily large number of calls to the kernel
Andrés Goens (May 11 2023 at 15:37):
Oh, nevermind, I somehow missed 100s of error messages above that saying that Lean couldn't figure out what that was
Henrik Böving (May 11 2023 at 15:37):
Andrés Goens said:
Henrik Böving said:
Yes I think the parser is getting confused by the things that start with [...]. I did not design it to operate properly against non valid input :D
why is it non-valid? am I doing something wrong, besides trying to prove theorems about nasty big terms with nasty big simps of course :sweat_smile: ?
Well, basically it expects lines that start with [....] to be trace lines with a time attached and so on. But obviously that is not the case here.
Andrés Goens (May 11 2023 at 15:45):
oh, never mind, I was too happy about the numbers to notice the 100s of messages above them telling me it failed :see_no_evil:
Henrik Böving (May 11 2023 at 17:55):
https://github.com/leanprover/lean4/pull/2211
Sebastian Ullrich (May 11 2023 at 19:17):
pp.oneline only affects the direct pretty printer output, not the combined Message. When it is set, we might want a different Elab.step output in any case, as seeing only the expected type is not that informative
Sebastian Ullrich (May 11 2023 at 19:18):
But also FlameTC should probably ignore lines that are not of the format [trace.class] [1.2345s] ...
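A minimal version of that filter (a sketch, not Flame's actual parser) could drop every line whose second bracketed group does not look like a timing:

```lean
-- Sketch: accept only lines shaped like `[trace.class] [1.2345s] ...`
-- by checking that the second `] `-separated chunk ends in `s`.
def isTraceLine (line : String) : Bool :=
  line.trimLeft.startsWith "[" &&
    (((line.splitOn "] ").get? 1).map (·.endsWith "s") |>.getD false)

#eval isTraceLine "[Elab.step] [0.1234s] expected type"  -- true
#eval isTraceLine "[fooo] some pretty printed output"    -- false
```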
Henrik Böving (May 11 2023 at 20:19):
@Andrés Goens I fixed the parsing (again!) and it is now fully capable of processing your file, although as predicted you won't see much with a current Lean release^^
Last updated: Dec 20 2023 at 11:08 UTC