Zulip Chat Archive

Stream: Lean Together 2024

Topic: Are We Fast Yet - Sebastian Ullrich


Joachim Breitner (Jan 09 2024 at 14:34):

I’m curious: What does https://github.com/digama0/leangz do (over just gz)?

Johan Commelin (Jan 09 2024 at 14:36):

cc @Mario Carneiro

Kim Morrison (Jan 09 2024 at 14:38):

Sebastian's "one more thing" moment at the end of the talk... :-)

Joachim Breitner (Jan 09 2024 at 14:41):

If going from lean3 to lean4 was a 50% performance improvement, and a full build now takes ~100mins, does it mean that with lean14, we will be down to 6s for the build?

Kevin Buzzard (Jan 09 2024 at 14:43):

When I first raised the question of Lean 5, the look I got from the devs made me realise that we shouldn't be holding our breath :-)

Alex J. Best (Jan 09 2024 at 14:43):

Joachim Breitner said:

I’m curious: What does https://github.com/digama0/leangz do (over just gz)?

https://leanprover.zulipchat.com/#narrow/stream/270676-lean4/topic/leangz.3A.20olean.20file.20compressor/near/368896955 is the original thread, not sure if the state of the art has improved since that

Mario Carneiro (Jan 09 2024 at 14:43):

@Joachim Breitner We can achieve about a 6x reduction in size using a lossless re-encoding of the file to remove all the lean_object overhead (e.g. every lean object starts with a 4 byte RC which is always 0 in olean files), as well as removing the hashes for Expr and Name, which make it much more compression-friendly. Then we layer zstd compression on top (the same compression used for compressing lean 3 files), with a custom dictionary optimized for the kind of text that shows up in oleans, which raises the compression factor to about 20x

Mario Carneiro (Jan 09 2024 at 14:45):

It has not significantly changed since the original thread, although it has had to evolve with changes to the olean format (it supports all olean format variations since it was released)

Mauricio Collares (Jan 09 2024 at 14:48):

Off-topic: When AWFY started, it contained a big "NO" on the front page because SpiderMonkey wasn't as fast as V8. The Internet Archive saved it, but their version looks really funny because they didn't save the actual performance graphs: https://web.archive.org/web/20100413010632/http://arewefastyet.com/

Sebastian Ullrich (Jan 09 2024 at 14:49):

Hah, thanks for that interesting historical note!

Pedro Sánchez Terraf (Jan 09 2024 at 14:51):

Question, in the Lean 4.3.0-rc2 (I think?) there was an 8% timing enhancement, but I seemed to see one of the curves notably increasing. Which one was it?

Sebastian Ullrich (Jan 09 2024 at 14:53):

Great observation. I'd have to check again but I believe there was some time moved from interpretation to another metric, which is expected to some degree

Emilie Uthaiwat (Jan 09 2024 at 15:45):

I apologize in advance if this is a stupid question but could someone please tell me how to download the files for the LeanFlame tool? I got the error message

bash: ./build/bin/flame: No such file or directory

Mauricio Collares (Jan 09 2024 at 16:11):

./build/bin/flame is now .lake/build/bin/flame

Mauricio Collares (Jan 09 2024 at 16:12):

If the toolchain for flame was recently updated, that is

Mauricio Collares (Jan 09 2024 at 16:13):

It wasn't, though, sorry for the noise. Did you get an error running lake build inside the Flame repo checkout?

Emilie Uthaiwat (Jan 09 2024 at 21:05):

So, I first need to checkout using the command

git clone https://github.com/hargoniX/Flame.git

Is this correct?

Emilie Uthaiwat (Jan 09 2024 at 21:17):

I didn't get any error after running

lake build

inside the Flame repository. Then, I typed the command

$ cat trace.txt | ./build/bin/flame > out.txt

and got this message:

cat: trace.txt: No such file or directory

Kevin Buzzard (Jan 09 2024 at 21:20):

I guess you're supposed to be supplying that file as the user?

Henrik Böving (Jan 09 2024 at 21:20):

As the README of the repository says, you need to record some output of your infoview with the proper lean options set and put it in that file.

Emilie Uthaiwat (Jan 09 2024 at 21:36):

The terminal is panicking:

PANIC at Flame.Position.down! Flame.Analyze:37:10: Cannot go further downn

Emilie Uthaiwat (Jan 09 2024 at 21:37):

I just copied and pasted from the Lean infoview but there was perhaps a proper way to format the .txt file.

Henrik Böving (Jan 09 2024 at 21:39):

Emilie Uthaiwat said:

I just copied and pasted from the Lean infoview but there was perhaps a proper way to format the .txt file.

whats in your file?

Emilie Uthaiwat (Jan 09 2024 at 21:40):

The file is a bit long so I'm just copying and pasting a part of it. It looks like this:

[Elab.command] [0.560049s] theorem Prod.associated_iff {M N : Type*} [Monoid M] [Monoid N] {x z : M × N} : [...] 
  [] [0.559672s] theorem associated_iff {M N : Type*} [Monoid M] [Monoid N] {x z : M × N} : [...] 
    [step] [0.029766s] expected type: Sort ?u.2842, term
        Type*
    [step] [0.026438s] expected type: Prop, term
        Associated x z  Associated x.1 z.1  Associated x.2 z.2 
      [] [0.026402s] expected type: Prop, term

Mauricio Collares (Jan 09 2024 at 21:42):

Is this with the set_option pp.oneline true option set?

Emilie Uthaiwat (Jan 09 2024 at 21:44):

Yes, I enabled it in my code.

Mauricio Collares (Jan 09 2024 at 21:46):

Emilie Uthaiwat said:

I just copied and pasted from the Lean infoview but there was perhaps a proper way to format the .txt file.

Can you try capturing the output from the command line? Using a command like lake env lean -D trace.profiler=true -D pp.oneline=true yourfile.lean > trace.txt

Mauricio Collares (Jan 09 2024 at 21:47):

This ensures the output does not get truncated, which could lead to errors like the panic you mentioned

Mauricio Collares (Jan 09 2024 at 21:47):

You can run it from VS Code's terminal if you prefer

Emilie Uthaiwat (Jan 09 2024 at 21:49):

I get:

invalid -D parameter, unknown configuration option 'trace.profiler'

Henrik Böving (Jan 09 2024 at 21:49):

What's the version that's in your lean-toolchain file in the root of your repository?

Mauricio Collares (Jan 09 2024 at 21:49):

If you do have the two set_option commands in your file, you can omit the -D parameters (that is, lake env lean yourfile.lean > trace.txt is enough)

Emilie Uthaiwat (Jan 09 2024 at 21:54):

Henrik Böving said:

What's the version that's in your lean-toolchain file in the root of your repository?

It's

leanprover/lean4:nightly-2023-02-12

Emilie Uthaiwat (Jan 09 2024 at 21:54):

Mauricio Collares said:

If you do have the two set_option commands in your file, you can omit the -D parameters (that is, lake env lean yourfile.lean > trace.txt is enough)

This didn't give me any error message!

Henrik Böving (Jan 09 2024 at 21:58):

Emilie Uthaiwat said:

Henrik Böving said:

What's the version that's in your lean-toolchain file in the root of your repository?

It's

leanprover/lean4:nightly-2023-02-12

are you sure that is your repository and not mine.

Mauricio Collares (Jan 09 2024 at 22:02):

If you didn't get an error message, you should have a trace.txt file in your project folder. You can then run cat trace.txt | ../relative/path/here/build/bin/flame > out.txt to have flame process the trace

Mauricio Collares (Jan 09 2024 at 22:03):

For example, if your project folder and the flame checkout have the same parent folder, that means cat trace.txt | ../Flame/build/bin/flame > out.txt from your project folder

Emilie Uthaiwat (Jan 09 2024 at 22:05):

Henrik Böving said:

Emilie Uthaiwat said:

Henrik Böving said:

What's the version that's in your lean-toolchain file in the root of your repository?

It's

leanprover/lean4:nightly-2023-02-12

are you sure that is your repository and not mine.

Indeed... In my repository, it's

leanprover/lean4:v4.0.0

I'm sorry.

Emilie Uthaiwat (Jan 09 2024 at 22:06):

Mauricio Collares said:

If you didn't get an error message, you should have a trace.txt file in your project folder. You can then run cat trace.txt | ../relative/path/here/build/bin/flame > out.txt to have flame process the trace

The .txt file looks odd. It starts with

Absorbing.lean:1:0: error: unknown package 'Mathlib'
Absorbing.lean:12:18: error: invalid binder annotation, type is not a class instance
  ?m.18

Mauricio Collares (Jan 09 2024 at 22:15):

The original command (the one starting with lake env), has to be run inside the project folder (i.e. you might have to use cd to change the current folder), because otherwise your project configuration (Lakefile, etc) will not be picked up.

Emilie Uthaiwat (Jan 09 2024 at 22:35):

It worked and looks great on speedscope!
Capture.PNG

Emilie Uthaiwat (Jan 09 2024 at 22:36):

Thank you for your help and patience!

Johan Commelin (Jan 10 2024 at 07:04):

It would be really cool if someone could write a blogpost/documentation on how to produce such pictures!

Henrik Böving (Jan 10 2024 at 08:09):

@Sebastian Ullrich should I write about that on my own blog or do we have some spot in our docs where this would fit already?

Sebastian Ullrich (Jan 10 2024 at 09:04):

We may have some spot for that soon but let's first talk through the improvements I have in mind

Kevin Buzzard (Jan 10 2024 at 09:25):

There's always the community blog. I for one would love to hear about what if anything is represented by the (a) length (b) depth and (c) colour of the rectangles -- I made one of these flame graphs once and it was only after making it that I realised I had absolutely no idea what I was looking at.

Henrik Böving (Jan 10 2024 at 09:33):

Just for quick reference:
a) How much time was spend in the thing that the block represents (this includes the time of all the stuff above the line so time that is "actually spent by this specific thing" is the free space above lines.
b) That's just the hierarchy of function calls or subprocesses.
c) the colour is in fact random to make distinguishing things easier

Henrik Böving (Jan 10 2024 at 09:34):

This flame graph stuff is by no means original work btw, it's by Brendan Gregg: https://brendangregg.com/flamegraphs.html who is one of the guys when it comes to performance engineering.

Mauricio Collares (Jan 10 2024 at 10:08):

Said it another way, if you "project the flame graph down" on the real line, you get, for each function call/tactic execution, a time interval corresponding to the time between when it started executing and when it finished. If a function f calls a function g, the interval for g is nested inside the interval for f in this projection, because f is certainly not done executing if g hasn't finished. That's what the second dimension is for: In this case, g will appear one line below f. So if you pick a particular vertical line x = t, the bottommost intersection with this line is the function/tactic g that's actually running at time t, the thing immediately above it is waiting for g to finish executing, and so on.

(Nitpick: Technically I am describing flame charts, but they're the same as flame graphs for our purposes)

Bolton Bailey (Jan 12 2024 at 06:21):

I just want to comment that I am a big fan of the concept of a tactic proof not restarting from the beginning when I change something. I have been working on sone rather long (~200 second) proofs where having this functionality would be very convenient.

Mario Carneiro (Jan 12 2024 at 06:29):

I have a hacky implementation of a one-tactic-at-a-time syntax here: https://leanprover.zulipchat.com/#narrow/stream/113488-general/topic/proof.20mode.3A.20split.20out.20tactics.20into.20separate.20commands/near/384324747

Emilie Uthaiwat (Jan 12 2024 at 11:51):

Capture.PNG

It seems to me that there are missing calls in my flame graph. For example, around 30,000, there are 3 stacks but the infoview indicates at least 3 calls starting with

expected type: Prop, term

Are the calls "filtered" in some way?

Sebastian Ullrich (Jan 12 2024 at 12:19):

Could you post that part of the infoview?

Emilie Uthaiwat (Jan 12 2024 at 13:36):

Capture.PNG
The calls of the form "expected type" start on the 5th line.

Sebastian Ullrich (Jan 12 2024 at 13:39):

That's very close to trace.profiler.threshold so it's likely that running your file on the cmdline simply produced different output

Emilie Uthaiwat (Jan 12 2024 at 16:56):

Alright, thank you!

Mauricio Collares (Jan 12 2024 at 17:04):

You can do set_option trace.profiler.threshold 5 and regenerate the trace, though (I think the default is 10?)

Emilie Uthaiwat (Jan 12 2024 at 17:52):

Capture.PNG
It gives me this flame graph.

Emilie Uthaiwat (Jan 12 2024 at 17:58):

Capture_infoview.PNG
It depicts more information, even though it doesn't exactly match the infoview (as can be seen from the 6th line).

Emilie Uthaiwat (Jan 12 2024 at 18:00):

There's no stack corresponding to

apply @NonAssocRing.toNonUnitalNonAssocRing to NonUnitalNonAssocRing M

between 80,000 and 100,000.


Last updated: May 02 2025 at 03:31 UTC