Zulip Chat Archive
Stream: lean4
Topic: Performance of terms after erasure
James Gallicchio (Jul 10 2022 at 17:13):
I'm trying to do something a bit weird and implement a List.fold'
(that exposes proof of membership) in terms of just fold
; I am using some trickery to push off the proof obligations until outside the fold (where they can actually be dispatched) but it tanks the performance of the code
structure Partial (α : Type u) where
dom : Prop
get : dom → α
def fold' (l : List τ) (f : β → (x : τ) → x ∈ l → β) (acc : β) : β :=
let temp := List.foldl
(fun ⟨dom,acc⟩ x => ⟨x ∈ l ∧ dom, λ ⟨h_x, h⟩ => f (acc h) x h_x⟩)
(⟨True, λ _ => acc⟩ : Partial β)
l
temp.get (by
have : temp.dom = List.foldl (fun acc x => x ∈ l ∧ acc) True l := by
sorry
rw [this]
sorry
)
James Gallicchio (Jul 10 2022 at 17:15):
I was hoping that the Partial
gets erased to just () → α
, and that the fold closure fun ⟨dom,acc⟩ x => ⟨x ∈ l ∧ dom, λ ⟨h_x, h⟩ => f (acc h) x h_x⟩
would be simplified to fun acc x () => f (acc ()) x
, does that all seem accurate? not entirely sure how to decifer the generated C code
James Gallicchio (Jul 10 2022 at 17:18):
Alternatively, does anyone have a better idea how to implement fold'
in terms of foldl
that at runtime just makes a single call to foldl
?
Yakov Pechersky (Jul 10 2022 at 17:42):
Take a look at how lean3 does it for docs#list.pmap and docs#list.attach
Wojciech Nawrocki (Jul 10 2022 at 17:53):
Not really an answer, but you might find this thread relevant.
James Gallicchio (Jul 10 2022 at 18:07):
ah, that is relevant. hm
James Gallicchio (Jul 10 2022 at 18:30):
So terms like fun x h => f x (by blah)
are not simplified to f
James Gallicchio (Jul 10 2022 at 18:31):
I will play around with this a bit more, thanks for the links :)
Wojciech Nawrocki (Jul 10 2022 at 19:21):
Are h
and (by blah)
supposed to prove the same proposition, or two definitionally equal ones? If so, then I guess you could simplify it to f
by proof irrelevance. But if not, it wouldn't be type-correct. (Or you mean in the compiler? It does erase proofs but not always, I think, so maybe this is one of the non-erased cases.)
Sebastian Ullrich (Jul 10 2022 at 19:30):
Proofs should always be erased
Sebastian Ullrich (Jul 10 2022 at 19:43):
Half of the performance difference is due to a missing @[specialize]
. After that, the issue is probably that () → α
is still a closure that is composed and then invoked only after the fold.
James Gallicchio (Jul 16 2022 at 19:24):
Ah, thank you for the tip wrt specialize
. But the cost of the closures was what I expected to be quite high -- I still can't figure out a way to avoid building up the list (or an equivalent stack of closures) while defining fold'
. Duplicating the with-proof and without proof versions is annoying but a fine workaround
Kyle Miller (Jul 16 2022 at 19:35):
If I understand your test correctly, you're mostly comparing the compiled performance of List.fold
against the VM performance of your List.fold'
. Maybe try compiling your version and comparing against that? (For example, by having an actual main
function with your test and building it.)
James Gallicchio (Jul 16 2022 at 19:36):
Wojciech Nawrocki said:
Are
h
and(by blah)
supposed to prove the same proposition, or two definitionally equal ones? If so, then I guess you could simplify it tof
by proof irrelevance. But if not, it wouldn't be type-correct. (Or you mean in the compiler? It does erase proofs but not always, I think, so maybe this is one of the non-erased cases.)
No, I mean in the non-type-correct sense. Which is why it'd have to be erased & simplified after typechecking. And of course it's not always safe to do that simplification, right, since fun (h : False) => False.elim h
can't be simplified toFalse.elim
since there is no actual code to generate for False.elim
.
So I'm not really sure what I'd even envision doing to get this function to be equivalent to a single fold
(aside from the optimization mentioned in the other thread w.r.t. identifying functions that are identity after proof erasure). Definitely an engineering issue not a Lean compiler issue.
James Gallicchio (Jul 16 2022 at 19:40):
I wonder if there's some way to "seal" some unsafe post-erasure simplifications in a surrounding closure to ensure that proof obligations were met before code gets executed. Like, I think it is safe to simplify
fun () => (fun (h : P) => x)
to
fun () => x
when erasing proof terms, since the exterior function's body will only be executed if its application typechecks.
James Gallicchio (Jul 16 2022 at 19:42):
Kyle Miller said:
If I understand your test correctly, you're mostly comparing the compiled performance of
List.fold
against the VM performance of yourList.fold'
. Maybe try compiling your version and comparing against that? (For example, by having an actualmain
function with your test and building it.)
sorry, missed your message. isn't this comparing the VM performance of both? the test is running both in an #eval, but does the List.foldl
call get exported to the compiled function?
Wojciech Nawrocki (Jul 16 2022 at 20:56):
James Gallicchio said:
Kyle Miller said:
If I understand your test correctly, you're mostly comparing the compiled performance of
List.fold
against the VM performance of yourList.fold'
. Maybe try compiling your version and comparing against that? (For example, by having an actualmain
function with your test and building it.)sorry, missed your message. isn't this comparing the VM performance of both? the test is running both in an #eval, but does the
List.foldl
call get exported to the compiled function?
Oh yes, this is important. List.foldl
is part of stdlib so it gets compiled into liblean
and will be executed in compiled form, whereas your fold'
is interpreted. You could try writing the benchmark as a def main : IO Unit
and running the compiled executable.
Wojciech Nawrocki (Jul 16 2022 at 20:58):
(Note: the calls into Nvm, because of specialization it seems like it might all be interpreted. You can introspect the code with List.foldl
from fold'
will also be to the compiled code.)set_option trace.compiler.ir true in def fold' ..
.
Wojciech Nawrocki (Jul 16 2022 at 21:04):
Oh, and it looks like the List.foldl
in the timeit
is also specialized, thus interpreted. So the comparison might be fair after all? Sorry about the confusion!
James Gallicchio (Jul 17 2022 at 01:05):
Wait, specialization --> #eval interprets instead of calling out to compiled code? wacky. I'll try out making it main in a sec
James Gallicchio (Jul 17 2022 at 01:14):
Okay, with the fold specialized and running it fully compiled, the timings are
fold 2.02ms
fold' 20.6ms
which, yeah, about as expected
James Gallicchio (Jul 17 2022 at 01:44):
Okay, implementing it directly gives exactly the same performance as regular fold
@[specialize]
def fold' (l : List τ) (f : β → (x : τ) → x ∈ l → β) (acc : β) : β :=
let rec @[specialize] go (rest : List τ) (acc : β) (h : ∀ x, x ∈ rest → x ∈ l) : β :=
match rest with
| [] => acc
| x::xs => go xs (f acc x (h x (List.Mem.head _ _)))
(by intros x hxs; exact h _ (List.Mem.tail _ hxs))
go l acc (by intros; trivial)
(And the generated IR is identical to foldl, albeit with a couple extra unused parameters that aren't being erased)
Mario Carneiro (Jul 17 2022 at 04:13):
James Gallicchio said:
Wait, specialization --> #eval interprets instead of calling out to compiled code? wacky. I'll try out making it main in a sec
cc: @Leonardo de Moura ^ this is an interesting interaction of features
Gabriel Ebner (Jul 17 2022 at 06:15):
And it's unfortunately a well-known issue: https://leanprover.zulipchat.com/#narrow/stream/270676-lean4/topic/include.20lean.2Eh/near/264881145
Sebastian Ullrich (Jul 17 2022 at 08:11):
I don't see what we could do about it. Specialization generates new code, yes. Fortunately, it is now easier to run compiled code than ever.
Sebastian Ullrich (Jul 17 2022 at 08:12):
That is, you could set precompileModules
in your lakefile and then move the #eval
s to a different file. No need to leave the editor anymore.
Mac (Jul 17 2022 at 08:43):
Sebastian Ullrich said:
That is, you could set
precompileModules
in your lakefile and then move the#eval
s to a different file. No need to leave the editor anymore.
Note that this still won't work if it is the #eval
code that is doing the specializing.
Mario Carneiro (Jul 17 2022 at 13:47):
Yeah, the problem here is specifically the fact that specialization only uses lemmas that have been generated in the current file, which means that they will move along with the #eval
Last updated: Dec 20 2023 at 11:08 UTC