Zulip Chat Archive
Stream: general
Topic: Implementation of category.fold.to_list
Jannis Limperg (Apr 03 2020 at 11:02):
In category.fold
, to_list
is defined as
def to_list : t α → list α := list.reverse ∘ foldl (flip list.cons) []
where t
is an arbitrary traversable
. However, I would have expected foldr list.cons []
instead. Is there a reason to prefer the first implementation? (@Simon Hudon I believe this is your area of expertise.)
Marc Huisinga (Apr 03 2020 at 11:06):
(deleted)
Marc Huisinga (Apr 03 2020 at 11:12):
actually, scratch that, sorry.
Reid Barton (Apr 03 2020 at 14:12):
My guess is this one runs in fixed stack space in a strict language
Reid Barton (Apr 03 2020 at 14:12):
now I see it's approximately what Marc wrote and then deleted, so maybe I am wrong too :upside_down:
Simon Hudon (Apr 03 2020 at 15:10):
That was actually my reason. Why did @Marc Huisinga abandon the explanation?
Marc Huisinga (Apr 03 2020 at 15:38):
i realized that both foldl
and foldr
use fold_map
, which uses traverse
, for which e.g. the list implementation is already not tail recursive if i'm not mistaken. at that point i decided that i probably don't understand the code well enough to comment.
Simon Hudon (Apr 03 2020 at 21:50):
Yes you're right list.traverse
is not tail recursive. For the instances of traverse
that are tail recursive, we're better off using foldl
. Otherwise, I don't think there's a downside.
Marc Huisinga (Apr 03 2020 at 23:18):
when traverse
is tail recursive, why would foldr
not be? (i.e. why would it be better to use foldl
?)
Jannis Limperg (Apr 06 2020 at 13:33):
Two follow-up questions:
- Would it be a good idea to move
foldr
,foldl
,length
etc. into thetraversable
type class? Haskell seems to do this quite liberally. Upside: efficiency gains (some asymptotic); downside: some more proof obligations in thelawful
classes. - Assuming that the
foldl
implementation oflength
runs in constant stack space, I imagine it still performs two traversals instead of one. Is that generally the right tradeoff in Lean (i.e. accepting an O(n) penalty to avoid non-tail recursion)?
Gabriel Ebner (Apr 06 2020 at 13:39):
Lean 3 doesn't do any tail-call optimization as far as I can tell, so there isn't really a stack space advantage.
Jannis Limperg (Apr 06 2020 at 14:43):
So concretely, if we take foldl
and foldr
on lists:
@[simp] def foldl (f : α → β → α) : α → list β → α | a [] := a | a (b :: l) := foldl (f a b) l @[simp] def foldr (f : α → β → β) (b : β) : list α → β | [] := b | (a :: l) := f a (foldr l)
These would both use O(n) stack space and should therefore be roughly equally fast (or slow)?
Gabriel Ebner (Apr 06 2020 at 14:48):
Yup, both use O(n) stack space and I would expect the performance to be similar. If you call trace_call_stack
, then you can convince yourself that both have linear stack usage:
variables {α β : Type} @[simp] def foldl (f : α → β → α) : α → list β → α | a [] := a | a (b :: l) := foldl (f a b) l @[simp] def foldr (f : α → β → β) (b : β) : list α → β | [] := b | (a :: l) := f a (foldr l) #eval foldl (λ a b : ℕ, if b = 9 then trace_call_stack 0 else 0) 0 (list.range 10) #eval foldr (λ a b : ℕ, if a = 9 then trace_call_stack 0 else 0) 0 (list.range 10)
Simon Hudon (Apr 06 2020 at 16:26):
That's true. Somehow, my spidy sense is still telling me that it's wrong to use foldr
here because the language is eager. But it doesn't make a difference
Reid Barton (Apr 06 2020 at 16:27):
presumably it will make a difference in Lean 4?
Simon Hudon (Apr 06 2020 at 16:28):
It will
Simon Hudon (Apr 06 2020 at 16:28):
With borrow inference, I think Lean 4 can now do tail call elimination
Mario Carneiro (Apr 06 2020 at 16:29):
Why can't lean 3 do TCO?
Mario Carneiro (Apr 06 2020 at 16:29):
I mean, there isn't anything stopping us from implementing it
Simon Hudon (Apr 06 2020 at 16:30):
As far as I can tell, it's because of the scoping of reference counting variables. Functions need to return to the calling context so that that one can decrease the ref count of its variables before it returns
Mario Carneiro (Apr 06 2020 at 17:31):
If you are doing a tail call, then you do that refcounting cleanup at the site of the tail-call (causing any variables that have refcount 1 to be dropped unless the caller is taking ownership)
Simon Hudon (Apr 06 2020 at 17:33):
Yes, that sounds like the right way to do it
Last updated: Dec 20 2023 at 11:08 UTC