Zulip Chat Archive

Stream: general

Topic: Implementation of category.fold.to_list


Jannis Limperg (Apr 03 2020 at 11:02):

In category.fold, to_list is defined as

def to_list : t α  list α :=
list.reverse  foldl (flip list.cons) []

where t is an arbitrary traversable. However, I would have expected foldr list.cons [] instead. Is there a reason to prefer the first implementation? (@Simon Hudon I believe this is your area of expertise.)

Marc Huisinga (Apr 03 2020 at 11:06):

(deleted)

Marc Huisinga (Apr 03 2020 at 11:12):

actually, scratch that, sorry.

Reid Barton (Apr 03 2020 at 14:12):

My guess is this one runs in fixed stack space in a strict language

Reid Barton (Apr 03 2020 at 14:12):

now I see it's approximately what Marc wrote and then deleted, so maybe I am wrong too :upside_down:

Simon Hudon (Apr 03 2020 at 15:10):

That was actually my reason. Why did @Marc Huisinga abandon the explanation?

Marc Huisinga (Apr 03 2020 at 15:38):

i realized that both foldl and foldr use fold_map, which uses traverse, for which e.g. the list implementation is already not tail recursive if i'm not mistaken. at that point i decided that i probably don't understand the code well enough to comment.

Simon Hudon (Apr 03 2020 at 21:50):

Yes you're right list.traverse is not tail recursive. For the instances of traverse that are tail recursive, we're better off using foldl. Otherwise, I don't think there's a downside.

Marc Huisinga (Apr 03 2020 at 23:18):

when traverse is tail recursive, why would foldr not be? (i.e. why would it be better to use foldl?)

Jannis Limperg (Apr 06 2020 at 13:33):

Two follow-up questions:

  1. Would it be a good idea to move foldr, foldl, length etc. into the traversable type class? Haskell seems to do this quite liberally. Upside: efficiency gains (some asymptotic); downside: some more proof obligations in the lawful classes.
  2. Assuming that the foldl implementation of length runs in constant stack space, I imagine it still performs two traversals instead of one. Is that generally the right tradeoff in Lean (i.e. accepting an O(n) penalty to avoid non-tail recursion)?

Gabriel Ebner (Apr 06 2020 at 13:39):

Lean 3 doesn't do any tail-call optimization as far as I can tell, so there isn't really a stack space advantage.

Jannis Limperg (Apr 06 2020 at 14:43):

So concretely, if we take foldl and foldr on lists:

@[simp] def foldl (f : α  β  α) : α  list β  α
| a []       := a
| a (b :: l) := foldl (f a b) l

@[simp] def foldr (f : α  β  β) (b : β) : list α  β
| []       := b
| (a :: l) := f a (foldr l)

These would both use O(n) stack space and should therefore be roughly equally fast (or slow)?

Gabriel Ebner (Apr 06 2020 at 14:48):

Yup, both use O(n) stack space and I would expect the performance to be similar. If you call trace_call_stack, then you can convince yourself that both have linear stack usage:

variables {α β : Type}

@[simp] def foldl (f : α  β  α) : α  list β  α
| a []       := a
| a (b :: l) := foldl (f a b) l

@[simp] def foldr (f : α  β  β) (b : β) : list α  β
| []       := b
| (a :: l) := f a (foldr l)

#eval foldl (λ a b : , if b = 9 then trace_call_stack 0 else 0) 0 (list.range 10)
#eval foldr (λ a b : , if a = 9 then trace_call_stack 0 else 0) 0 (list.range 10)

Simon Hudon (Apr 06 2020 at 16:26):

That's true. Somehow, my spidy sense is still telling me that it's wrong to use foldr here because the language is eager. But it doesn't make a difference

Reid Barton (Apr 06 2020 at 16:27):

presumably it will make a difference in Lean 4?

Simon Hudon (Apr 06 2020 at 16:28):

It will

Simon Hudon (Apr 06 2020 at 16:28):

With borrow inference, I think Lean 4 can now do tail call elimination

Mario Carneiro (Apr 06 2020 at 16:29):

Why can't lean 3 do TCO?

Mario Carneiro (Apr 06 2020 at 16:29):

I mean, there isn't anything stopping us from implementing it

Simon Hudon (Apr 06 2020 at 16:30):

As far as I can tell, it's because of the scoping of reference counting variables. Functions need to return to the calling context so that that one can decrease the ref count of its variables before it returns

Mario Carneiro (Apr 06 2020 at 17:31):

If you are doing a tail call, then you do that refcounting cleanup at the site of the tail-call (causing any variables that have refcount 1 to be dropped unless the caller is taking ownership)

Simon Hudon (Apr 06 2020 at 17:33):

Yes, that sounds like the right way to do it


Last updated: Dec 20 2023 at 11:08 UTC