Zulip Chat Archive

Stream: new members

Topic: Stack overflow for long lists/arrays?


Ioannis Konstantoulas (Jan 15 2023 at 10:52):

Hello all; I want to create a list or array representing the integer interval [0,n) for large n. I tried the naive way:

def upto : Nat -> List Nat
  | 0 => []
  | Nat.succ k => upto k ++ [k]

but when I eval this with a large argument, I get a stack overflow and the Lean server crashes in VSCode.
For instance, #eval upto 10 correctly gives me the list I need, but if I #eval upto 100000, it crashes. The same is true if I replace List with Array in the code above.

Thanks!

Eric Wieser (Jan 15 2023 at 11:02):

Can you try #eval (upto 10).length instead?

Eric Wieser (Jan 15 2023 at 11:02):

It might be that it's the printing that is stack overflowing

Mario Carneiro (Jan 15 2023 at 11:06):

That function is not tail recursive, so this is expected behavior. (Even if it was you should watch out for printing code overflowing as Eric says.)

Ioannis Konstantoulas (Jan 15 2023 at 11:09):

Eric Wieser said:

Can you try #eval (upto 10).length instead?

Yes, the overflow persists when I change 10 to 100000. As Mario said, the function is not tail recursive as written, but I am not sure how to make it so.

Mario Carneiro (Jan 15 2023 at 11:10):

Here's a version that works:

def Array.upto (n : Nat) : Array Nat := go n 0 #[] where
  go
  | 0,   _, arr => arr
  | n+1, i, arr => go n (i+1) (arr.push i)

#eval (Array.upto 100000).size

Eric is right that this crashes if you leave off .size

Mario Carneiro (Jan 15 2023 at 11:10):

you can also do it with a for loop:

def Array.upto (n : Nat) : Array Nat := Id.run do
  let mut arr := #[]
  for i in [0:n] do
    arr := arr.push i
  pure arr

#eval (Array.upto 100000).size

Ioannis Konstantoulas (Jan 15 2023 at 11:15):

Mario Carneiro

Thank you very much! So, the function I wrote is not tail-recursive because it is mixing the ++ constructor and the recursive function call?

Mario Carneiro (Jan 15 2023 at 11:17):

the recursive function call needs to be the last thing your function does. So foo (n+1) := foo (bar n) is okay but foo (n+1) := bar (foo n) is not

Ioannis Konstantoulas (Jan 15 2023 at 11:18):

I see; thank you for this clarification!

Mario Carneiro (Jan 15 2023 at 11:19):

the reason is that if you call a function in tail position you can basically forget that you were in the context of a call to foo in the first place and just evaluate foo with the result of bar n, whereas in the bar (foo n) example you have to evaluate foo n while remembering that we are still pending a call to bar, and those calls to bar can pile up (that's what the call stack is storing)

Eric Wieser (Jan 15 2023 at 13:24):

Does (toString (repr (upTo 10000))).size also crash? Is there some recursion that needs to be turned into tail recursion in core somewhere?

Mario Carneiro (Jan 15 2023 at 13:28):

last I checked it was the format instance for lists

Mario Carneiro (Jan 15 2023 at 13:32):

Yes, that crashes. toString is the one that overflows

Mario Carneiro (Jan 15 2023 at 13:33):

which means it's probably the formatter itself. It is probably pretty difficult to make that one tail recursive, it already has some CPS stuff going on

Mario Carneiro (Jan 15 2023 at 13:37):

The result of repr on a large array/list looks like this:

deriving instance Repr for Std.Format.FlattenBehavior
deriving instance Repr for Std.Format
#eval repr (Array.mkArray 4 2)
Std.Format.group
  (Std.Format.nest
    2
    (Std.Format.append
      (Std.Format.append
        (Std.Format.text "#[")
        (Std.Format.append
          (Std.Format.append
            (Std.Format.append
              (Std.Format.append
                (Std.Format.append
                  (Std.Format.append (Std.Format.text "2") (Std.Format.append (Std.Format.text ",") (Std.Format.line)))
                  (Std.Format.text "2"))
                (Std.Format.append (Std.Format.text ",") (Std.Format.line)))
              (Std.Format.text "2"))
            (Std.Format.append (Std.Format.text ",") (Std.Format.line)))
          (Std.Format.text "2")))
      (Std.Format.text "]")))
  (Std.Format.FlattenBehavior.fill)

(imagine the tower of appends in the middle is repeated n times instead of just 4)

Mario Carneiro (Jan 15 2023 at 13:38):

It's pretty hard to process tree structures like this in tail-recursive style without an explicit stack

Eric Wieser (Jan 15 2023 at 13:50):

Do you think having a Format.sequence [a, b, c] constructor instead of docs4#Std.Format.append would help with that?

Mario Carneiro (Jan 15 2023 at 13:57):

Unclear. I haven't been able to pinpoint the actual culprit function - the formatter core is actually based on an explicit stack, so unless TCO is not working on it (which I think is unlikely) it is probably not responsible for the issue.

Mario Carneiro (Jan 15 2023 at 14:02):

debugger says the culprit is src4#Std.Format.spaceUpToLine

Eric Wieser (Jan 15 2023 at 14:04):

(deleted)

Mario Carneiro (Jan 15 2023 at 14:21):

One simple fix that makes the formatter work here is:

private partial def spaceUptoLine : Format  Bool  Int  Nat  SpaceResult
  | nil,          _,       _, _ => {}
  | line,         flatten, _, _ => if flatten then { space := 1 } else { foundLine := true }
  | align force,  flatten, m, w =>
    if flatten && !force then {}
    else if w < m then
      { space := (m - w).toNat }
    else
      { foundLine := true }
  | text s,       flatten, _, _ =>
    let p := s.posOf '\n'
    let off := s.offsetOfPos p
    { foundLine := p != s.endPos, foundFlattenedHardLine := flatten && p != s.endPos, space := off }
  | append (append f₁ f₂) f₃, flatten, m, w => spaceUptoLine (append f₁ (append f₂ f₃)) flatten m w -- NEW
  | append f₁ f₂, flatten, m, w => merge w (spaceUptoLine f₁ flatten m w) (spaceUptoLine f₂ flatten m)
  | nest n f,     flatten, m, w => spaceUptoLine f flatten (m - n) w
  | group f _,    _,       m, w => spaceUptoLine f true m w
  | tag _ f,      flatten, m, w => spaceUptoLine f flatten m w

Eric Wieser (Jan 15 2023 at 14:23):

What's the diff there?

Mario Carneiro (Jan 15 2023 at 14:23):

the line marked -- NEW

Eric Wieser (Jan 15 2023 at 14:24):

Ah, that was cut off on my screen

Mario Carneiro (Jan 15 2023 at 14:24):

also the partial at the front (which could be fixed by some minor refactoring of the signature)

Junyan Xu (Jan 15 2023 at 20:07):

upto is equal to docs4#List.range, right? How does this built-in function behave?

Damiano Testa (Jan 15 2023 at 22:08):

Also, do docs4#List.iota and docs4#List.iotaTR exist? And how do they fit here? They produce the list in reverse order and count to 1, but they may be good replacements, right?


Last updated: Dec 20 2023 at 11:08 UTC