Zulip Chat Archive

Stream: lean4

Topic: Lean compiler/elaborator development question


Tom (Sep 26 2024 at 16:29):

While perusing the Lean source code, I noticed that the array syntax arr[i]! is used quite a bit. It seems it's used in for loops, things like x.size.forM loops etc.
I thought that was unfortunate given Lean is a safe language and wanted to try to do something about it. I started by implementing something like

Data/Array/Iterate.lean

namespace Array
def indices {α : Type u} (a : Array α)  :...
...

which can subsequently be called as (e.g. in a test I added)

def sum (arr: Array Nat) : Id Nat := do
  let mut sum := 0
  for ii in arr.indices do    -- This now produces a sequence of `Fin arr.size`
    sum := sum + arr[ii]     -- ... so no []! required!
  pure sum

def x := #[1, 2, 3, 4, 5]

/-- info: 15 -/
#guard_msgs in
#eval sum x

Would something like this be useful/entertained if I turn it into a fully fledged PR? I have some ideas to make it fully fledged:

  • Make it generic via a type class, so e.g. both Array and List can use it
  • Support different start and end indices
  • Support co-iterating multiple arrays

As a separate question, I am still trying to figure out where I can actually use it - I'm guessing how Lean is bootstrapped. I read the "bootstrapping" document and rebuilt stage0 but I still must be misunderstanding something. For example, my new function can be used in Elab/GuessLex.lean to rewrite this

def formatTable : Array (Array String)  String := fun xss => Id.run do
  let mut colWidths := xss[0]!.map (fun _ => 0)
  for hi : i in [:xss.size] do
    for hj : j in [:xss[i].size] do
      if xss[i][j].length > colWidths[j]! then
        colWidths := colWidths.set! j xss[i][j].length

to this

  for i in xss.indices do
    for j in xss[i].indices do
      if xss[i][j].length > colWidths[j]! then
        colWidths := colWidths.set! j xss[i][j].length

but I don't seem to be able to do the same thing in e.g. Meta/FunInfo.lean. Any advice?

Kyle Miller (Sep 26 2024 at 18:02):

for i in indices arr do (or, generically, for i in indices arr do) could be interesting. It might need to go through an RFC process after some discussion here. It'd be good to validate that using indices doesn't incur any runtime overhead. How did you implement it?

Note that currently you can write

def sum (arr: Array Nat) : Id Nat := do
  let mut sum := 0
  for hi : i in [0:arr.size] do
    sum := sum + arr[i]
  pure sum

and of course for such a simple example you can write

def sum (arr: Array Nat) : Id Nat := do
  let mut sum := 0
  for x in arr do
    sum := sum + x
  pure sum

but I know that's not the point.

Tom (Sep 26 2024 at 18:43):

Hi @Kyle Miller

Thanks for the feedback. I am open to a discussion/RFC because I have a few ideas in this area but not sure how many of them should be in the Compiler/Elaborator vs e.g. in a/some library. I also didn't want to just introduce "code churn" by refactoring the Lean codebase because I imagine that may not be well received.

I created a draft PR against my own fork of Lean, in order not to clutter the main repo PR list: https://github.com/TomasPuverle/lean4/pull/1.

In short, the implementation is based on Data.Range's ForIter and comes out of my earlier experiments of trying to generalize Range itself to support other types than just Nat (i.e. turning the current Range into structure Range (t : Type)). It also suffers from the same "bug" with step that I reported last night.

I considered implementing ForIter in terms of the for hi: i in [...] but don't have a good grasp of the overheads involved/performance implications of that decision.

I am aware of the other two forms of for ... in, yes.

I briefly mentioned some additional ideas for improvements in the original message but as I said, I'd love to come up with an "extensible core" where you and others can help guide what should and should be part of the Lean package.

Tom (Sep 26 2024 at 18:46):

As a general observation though, do you know why Fin is not used more?

There are many other places where it seems "like the right thing" - e.g. even zipWithIndex for Array returns (a x Nat) rather than (a x Fin x.size). I'm guessing there are good reasons for this, so I would appreciate any insights to help me understand!

Tom (Sep 30 2024 at 22:39):

Another thought that just occurred to me is that since [] is just macro, perhaps the return type of Range could be Fin x, where x is the expression supplied for end. In which case it would "just work", and ince Fin coes to Nat, perhaps it wouldn't be too backward-incompatible?

Mario Carneiro (Sep 30 2024 at 22:48):

nope, that's very backward incompatible.

def foo : Nat := 1
def foo' : Fin 3 := 1

#eval foo + 5 -- 6
#eval foo' + 5 -- 0

Mario Carneiro (Sep 30 2024 at 22:49):

it's not that unusual to be doing arithmetic with the numbers yielded by a simple for loop of the form for i in [a:b] do

Mario Carneiro (Sep 30 2024 at 22:51):

I also don't think it's what you want most of the time, unless it is specifically a loop for i in [0:arr.size] do because you want to index into an array

Mario Carneiro (Sep 30 2024 at 22:51):

and for that you can use another iterator wrapper

Mario Carneiro (Sep 30 2024 at 22:52):

and IIRC for _h : i in [0:arr.size] is enough to make arr[i] indexing work

Tom (Oct 01 2024 at 04:06):

Good point about Fin!

I am (now) aware of the for h: but it was not obvious when I was just getting started.

As an aside, because I am not sure if that's the reason but perhaps because it's "longer" to type, it seems even the Lean source code itself still falls back on []! quite a bit. I am not sure if that's an indication of the fact the the syntax should be "the other way around".

Mario Carneiro (Oct 01 2024 at 08:58):

For a long time the automation associated with array indexing was ~nonexistent, so it might be worthwhile to see how many of those !'s can be removed without further code changes

Tom (Oct 01 2024 at 15:18):

FWIW, I did a bunch of cleanup late last night: PR

Joachim Breitner (Oct 01 2024 at 15:42):

Thanks for that, I merged that one because it’s just a nice feeling that the program will not do an array bounds check there anymore, even though it likely doesn’t matter on these code paths.

In general, though, I'd ask external contributors to not be too eager with “cleanup PRs”. Often they have very very little impact (e.g. no improved user experience), and there is a tangible cost in terms of reviewing them, of noisy git blame and possible conflicts with existing PRs.

Tom (Oct 01 2024 at 15:45):

Thanks @Joachim Breitner ! I will make sure to check in the future.


Last updated: May 02 2025 at 03:31 UTC