Zulip Chat Archive

Stream: lean4

Topic: constructor reuse across types

Scott Kovach (Jan 20 2024 at 03:30):

Am I correct that the compiler doesn't reuse constructors if the allocation is for a different type?
here's an experiment on v4.5.0-rc1 with different list types:

inductive L1 | nil | cons : Nat → L1 → L1
inductive L2 | nil | cons : Nat → L2 → L2

def List.toL1 : List Nat → L1
| .nil => .nil
| .cons x xs => .cons x xs.toL1

set_option trace.compiler.ir.result true
def map1 : L1 → L1 :=
  let rec go1
  | x, .nil => x
  | x, .cons y xs => go1 (.cons y x) xs
  go1 .nil

def map2 : L1 → L2 :=
  let rec go2
  | x, .nil => x
  | x, .cons y xs => go2 (.cons y x) xs
  go2 .nil

you can see that the second one does not have a branch that reuses the constructor.

This is in the context of trying to make an FBIP RBNode traversal as in the Perceus paper; I found a hack to make it work without allocating, but preferably it would reuse constructors between the tree and the zipper.

if this is a desired feature, could anyone give a pointer towards how to get started implementing it?

Henrik Böving (Jan 20 2024 at 09:49):

The current implementation does not do this AFAIK. There are papers that explain how to do this and expand further on the current FBIP implementation by other groups.

That said the code generator/optimizer is currently not really open to feature or bug fixes unless really necessary, at least that was the policy so far. There is a bigger rewrite that will bring lots of improvements and bug fixes lined up in the future though.

Kyle Miller (Jan 21 2024 at 03:04):

Scott and I spent some time poking around the compiler and we found where the restriction occurs: https://github.com/leanprover/lean4/blob/master/src/Lean/Compiler/IR/ResetReuse.lean#L28

private def mayReuse (c₁ c₂ : CtorInfo) : Bool :=
  c₁.size == c₂.size && c₁.usize == c₂.usize && c₁.ssize == c₂.ssize &&
  /- The following condition is a heuristic.
     We don't want to reuse cells from different types even when they are compatible
     because it produces counterintuitive behavior. -/
  c₁.name.getPrefix == c₂.name.getPrefix

My understanding of the way reset/reuse insertion works is that

It looks for objects that are cased on.
Then it looks for the first constructor in each case arm that satisfies mayReuse, and this constructor is the one that can try to reuse the cased-on object. There are some considerations for variable liveness here too.

Except for the liveness computations, it's making a local decision about whether a given constructor is an ok candidate. Likely, if code is doing case analysis on a constructor, and a constructor of the same type is in a case arm, and it's no longer alive from this point onward, then that constructor is the one that can be reused, if any can be reused at all. And the reset/reuse insertion can only choose one candidate.

Joachim Breitner (Jan 21 2024 at 13:01):

I guess the “counterintuitive behavor” claim was refuted in this thread, where some intuitively assumed it would be re-used, and even have a use-case for it. So should could this just be dropped?
Or is there other surprising behavior than maybe “reused when not expected”?

Henrik Böving (Jan 21 2024 at 13:03):

I think one thing to be considered here is that minimal layout changes in a type can have drastic performance differences this way which would be a little counter intuitive to me without knowing about this precise behavior.

Last updated: May 02 2025 at 03:31 UTC