Zulip Chat Archive

Stream: lean4

Topic: New compiler performance regression

Matej Penciak (Oct 20 2025 at 14:46):

It looks like the new compiler is hanging on a weird expression we're generating by elaborating a DSL we have. Here's a minimized example demonstrating the issue (it could probably be minimized further, but I'm running out of ideas how):

import Lean

open Lean

inductive Val (rep : Type) : Type where
| unit : Val rep
| array : (v : Vector rep n) → Val rep
| letIn : Val rep → (rep → Val rep) → Val rep
| call : (args : List rep) → Val rep

structure Function where
  body : List rep → Val rep

open Elab Command

abbrev DSLM := StateT Nat CommandElabM

def DSLM.run  (a : DSLM α) : CommandElabM α :=
  StateT.run' a 0

def nameOf : DSLM Lean.Ident :=
  modifyGet fun s =>
    (mkIdent (.mkSimple s!"#v_{s}"), s + 1)

def wrapInLet (e : TSyntax `term) (k : Option $ TSyntax `term → DSLM (TSyntax `term))
    : DSLM (TSyntax `term) := do
  let ident ← nameOf
  match k with
  | some k => do
    let rest ← k ident
    ``(Val.letIn $e fun $ident => $rest)
  | none => do
    pure e

def makeBlock (fuel : Nat) (k : Option $ TSyntax `term → DSLM (TSyntax `term))
  : DSLM (TSyntax `term) := do
  match fuel with
  | n + 1 => wrapInLet (←``(Val.unit)) $ some fun _ => makeBlock n k
  | 0 =>
    let numBinders ← get
    let names : Array $ TSyntax `term :=
      Array.range numBinders |>.map (fun i => ⟨mkIdent (.mkSimple s!"#v_{i}")⟩)
    let args ← `([$names,*])
    wrapInLet (←``(Val.call $args)) k

def makeWeirdExpr (fuel : Nat) : DSLM (TSyntax `term) := do
  makeBlock fuel none

set_option trace.profiler true

elab "#foo" : command => do
  let body ← DSLM.run $ makeWeirdExpr 150
  let lambda ← `(fun args => match args with | _ => $body)
  let decl ← `(def $(mkIdent `foo) : Function := Function.mk $lambda)
  elabCommand decl

#foo

#print foo

In 4.21 the command #foo was pretty much instant, but in toolchains >4.21 this particular command hangs for a significant amount of time, and if you increase the fuel in the elab definition further it will eventually fail to compile.

The output of the profiler seems to suggest the vast majority of the time spent in elaborating + compilation is in the dead code elimination pass, but I'm not sure if that's a red herring.

I'm happy to open an issue if this is something that should be looked at closer.

Henrik Böving (Oct 20 2025 at 16:16):

It's not stuck in the general dead code elimination pass but in some slightly smarter component that I can very much believe to be slow. Please do make an issue about it

Henrik Böving (Oct 20 2025 at 16:20):

https://share.firefox.dev/4qj6E4R slowness is in the widening of the analysis

Henrik Böving (Oct 20 2025 at 18:05):

The reason for the slowness is as follows: Your code generates a ton of nested lambdas with local variables which our analysis takes a look at. In particular this analysis uses a widening operator to speed up convergence for global function definition arguments but not for local function arguments etc. Crucially in the old compiler when elim dead branches ran we already did run lambda lifting so the widening operator speeds up convergence quite a lot. On the contrary in the new compiler we have not yet run lambda lifting and thus track the local function arguments in full precision.

To at least get similar behavior I'll try to teach the new compiler to just use widening for local function arguments as well.

Matej Penciak (Oct 20 2025 at 19:36):

Thanks for the explanation! I guess profiling of elimDeadBranches was indeed a bit of a red herring, but I'm glad it helped focus on the real issue.

I filed the issue at lean4#10857

Henrik Böving (Oct 20 2025 at 20:03):

No the issue is precisely in elimDeadBranches

Henrik Böving (Oct 20 2025 at 20:05):

It's just that elim dead branches is more than a syntactic dead code eliminator, that lives in a different component

Last updated: Feb 28 2026 at 14:05 UTC