Zulip Chat Archive
Stream: lean4
Topic: new compiler doesn't `always_inline`
Alex Meiburg (Jul 04 2025 at 17:24):
I'm having a hard time getting a mwe, but it seems as though the new compiler doesn't properly respect the always_inline attribute. For a particular function, I get the following compiled output:
[saveBase] size: 20
def IsComputable._example._nativeDecide_1 : Bool :=
let _x.1 := instDecidableTrue;
let _x.2 := Real.instInv;
let _x.3 := instComputableOfNat1;
let _x.4 := _x.2 # 0;
let _x.5 := 100;
let _x.6 := Nat.cast._at_.IsComputable._example._nativeDecide_1.spec_0 _x.5;
let _x.7 := _x.4 _x.6;
let _x.8 := Real.one.0;
let _x.9 := instComputableOfNatAtLeastTwo _x.5 ◾;
let _x.10 := instComputableInv _x.6 _x.9;
let _x.11 := @instDecidableLT _x.7 _x.8 _x.10 _x.3;
cases _x.11 : Bool
| Decidable.isFalse h.12 =>
let _x.13 := false;
return _x.13
| Decidable.isTrue h.14 =>
cases _x.1 : Bool
| Decidable.isFalse h.15 =>
let _x.16 := false;
return _x.16
| Decidable.isTrue h.17 =>
let _x.18 := true;
return _x.18
Here @instDecidableLT is appearing in the output even though I have
@[always_inline]
instance instDecidableLT ....
earlier in the file.
Alex Meiburg (Jul 04 2025 at 17:25):
This cropped up because I'm getting the result to be noncomputable, but it will become computable when it realizes that certain arguments can be dropped; but it's not dropping those arguments because it's not inlining enough.
Alternately, it's not switching to the _redArg versions of functions as much as it could be, which would also work of course.
Cameron Zwarich (Jul 04 2025 at 17:43):
Do you have a non-minimal WE you could share? Size doesn’t really matter for something like this.
Alex Meiburg (Jul 05 2025 at 22:50):
Alright, here's a somewhat minimized version. (It was from a larger project.)
not_really_mwe.lean
Cameron Zwarich (Jul 06 2025 at 03:00):
Thanks! I think I understand why this is happening now. The compiler has two basic phases: base and mono. In the new compiler, we don't inline plain instance-producing calls (as opposed to projections from instances) in the base phase, only the mono phase. However, the noncomputable check occurs at the boundary between the two phases, and this unfortunate combination defeats your attempt of replacing noncomputable things with computable things via type classes.
I guess we have 3 options here:
- Allow inlining for all instances in the base phase.
- Allow inlining for
Decidableinstances in the base phase (which at least seems motivated by the idea of inlining them before thenoncomputablecheck). - Use a simple relevance/dependency analysis for the
noncomputablecheck, which would detect thatxandyaren't even used by the instance and thus can't contribute to its user beingnoncomputable. There are other motivations for this idea, but your test case provides yet another good one.
I'm curious about the impact of trying 1 or 2, so I'll probably run benchmarks to see what the fallout is (if it doesn't somehow cause some test failures), but without knowing more I think only 2 would be considered in practice. I am pretty sure 3 is coming eventually.
Cameron Zwarich (Jul 06 2025 at 05:44):
I thought of a way to do a mini version of 3 (using information already computed by an existing pass), implemented it, and it seems to be working. I’ll try to clean it up and land it tomorrow.
Alex Meiburg (Jul 06 2025 at 15:24):
Cool, okay! I see the _redArg replacement happening, which I assume is for replacing functions with a reduced-argument version when some of the arguments are unused. Is that the "already computed by an existing pass"? Is there a reason why the instance isn't replaced by a _redArg version here?
Cameron Zwarich (Jul 06 2025 at 15:45):
Alex Meiburg said:
Cool, okay! I see the
_redArgreplacement happening, which I assume is for replacing functions with a reduced-argument version when some of the arguments are unused. Is that the "already computed by an existing pass"? Is there a reason why the instance isn't replaced by a _redArg version here?
Yeah, that’s the pass I meant. The ._redArg replacement isn’t occurring because the inlining of the rewritten plain decl with the call to the ._redArg version occurs right after the boundary between the base and mono phases. I guess I could special-case the replacement here, although that means dealing with under/over-applied functions.
The idea I had was to make the reduceArity pass rewrite the type of original decl to replace unused parameters with an erased type, and then have toMono consider arguments flowing into erased parameters to be irrelevant. If constructor mono types are also changed so that parameters are erased, then this would unify the handling of ordinary applications and constructor applications.
Alex Meiburg (Jul 06 2025 at 16:34):
Neat. I can't say I followed 100% of that, but I guess I'm looking forward to hearing about it! :)
Cameron Zwarich (Jul 11 2025 at 00:50):
After doing the ground work for this patch in toMono, I wrote the last bit for reduceArity as lean4#9226. Unfortunately, it has an unexpected memory usage regression that I have verified locally. I'll need to debug what's going on here. We definitely want this change for other reasons, e.g. lean4#9021.
Cameron Zwarich (Aug 22 2025 at 01:10):
I did finally debug the issue and it revealed some issues with the approach and the interface between erased irrelevant terms and polymorphism. I was a bit nervous about changing this, and all of my attempts caused large code size regressions (due to more functions needing _boxed variants), so I abandoned it.
I just decided to go back to the grog-brained approach of just replacing calls to decls with calls to their _redArg versions in toMono, so that the useless args are considered for the purposes of the noncomputable check. This should be getting merged right now in #10040.
Sorry about the delay! This would have been quicker if the original approach worked out.
Last updated: Dec 20 2025 at 21:32 UTC