Zulip Chat Archive
Stream: general
Topic: universe hackery
Kevin Buzzard (May 03 2023 at 09:41):
(From user BeLazy
on the Discord, thought it might amuse, this is Lean 4):
#check Type 37 -- maximum universe level offset threshold (32) has been reached,
#check Type (32 + 5) -- `Type 37 : Type 38`
def Lift100.{u} (A: Type u):= ULift.{u+32+32+32+4,u} A
def Lift1000 := Lift100 ∘ Lift100 ∘ Lift100 ∘ Lift100 ∘ Lift100 ∘
Lift100 ∘ Lift100 ∘ Lift100 ∘ Lift100 ∘ Lift100
#check Lift1000 Type
-- Lift1000 Type : Type 1001
Reid Barton (May 03 2023 at 09:49):
Will it overflow?
Reid Barton (May 03 2023 at 09:50):
I guess by defining Lift10000
, etc. you can get the universe levels exponentially high but maybe Lean will also take exponentially long to type check them
Arthur Adjedj (May 03 2023 at 09:55):
This crashes on my PC:
def Lift100.{u} (A: Type u):= ULift.{u+32+32+32+4,u} A
def Lift1000 := Lift100 ∘ Lift100 ∘ Lift100 ∘ Lift100 ∘ Lift100 ∘ Lift100 ∘ Lift100 ∘ Lift100 ∘ Lift100 ∘ Lift100
def Lift10000 := Lift1000 ∘ Lift1000 ∘ Lift1000 ∘ Lift1000 ∘ Lift1000 ∘ Lift1000 ∘ Lift1000 ∘ Lift1000 ∘ Lift1000 ∘ Lift1000
def Lift100000 := Lift10000 ∘ Lift10000 ∘ Lift10000 ∘ Lift10000 ∘ Lift10000 ∘ Lift10000 ∘ Lift10000 ∘ Lift10000 ∘ Lift10000 ∘ Lift10000
def Lift1000000 :=
Lift100000 ∘
Lift100000 ∘
Lift100000 ∘
Lift100000 ∘
Lift100000 ∘
Lift100000 ∘
Lift100000 ∘
Lift100000 ∘
Lift100000 ∘
Lift100000
Arthur Adjedj (May 03 2023 at 09:56):
The crash is likely due to the fact that such universes are encoded as unary successors, and don't get the same kind of low-level implementation/optimisation that Nat
s and Int
s do.
Arthur Adjedj (May 03 2023 at 10:10):
I encountered similar issues when implementing our universe system for Proost, the solution we landed on was to replace the succ : Level → Level
constructor to a plus : Level → Nat → Level
constructor. This would be both more efficient in memory and should give a faster type-checking algorithm, at least if the one currently implemented in Lean 4 still resembles the one described in @Mario Carneiro 's thesis.
Mario Carneiro (May 03 2023 at 10:11):
I think lean itself does this
Mario Carneiro (May 03 2023 at 10:12):
There is a hard coded limit on universe level sizes, I thought it was much less than 100000 though
Mario Carneiro (May 03 2023 at 10:13):
it has a snarky message about misusing universes
Mario Carneiro (May 03 2023 at 10:13):
ah right, #check Type 64
shows this
Arthur Adjedj (May 03 2023 at 10:14):
Mario Carneiro said:
I think lean itself does this
perhaps the "low-level" implementation of levels do, but in the Lean code at least, Levels are encoded as such:
inductive Level where
| zero : Level
| succ : Level → Level
| max : Level → Level → Level
| imax : Level → Level → Level
| param : Name → Level
| mvar : LMVarId → Level
Arthur Adjedj (May 03 2023 at 10:15):
Mario Carneiro said:
ah right,
#check Type 64
shows this
It can be circumvented by adding things together.#check Type (32+32)
doesn't raise any issue
Mario Carneiro (May 03 2023 at 10:15):
there is another limit beyond this, I forget where it is
Mario Carneiro (May 03 2023 at 10:16):
I'm thinking of the to_offset()
function in level.cpp
BTW, the syntax uses naive succ but it is converted to a (level, offset) pair for normalization
Mario Carneiro (May 03 2023 at 10:18):
and BTW to_offset
uses an unsigned
with no overflow checks so if you can make it to 2^32 without a crash you might get a soundness bug out of it
Arthur Adjedj (May 03 2023 at 10:18):
oooooh, OMW to do so then
Mario Carneiro (May 03 2023 at 10:19):
it's probably not possible without overflowing the stack, because you will have to create a very deep Level
term
Arthur Adjedj (May 03 2023 at 10:19):
lean seems to stack overflow before that though, yeah
Mario Carneiro (May 03 2023 at 10:20):
maybe if you raise the stack limit...?
Mario Carneiro (May 03 2023 at 10:20):
I don't know what functions are going to get hit hardest
Mario Carneiro (May 03 2023 at 10:21):
you will need (2^32 * stack frame of recursive function) bytes of memory, which is a lot but might be feasible
Arthur Adjedj (May 03 2023 at 10:22):
I fear that the lift composition trick up here might give the elaboration a hard time unifying that many universes, so simply doing Type 32 +........ +32
might be wiser here. Do you know of any easy way to raise the stack limit ?
Mario Carneiro (May 03 2023 at 10:23):
Use addDecl
and just hand the level directly to the kernel
Mario Carneiro (May 03 2023 at 10:25):
aha, here's the hidden limit: https://github.com/leanprover/lean4/blob/master/src/Lean/Level.lean#L46
Mario Carneiro (May 03 2023 at 10:28):
Even just using Level.succ
enough times should trigger that
Arthur Adjedj (May 03 2023 at 10:30):
So the error would be raised as soon as the computed field would get computed first. This seems hard to avoid
Mario Carneiro (May 03 2023 at 10:32):
yeah I would say that this is not a soundness bug
Mario Carneiro (May 03 2023 at 10:33):
although it still seems difficult to hit this using user code, since all the elaboration tricks overflow
Arthur Adjedj (May 03 2023 at 10:34):
Wouldn't the kdef
trick still work here to send all that to the kernel without checking any of this during elaboration ?
Mario Carneiro (May 03 2023 at 10:34):
No, because you would need to construct a Level
to pass to the kernel and you can't even do that
Mario Carneiro (May 03 2023 at 10:35):
basically, there is a bounds check, it is just not in the kernel but in the level constructor
Mario Carneiro (May 03 2023 at 10:35):
(which is part of the kernel in a sense)
Arthur Adjedj (May 03 2023 at 10:36):
Aren't computed fields lazily computed ? As long as no function calling for Level.data
is used to generate the kdef, why wouldn't this work ?
Mario Carneiro (May 03 2023 at 10:36):
no
Arthur Adjedj (May 03 2023 at 10:36):
oh, my bad then
Mario Carneiro (May 03 2023 at 10:37):
computed fields are extra fields stored with the data and eagerly computed on each constructor application
Mario Carneiro (May 03 2023 at 10:38):
before computed fields all these constructors had an additional argument for the data field and you had to use smart constructors to build the values
Arthur Adjedj (May 03 2023 at 10:38):
My belief was that, while stored with the data, they were still computed "on demand", similar to doing OnceCell
headers in Rust
Mario Carneiro (May 03 2023 at 10:38):
but this is also sketchy since you could just not use the smart constructor
Mario Carneiro (May 03 2023 at 10:38):
it's closer to e.g. the len
field on a HashMap
Arthur Adjedj (May 03 2023 at 10:39):
The more you know ! Thanks for the insight :)
Mario Carneiro (May 03 2023 at 10:39):
the fix for that is to add an equality hypothesis BTW, but lean core doesn't use a lot of dependent types
Last updated: Dec 20 2023 at 11:08 UTC