Zulip Chat Archive

Stream: lean4

Topic: 256 constructors is enough for anyone


Mario Carneiro (Jun 23 2023 at 05:51):

While messing with the olean dump tool I noticed that constructor indexes are only one byte, so I started to wonder what happens if you go past that. Turns out, nothing good...

import Lean
open Lean
macro "mk_foo" : command => do
  let l : Array (TSyntax `ident) :=
    ((List.range 257).map fun i => mkIdent (Name.mkSimple s!"x{i}")) |>.toArray
  `(set_option genInjectivity false in
    inductive $(mkIdent `Foo) where
      $[| $l:ident (_ : Nat)]*)

mk_foo

@[noinline] def blackBox := @id
#eval blackBox (Foo.x256 1) matches Foo.x0 1 -- true
#eval blackBox (Foo.x255 1) matches Foo.x0 1 -- unreachable_reached

(The limit is a bit less than 256 because the numbers 245 on up are reserved for various kinds of special object in the runtime.)

Mario Carneiro (Jun 23 2023 at 05:55):

Separately, this file is also very slow to build and generates a 5 MB olean, even after disabling genInjectivity.

Reid Barton (Jun 23 2023 at 08:20):

I remember some GHC bug report complaining that the compiler was too slow on an inductive type with like 5000 constructors, so 250 seems way too low.

Reid Barton (Jun 23 2023 at 08:23):

e.g. GHC itself has an inductive type with a constructor for each of its ~700 primops
https://hackage.haskell.org/package/ghc-9.6.1/docs/GHC-Builtin-PrimOps.html#t:PrimOp

Trebor Huang (Jun 23 2023 at 13:38):

I remember Coq had a similar bug where 256 constructors produces a proof of False

David Renshaw (Jun 23 2023 at 13:45):

In case anyone is wondering -- yes the wrong match still happens if genInjectivity is enabled:

import Lean
open Lean
macro "mk_foo" : command => do
  let l : Array (TSyntax `ident) :=
    ((List.range 257).map fun i => mkIdent (Name.mkSimple s!"x{i}")) |>.toArray
  `(inductive $(mkIdent `Foo) where
      $[| $l:ident (_ : Nat)]*)

set_option maxHeartbeats 1000000 in
mk_foo

@[noinline] def blackBox := @id
#eval blackBox (Foo.x256 1) matches Foo.x0 1 -- true

Mac Malone (Jun 23 2023 at 17:21):

Mario Carneiro said:

While messing with the olean dump tool I noticed that constructor indexes are only one byte, so I started to wonder what happens if you go past that. Turns out, nothing good...

Note that this is only true fi the inductive is not "enum-like" (i.e., has fields). Enum-like inductives are unboxed and will properly grow in size with their number of alternatives. However, really large inductives are still very slow to compile.

import Lean open Lean

macro "mk_enum" id:ident size:num : command => do
  let l : Array (TSyntax `ident) :=
    ((List.range size.getNat).map fun i => mkIdent (Name.mkSimple s!"x{i}")) |>.toArray
  `(set_option genInjectivity false in
    inductive $id where
      $[| $l:ident ]*)

mk_enum Foo 256
mk_enum Bar 255

set_option trace.compiler.ir.result true
def mkFoo : Foo := .x1 -- mkFoo : u16
def mkBar : Bar := .x1 -- mkBar : u8

Last updated: Dec 20 2023 at 11:08 UTC