Zulip Chat Archive

Stream: batteries

Topic: purpose of Array theorems

Julien Michel (Mar 08 2024 at 17:43):

Hello, I've been playing with Lean 4 for a while and would be interested in contributing to std.

I have a couple questions about List and Arrays:

Given that Array is just a wrapper on a List with a specific efficient implementation,

1) Why shouldn't List and Array have the same list of public defs in std / Core (with possibly different implementations, but equivalent behaviors) ?

2) Couldn't we just delete all Array theorems, keeping only equivalence proofs between Array and List defs, and focus on building a theory only about List's ?
Alternatively, if we want all Array theorems, couldn't we flag each List theorem with an attribute that automatically generates the corresponding Array theorem, provided that the required List and Array defs have proven equivalent behaviors?

Here is an example I made up to elaborate on my second point :

variable {α} [Inhabited α]

-- a List def
def List.get' : List α → Nat → α
  | [],  _ => panic "bad index"
  | a :: _,  0 => a
  | _ :: as, i + 1 => get' as i

-- its Array version with a different implementation
def Array.get' : Array α → Nat → α
  | ⟨[]⟩,  _ => panic "bad index"
  | ⟨a :: _⟩,  0 => a
  | ⟨_ :: as⟩, i + 1 => get' ⟨as⟩ i

-- proof that they are equivalent
@[simp] theorem Array.get_eq : ∀ (a : Array α) i, a.get' i = a.1.get' i := by
  intro ⟨l⟩ i
  induction l generalizing i with
  | nil => cases i <;> rfl
  | cons x as ih => match i with | 0 => rfl | i + 1 => apply ih

-- another List def
def List.set' : List α → Nat → α → List α
  | [],      _,     _ => []
  | _ :: as, 0,     x => x :: as
  | a :: as, n + 1, x => a :: set' as n x

-- its Array version with a different implementation
def Array.set' : Array α → Nat → α → Array α
  | ⟨[]⟩,      _,     _ => ⟨[]⟩
  | ⟨_ :: as⟩, 0,     x => ⟨x :: as⟩
  | ⟨a :: as⟩, n + 1, x => ⟨a :: (set' ⟨as⟩ n x).1⟩

-- proof that they are equivalent
@[simp] theorem Array.set_eq : ∀ (a : Array α) i x, a.set' i x = ⟨a.1.set' i x⟩ := by
  intro ⟨l⟩ i x
  induction l generalizing i with
  | nil => cases i <;> rfl
  | cons y as ih => match i with
    | 0 => rfl
    | i + 1 => apply congrArg (mk $ _ :: data .); apply ih

-- another List def
def List.length' : List α → Nat
  | []       => 0
  | _ :: as => length' as  + 1

-- its Array version with a different implementation
def Array.length' : Array α → Nat
  | ⟨[]⟩       => 0
  | ⟨_ :: as⟩ => length' ⟨as⟩ + 1

-- proof that they are equivalent
@[simp] theorem Array.length_eq : ∀ (a : Array α), a.length' = a.1.length' := by
  intro ⟨l⟩
  induction l with
  | nil => rfl
  | cons x as ih => unfold length'; rw [ih]; rfl


-- a List theorem about the 3 definitions
theorem List.set_get : ∀ (l : List α) i x, i < l.length' → (l.set' i x).get' i = x := by
  intro l i x h
  induction l generalizing i with
  | nil => contradiction
  | cons y as ih =>
    match i with
    | 0 => rfl
    | i + 1 => apply ih; apply Nat.lt_of_succ_lt_succ; assumption

-- the corresponding Array theorem, which could be auto generated from its List counterpart
theorem Array.set_get : ∀ (a : Array α) i x, i < a.length' → (a.set' i x).get' i = x := by
  intro ⟨l⟩; revert l; simpa using List.set_get -- proof is trivial with the right simp lemmas

Thanks for your clarifications or thoughts.

Joe Hendrix (Mar 08 2024 at 20:05):

Hi Julien,

I agree that List and Array should have consistent interfaces and normal forms. There are a lot of places in Lean where there are multiple operators or types and the lemma/simp normal forms should be more consistent. In addition to being consistent across types, I'd like to ensure that normal forms for operations defined in Lean itself are consistent across Lean, Std and Mathlib.

The idea of synthesizing lemmas is also done in the to_additive Mathlib attribute.

The approach I am currently working on is to build testing infrastructure to validate normal forms are consistent in different environments. Lean4#3508 includes the first protype in tests/playground for testing Bool normal forms, but as I get time to polish and document the generator, I will try to find a more permanent home for it.

I am currently working on Nat and Int, but would like to tackle List and Array at some point. Test cases and lemmas that find places to improve normal forms for Array and List would be welcome, but I'm not sure that the to_additive approach is the right fit. There may be exceptions and I've personally found that tactic difficult to debug when things don't work.

There's also a newish #check_simp command in Lean now that let's one easily test normal forms.

James Gallicchio (Mar 08 2024 at 20:25):

See this thread for some discussion about making abstractions for collections. LeanColls is starting to actually become useful for me personally, so maybe it will be useful for you as well.

James Gallicchio (Mar 08 2024 at 20:28):

RE: arrays specifically, I think a lot of the difficulty comes from the fact that often we want to treat an array NOT as a list, but as a function Fin (size A) -> ElemType.

IMO this should be resolved by separating these two views into an array, so that they can have different simp normal forms and work with different automation. This is one of the things LeanColls does (arrays are Seq, or list-like, but you can Seq.fixSize them to get an indexed collection that has some cleaner lemmas around get and set in particular)

Last updated: May 02 2025 at 03:31 UTC