Zulip Chat Archive

Stream: batteries

Topic: Nonempty List Type

François G. Dorais (Jan 07 2026 at 04:15):

There is a recurrent need for List1, the nonempty list type. It still doesn't exist. The main issue is that there are several alternatives.

0: Status Quo. Nonempty lists can be modeled (a) as a subtype { l : List α // l ≠ [] } or (b) as a product α × List α. In practice, these are often handled (a) as (l : List α ) (h : l ≠ []) or (b) as (head : α) (tail : List α) in function definitions and theorems.

1: Weak Type. The List1 α type could an abbreviation.
As (a)

abbrev List1 (α : Type _) := { l : List α // l ≠ [] }

Or (b)

abbrev List1 (α : Type _) := α × List α

2: Strong Type. The List1 α type could be a structure.
As (a)

structure List1 (α : Type _) where
  toList : List α
  toList_ne_nil : toList ≠ []

Or (b)

structure List1 (α : Type _) where
  head : α
  tail : List α

I've experimented with all these options with no clear winner. If you have experimented with these or otherwise have an opinion then please share your thoughts.

Eric Wieser (Jan 07 2026 at 08:36):

Could you elaborate on where the need arises?

Bhavik Mehta (Jan 07 2026 at 15:48):

My vote is for 2b for the simple reason that it matches what Haskell does

François G. Dorais (Jan 07 2026 at 19:44):

Experimental PR for 2b at batteries#1607. Comments and suggestions welcome!

cmlsharp (Jan 07 2026 at 19:51):

Are there concrete cases where the need for this has come up? That might better inform which is most ergonomic.

suhr (Jan 07 2026 at 19:58):

Option 2b results in code duplication, since you can't just reuse functions defined for List.

Jireh Loreaux (Jan 07 2026 at 20:01):

@Eric Wieser how about when you want to take the product of some elements in a semigroup.

suhr (Jan 07 2026 at 20:02):

Bhavik Mehta said:

My vote is for 2b for the simple reason that it matches what Haskell does

And Haskell uses it because it can't express list ≠ [].

François G. Dorais (Jan 07 2026 at 20:08):

My reason for bringing this up is this PR batteries#1595. (This is from a new contributor, please be nice!) The author used the type Σ n, Vector α (1 + n) where List1 α would be better, in whatever form.

François G. Dorais (Jan 07 2026 at 20:15):

suhr said:

Option 2b results in code duplication, since you can't just reuse functions defined for List.

Please see the experimental PR batteries#1607 and let me know what list functions aren't reusable.

Robin Arnez (Jan 07 2026 at 20:23):

For a comparison of both approaches:
Here is 2a in its most basic form

structure List1 (α : Type u) where
  toList : List α
  toList_ne_nil : toList ≠ []

@[match_pattern]
def List1.cons (head : α) (tail : List α) : List1 α :=
  ⟨head :: tail, nofun⟩

def List1.head (self : List1 α) : α :=
  self.toList.head self.toList_ne_nil

def List1.tail (self : List1 α) : List α :=
  self.toList.tail

theorem List1.toList_cons : (cons h t : List α) = h :: t := rfl
theorem List1.head_cons : (cons h t).head = h := rfl
theorem List1.tail_cons : (cons h t).tail = t := rfl
theorem List1.cons_head_tail : cons l.head l.tail = l :=
  match l with
  | cons h t => rfl

theorem List1.toList_mk : (mk l h : List α) = l := rfl
theorem List1.head_mk : (mk l h).head = l.head h := rfl
theorem List1.tail_mk : (mk l h).tail = l.tail := rfl
theorem List1.mk_toList : mk l.toList l.toList_ne_nil = l := rfl

Robin Arnez (Jan 07 2026 at 20:23):

and (2b)

structure List1 (α : Type u) where
  cons ::
  head : α
  tail : List α

def List1.mk (toList : List α) (toList_ne_nil : toList ≠ []) : List1 α :=
  cons (toList.head toList_ne_nil) toList.tail

def List1.toList (self : List1 α) : List α :=
  self.head :: self.tail

theorem List1.toList_ne_nil (self : List1 α) : self.toList ≠ [] := nofun

theorem List1.toList_cons : (cons h t).toList = h :: t := rfl
theorem List1.head_cons : (cons h t).head = h := rfl
theorem List1.tail_cons : (cons h t).tail = t := rfl
theorem List1.cons_head_tail : cons l.head l.tail = l := rfl

theorem List1.toList_mk : (mk l h).toList = l :=
  match l with
  | _ :: _ => rfl
theorem List1.head_mk : (mk l h).head = l.head h := rfl
theorem List1.tail_mk : (mk l h).tail = l.tail := rfl
theorem List1.mk_toList : mk l.toList l.toList_ne_nil = l := rfl

Robin Arnez (Jan 07 2026 at 20:24):

The main difference is that 2a preserves the defeq List1.toList_mk while 2b preserves the defeq List1.cons_head_tail.

Robin Arnez (Jan 07 2026 at 20:27):

My first thought is that it'd be more important to preserve the projection defeq List1.toList_mk than it is to preserve the eta defeq List1.cons_head_tail.

François G. Dorais (Jan 07 2026 at 20:27):

@Robin Arnez Please submit an alternative experimental PR on Batteries.

Robin Arnez (Jan 07 2026 at 20:28):

Alright

Eric Wieser (Jan 07 2026 at 20:46):

Jireh Loreaux said:

Eric Wieser how about when you want to take the product of some elements in a semigroup.

We already solve this in a lattice context with docs#Finset.sup'

Eric Wieser (Jan 07 2026 at 20:48):

François G. Dorais said:

My reason for bringing this up is this PR batteries#1595. (This is from a new contributor, please be nice!) The author used the type Σ n, Vector α (1 + n) where List1 α would be better, in whatever form.

In that PR I think it should either return { l : List α // l ≠ [] } or just List α, perhaps with some SatisfiesM result indicating it is non-nil

François G. Dorais (Jan 07 2026 at 20:56):

@Eric Wieser This is option 0a (compatible with 1a). Do you have a rationale why there is no need for a List1 API?

Eric Wieser (Jan 07 2026 at 21:02):

My claim is merely that batteries#1595 seems a long way from an application for it

Robin Arnez (Jan 07 2026 at 21:11):

batteries#1609

François G. Dorais (Jan 07 2026 at 21:19):

Eric Wieser said:

My claim is merely that batteries#1595 seems a long way from an application for it

I disagree. Had List1 been available, the author would have used it. Who knows how many users out there are using inefficient List1 implementations?!

Kim Morrison (Jan 07 2026 at 21:45):

Could we please rename whatever is implemented here to NonEmptyList? I wouldn't know what List1 is meant to mean.

François G. Dorais (Jan 07 2026 at 21:54):

I agree but the many/many1 paradigm is pretty deeply enshrined.

Eric Wieser (Jan 07 2026 at 21:56):

François G. Dorais said:

I disagree. Had List1 been available, the author would have used it. Who knows how many users out there are using inefficient List1 implementations?!

But the author isn't using any API other than the definition itself!

cmlsharp (Jan 07 2026 at 22:04):

There is some precedent for the 'List1' name, this is what Idris calls it at least, but I agree a more evocative name would probably be better

François G. Dorais (Jan 07 2026 at 22:06):

Eric Wieser said:

But the author isn't using any API other than the definition itself!

I don't understand. The point is that having a named type indicates the preferred implementation. (IMHO: if you have a type name then you have an API, possibly a trivial one.)

François G. Dorais (Jan 07 2026 at 22:16):

FWIW I think you're arguing for either 0a or 1a.

Eric Wieser (Jan 07 2026 at 22:18):

Yes, I think for the purpose of that PR alone, 0a is the right choice, and 1a is harmless enough if perhaps overkill. I would argue that we should compile a list of more compelling use-cases before deciding to write any API functions that consume List1

François G. Dorais (Jan 07 2026 at 22:21):

I think it would be very hard to compile current uses of 0a and 0b.

suhr (Jan 07 2026 at 22:27):

Robin Arnez said:

The main difference is that 2a preserves the defeq List1.toList_mk while 2b preserves the defeq

And I think if you implement Coe, then you basically get some properties for free. Like, (List1.cons x xs) ++ ys becomes (cons x xs).toList ++ ys, which is definitionally equal to x :: (xs ++ ys).

suhr (Jan 07 2026 at 22:37):

Though, this particular example works for both versions. But also List1.mk xs h ++ ys could coerce to (List1.mk xs h).toList ++ ys, which is definitionally equal to xs ++ ys.

Eric Wieser (Jan 07 2026 at 22:40):

I'm confused, aren't you talking about two things with different types being defeq? Or is there an implicit coercion there?

suhr (Jan 07 2026 at 22:41):

I'm talking about having an implicit coercion from List1 to List.

Eric Wieser (Jan 07 2026 at 22:44):

For the sake of clarity, can you edit your message to make the coercion placement explicit?

François G. Dorais (Jan 07 2026 at 22:47):

For anyone just joining, there are two experimental implementations of 2a and 2b available on Batteries:

2a is at batteries#1609 (author Robin Arnez)
2b is at batteries#1607 (author François G. Dorais)

These are intended for anyone interested to experiment. Contributions are welcome for both!

François G. Dorais (Jan 07 2026 at 22:51):

Obviously, at most one will be merged. But actual implementations are helpful to make a sound decision.

François G. Dorais (Jan 07 2026 at 22:58):

Note that implementations for 1a and 1b are not necessary since the above can easily be adapted. The main point is to compare the a and b approaches.

Miyahara Kō (Jan 08 2026 at 03:40):

By the way, Mathlib has List1-like design already: docs#Stream'.Seq and docs#Stream'.Seq1

Owen Shepherd (Jan 11 2026 at 15:43):

This is from a new contributor, please be nice!

Hello! That new contributor is me :)
I come from Haskell, and I love NonEmptyList, but I'm actually wondering why there's a need for a non-empty list, over a more general length-indexed list type.

The reason returning a vector isn't ideal, in my case, was because I don't know the size of the vector ahead of time, so I had to create one from a list.

But if I had a length-indexed list, I wouldn't have this problem.
I would think that a length-indexed list would be strictly more powerful than a nonempty list?
You'd be able to express that your list has at least two elements, for example...

Maybe this is a stupid question, I don't know.

Yury G. Kudryashov (Jan 11 2026 at 15:46):

BTW, we have docs#FreeSemigroup in Mathlib.

Jakub Nowak (Jan 11 2026 at 17:01):

Isn't List1 kinda anti-pattern? IIRC we prefer unbundled version (i.e. two separate arguments/fields l : List α and l ≠ [] when possible, and in case of function return type we return List α and prove non-emptiness in a separate theorem). Is there a need for bundled version? @Eric Wieser suggests we don't need bundled version for batteries#1607, as we can prove non-emptiness in a separate theorem using SatisfiesM.
PS: Alternatively α and List α is also being used as a unbundled version of non-empty list.

Eric Wieser (Jan 11 2026 at 17:27):

I think it's worth remembering that the context of this discussion is:

partial def many (p : f α) : f (List α) := _
def many1 (p : f α) : f (Σ n, Vector α (1 + n)) := /-- something complex -/

where the suggested alternatives are:

def many1 (p : f α) : f (List α) := List.cons <$> p <*> many p , which matches docs#Lean.Parser.many1 and docs#Std.Internal.Parsec.many1
a. optionally with a SatisfiesM theorem
def many1 (p : f α) : f {l : List α // l ≠ []} := /- the above with some proof -/
def many1 (p : f α) : f (α × List α) := Prod.mk <$> p <*> many p

Given that many is partial anyway, I don't think it makes much sense to spend a lot of time designing an API for the return type of many1. Do we have a case where option 1 (without 1a) is not sufficient?

Eric Wieser (Jan 11 2026 at 17:29):

The main use of having the length seems to be to make a termination argument, but it seems highly likely you'd just mark the caller as partial anyway given that many is

Owen Shepherd (Jan 11 2026 at 17:50):

Eric Wieser said:

The main use of having the length seems to be to make a termination argument

Maybe also to be able to get the head of the result without introducing an error codepath?
That's generally the reason to use Data.List.NonEmpty in Haskell.
I guess the sensible thing to do would be to PR many, and leave many1 as a follow-up...

Is there something I can read somewhere on why {l : List α // l ≠ []} is preferred over (Σ n, IList α (1 + n)) (apart from the latter not existing, obviously)?

Eric Wieser (Jan 11 2026 at 17:51):

By IList α (1 + n) I assume you mean docs#List.Vector α (1 + n)?

Eric Wieser (Jan 11 2026 at 17:53):

(Σ n, List.Vector α (1 + n)) reads somewhat as "there exists an n such that there is a list whose length is n + 1", which seems like very long winded way of saying "a list whose length is not zero", which is is also more long winded than "a list which is not nil".

Owen Shepherd (Jan 11 2026 at 18:01):

Eric Wieser said:

By IList α (1 + n) I assume you mean docs#List.Vector α (1 + n)?

This is exactly what I meant. I didn't know this existed, though it looks like it's in mathlib, not batteries, so I couldn't use it anyway?
To me it looks like List.Vector is strictly more expressive than NonEmpty, so introducing the latter seems unnecessary. The more list-like types there are, the more code there is, and there's an n*n problem for conversions between them, no?

Eric Wieser (Jan 11 2026 at 18:05):

I think moving List.Vector to batteries would be very reasonable, but that's a discussion for another thread.

(Generally things are moved only when someone needs them)

François G. Dorais (Jan 11 2026 at 21:49):

Eric Wieser said:

I think it's worth remembering that the context of this discussion is:

First, this is not the context of this discussion and I tried to make that clear. It is, however, the most recent time I've seen ad-hoc nonempty list types.

(FWIW, I wouldn't recommend using List1 even if it did exist in any form, your option 1 is the better one in this case. That is what I will suggest once I get around to reviewing that PR.)

Eric Wieser (Jan 11 2026 at 21:51):

Apologies, I misinterpreted what you said above.

Eric Wieser (Jan 11 2026 at 21:51):

To be clear, in your most recent message you mean "I wouldn't recommend List1 ... for that specific PR"? Presumably you have other cases in mind where you would use it?

François G. Dorais (Jan 11 2026 at 22:03):

I'm pretty sure you've also seen ad hoc nonempty list types as well, most are 0a and 0b. The primary question is whether 1a/b or 2a/b may be better than the status quo.

I do have List1 in my personal stuff (which I basically copied into batteries#1607). My main use case was to implement free categories.

Eric Wieser (Jan 11 2026 at 22:06):

Is docs#FreeSemigroup morally a good tool for free categories?

Eric Wieser (Jan 11 2026 at 22:07):

(of course List is isomorphic to FreeMonoid, but that doesn't make List the wrong choice everywhere!)

François G. Dorais (Jan 11 2026 at 22:12):

Actually, I eventually found that a path type was what I needed but we're getting sidetracked. This is Batteries, free constructions are very common in programming (though most don't care much about the mathematical theory).

François G. Dorais (Jan 11 2026 at 22:54):

Owen Shepherd said:

Maybe also to be able to get the head of the result without introducing an error codepath?
That's generally the reason to use Data.List.NonEmpty in Haskell.

That doesn't work in Lean since it uses eager evaluation instead of lazy evaluation, like Haskell.

Owen Shepherd (Jan 12 2026 at 03:11):

It's not about laziness.
Data.List.head is non-total, whereas Data.List.NonEmpty.head is total, for obvious reasons.

I'm still not advocating for a non-empty list type in lean though, my preference would be to have List.Vector moved to batteries, so I could return Σ n, List.Vector a (succ n) or something like that.

This seems more... constructive, than returning a list and a proof that it's non-empty... But I'm new to this :)

suhr (Jan 12 2026 at 03:26):

Σ n, List.Vector a (succ n) is a bit like ∃x, n = k + x. It works, but can we something like ≤ for lists?

suhr (Jan 12 2026 at 03:28):

In addition toList.Vector α n = { l : List α // l.length = n } you could have List.Something α n = { l : List α // n ≤ l.length }

Robin Arnez (Jan 12 2026 at 11:44):

I'll note that Σ n, List.Vector a (succ n) is also somewhat significantly different in terms of runtime representation to List: It stores a pair of a natural number and a List of that length. That means if you use Σ n, List.Vector a (succ n) you have to compute the length in addition to knowing that the list is nonempty. For just representing the fact that a list is nonempty I'd prefer if a nonempty list type was at least as efficient to store and manipulate as List.

Last updated: Feb 28 2026 at 14:05 UTC