Zulip Chat Archive

Stream: new members

Topic: Jordan Canonical Form and similar matrix forms


Stephan Trenn (Dec 30 2025 at 12:23):

I just started working with Lean and as an exercise I was able to prove that a strictly upper triangular matrix is nilpotent (which does not yet seem to be available in mathlib). What I am actually interested in are proofs concerning the Jordan Canonical Form and the more general Kronecker canonical form for matrix pencils. So my questions are: Is a proof of the existence of the Jordan Canonical Form already available in Mathlib? I suppose the Kronecker Canonical Form is not yet available? And would it make sense to contribute the above-mentioned proof about strictly upper triangular matrices to Mathlib?
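For reference, one possible Lean rendering of the statement being described (an editorial sketch, not the poster's actual code; mathlib's `Matrix.BlockTriangular` API is a related notion, and the proof is omitted):

```lean
import Mathlib

-- Statement sketch: a strictly upper triangular matrix (all entries on and
-- below the diagonal are zero) is nilpotent. Proof omitted.
example {n : ℕ} (M : Matrix (Fin n) (Fin n) ℝ)
    (h : ∀ i j : Fin n, j ≤ i → M i j = 0) : IsNilpotent M := by
  sorry
```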

Stephan Trenn (Jan 03 2026 at 22:24):

No one here interested in linear algebra?

Mirek Olšák (Jan 03 2026 at 23:13):

The closest thing I was able to find to the Jordan Canonical Form decomposition is the Jordan–Chevalley–Dunford decomposition over perfect fields: Module.End.exists_isNilpotent_isSemisimple. Lean often likes to do things abstractly (with modules rather than vector spaces where possible, etc.). I think the Jordan Canonical Form could possibly be obtained from the decomposition into nilpotent and semisimple endomorphisms, but it probably requires some more work.

Stephan Trenn (Jan 04 2026 at 09:38):

Thanks for pointing me to the Jordan–Chevalley–Dunford decomposition, I didn't know about this form. I fully appreciate proving properties in as abstract a setting as possible; however, the danger is then that Lean's mathlib becomes harder to use for average mathematicians who, e.g., just want to prove some properties of finite matrices. OK, but it is good to know that there are quite a few gaps in the area of (finite-dimensional) linear algebra (i.e. matrix results), which I can try to fill myself.

Riccardo Brasca (Jan 04 2026 at 10:14):

There is nothing wrong with having the concrete form of a result in mathlib, but the point is that it should be deduced from the general one, not proved separately. This ensures that various parts of the library use the same definitions, and it also shows that the general version is actually usable.

Concerning Jordan's canonical form, I think it's not completely clear how to write down a concrete statement. Working with matrices is unfortunately a little annoying, especially with submatrices. The reason is that informally one moves freely between indices of the submatrix and of the big one, in the "obvious" way. In particular, I guess the first step here is to write down a formal statement, before even starting to think about the proof.

Riccardo Brasca (Jan 04 2026 at 10:14):

Anyway, we also have the classification of finitely generated modules over a PID, so the math needed to prove the theorem is there.

Stephan Trenn (Jan 04 2026 at 10:58):

So maybe a first step would be to make working with matrices more straightforward (introducing e.g. submatrices, ...), because there are many results in linear algebra which are in fact about certain matrix structures. For example, many textbooks define the rank of a matrix by first transforming it via Gaussian elimination into reduced row echelon form and then counting the number of non-zero rows. This is based on the theorem that every matrix can be transformed into reduced row echelon form via elementary row operations. This theorem is really about the matrix representation and cannot (I think) be formulated at the abstract level of linear maps.

Riccardo Brasca (Jan 04 2026 at 11:19):

Submatrices are already there; see for example Mathlib.Data.Matrix.Block. The point is that what we do informally is just difficult to translate formally.

Stephan Trenn (Jan 04 2026 at 12:28):

I wonder what the underlying reason is for "what we do informally is just difficult to translate formally", specifically for matrix calculus? After all, matrix calculus is taught in the first year at university, so there seems to be not too much abstract formalism involved in understanding it?

Riccardo Brasca (Jan 04 2026 at 12:46):

It is related to indices. Here is a simple example: a 4 × 4 matrix with real coefficients is informally just a table of 16 real numbers, written M = (m_{i,j}). Now, in Lean it is a function M : Fin 4 → Fin 4 → ℝ (here Fin n is the set {0, 1, ..., n-1}, the "standard" set with n elements), so you write M i j instead of m_{i,j}. So far so good. Now, if you want to speak about the 2 × 2 submatrix in the upper left corner, informally you can just use this sentence, call it N = (n_{i,j}), and the relation between N and M is obvious. If you want to define N in Lean you may try N : Fin 2 → Fin 2 → ℝ := fun i j ↦ M i j, but this doesn't make sense: if i : Fin 2 you cannot write M i, as M wants a term of type Fin 4! So you have to insert the canonical map Fin 2 → Fin 4 manually. This is totally doable, but the map will be there, and you will see it in every statement. Things are even more convoluted if you want to take another submatrix.

There is nothing really difficult here, it's totally doable, but this kind of annoyance tends to appear a lot, and in practice it makes things quite painful.

In my opinion the reason behind this is that we just don't say what a matrix is when we work informally, we have a (very good!) intuitive idea and that's enough.
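A minimal Lean sketch of the indexing issue just described (an editorial illustration; `Fin.castLE` is the canonical embedding from mathlib, and `by norm_num` discharges the side condition `2 ≤ 4`):

```lean
import Mathlib

variable (M : Matrix (Fin 4) (Fin 4) ℝ)

-- The naive attempt fails, since `i : Fin 2` is not a `Fin 4`:
-- def N : Matrix (Fin 2) (Fin 2) ℝ := fun i j => M i j  -- type error

-- The canonical embedding must be inserted by hand:
def N : Matrix (Fin 2) (Fin 2) ℝ :=
  fun i j => M (Fin.castLE (by norm_num) i) (Fin.castLE (by norm_num) j)
```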

Riccardo Brasca (Jan 04 2026 at 12:53):

Stephan Trenn said:

..., so there seems to be not too much abstract formalism involved to understand it?

This is exactly the problem! It's not abstract, it is very concrete, and experience has shown it's much harder to formalize concrete statements.

Stephan Trenn (Jan 04 2026 at 12:55):

Interesting! So the issue seems to be having an easy-to-work-with notation for block matrices. For example, if I have two large matrices A and B which are decomposed as A = [A11, A12; A21, A22] and B = [B11, B12; B21, B22], then A * B can be obtained by just treating A and B as 2x2 (block) matrices and applying the multiplication rule for the 2x2 case.

Stephan Trenn (Jan 04 2026 at 12:58):

Riccardo Brasca said:

So you have to insert manually the canonical map Fin 2 → Fin 4

Isn't that what "coercion" is about in Lean, to automate these kinds of embeddings?

Riccardo Brasca (Jan 04 2026 at 13:04):

There is no coercion from Fin n to Fin m, since this is meaningful only when n ≤ m (there is the map of course, it's Fin.castLE, but maybe you prefer Fin.castAdd; those are mathematically the same, but for Lean they are different). And anyway, if you want to consider another block, things are more complicated.

For example, here is the statement that matrix multiplication can be done block by block (with 4 blocks): Matrix.fromBlocks_multiply. It's not that bad, but you need this fromBlocks, and if you have several blocks (like in the Jordan form) things can be really painful.
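For reference, the lemma mentioned here reads roughly as follows (paraphrased from memory, so check the actual mathlib statement):

```lean
import Mathlib

-- `Matrix.fromBlocks_multiply` says, roughly:
--   fromBlocks A B C D * fromBlocks A' B' C' D' =
--     fromBlocks (A * A' + B * C') (A * B' + B * D')
--                (C * A' + D * C') (C * B' + D * D')
#check @Matrix.fromBlocks_multiply
```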

Riccardo Brasca (Jan 04 2026 at 13:05):

But again, I am not saying that it isn't doable, but that first of all one should think about a reasonable statement; the proof will come later. There have surely been various discussions about Jordan's canonical form.

Stephan Trenn (Jan 04 2026 at 13:19):

So looking at Mathlib.Data.Matrix.Block, a 2x2 block matrix is defined just as a "flat" data structure consisting of four submatrices. I think the first step would be to introduce a general block-matrix type, which is a matrix with a different type in each entry, so something like "BlockMatrix (n1, n2, ..., nk) (m1, m2, ..., ml) α" where each ni and mi is a Fin type. Is there actually a type consisting of a finite sequence of finite sequences? I guess this has already been discussed before?

Riccardo Brasca (Jan 04 2026 at 13:22):

If T is any type, the best way of saying "an ordered n-tuple of terms of type T" is a function Fin n → T.
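For example (an editorial sketch, using mathlib's `![...]` notation for such tuples):

```lean
import Mathlib

-- An ordered 3-tuple of naturals, as a function `Fin 3 → ℕ`:
def t : Fin 3 → ℕ := ![10, 20, 30]

#eval t 1  -- 20
```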

Stephan Trenn (Jan 04 2026 at 13:36):

Hmm, but n : Fin k → Fin n doesn't work, because each entry is not of type Fin n; each entry has its own length. Actually we are interested in tuples of finite length whose entries add up to n. So we need a type, parametrized by n (the total number of columns/rows), which consists of all finite tuples of natural numbers whose entries add up to n.

Riccardo Brasca (Jan 04 2026 at 13:44):

You are starting to hit the "annoyances" I mentioned above. Usually the best way is to not use Fin n to parametrize matrices, but any Fintype (as is done in the theorem about block multiplication). The target will be a matrix indexed by some Σ type.
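Concretely, the suggestion might look like this (an editorial sketch; `sz` is a hypothetical name for the block-size function):

```lean
import Mathlib

-- Block sizes given by `sz : Fin k → ℕ`; the row/column index set of the
-- block matrix is a Σ type pairing a block index with a position inside it.
variable (k : ℕ) (sz : Fin k → ℕ)

#check (Matrix (Σ i : Fin k, Fin (sz i)) (Σ i : Fin k, Fin (sz i)) ℝ)
```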

Eric Wieser (Jan 04 2026 at 22:38):

I started very briefly down this path long ago when I defined docs#Matrix.blockDiagonal'

Eric Wieser (Jan 04 2026 at 22:42):

A piece of the puzzle you might want is

/-- A block matrix produced from multiple blocks.

Compare to `Matrix.fromBlocks` which is for just two blocks. -/
def Matrix.sigmaBlocks {ι κ : Type*} {m : ι → Type*} {n : κ → Type*}
    (M : ∀ i j, Matrix (m i) (n j) α) : Matrix (Σ i, m i) (Σ i, n i) α :=
  of fun ⟨i, i'⟩ ⟨j, j'⟩ => M i j i' j'

Eric Wieser (Jan 04 2026 at 22:50):

Or maybe (docs#DMatrix):

/-- A block matrix produced from multiple blocks.

Compare to `Matrix.fromBlocks` which is for just two blocks,
or `Matrix.comp` which is for equally-sized blocks. -/
def DMatrix.comp : (DMatrix _ _ fun i j => Matrix (m i) (n j) α) ≃ Matrix (Σ i, m i) (Σ i, n i) α where
  toFun M := .of fun ⟨i, i'⟩ ⟨j, j'⟩ => M i j i' j'
  invFun M := fun i j => .of fun i' j' => M ⟨i, i'⟩ ⟨j, j'⟩
  left_inv _ := rfl
  right_inv _ := rfl

/-- `DMatrix.comp` as an additive equivalence. -/
@[simps!]
def DMatrix.compAddEquiv [Add α] :
    (DMatrix _ _ fun i j => Matrix (m i) (n j) α) ≃+ Matrix (Σ i, m i) (Σ i, n i) α where
  __ := DMatrix.comp
  map_add' _ _ := rfl

Stephan Trenn (Jan 05 2026 at 07:01):

Thanks for sharing the code. I was thinking that maybe a recursive approach to handling block matrices would work better, where a block matrix is either a plain matrix (leaf case) or a 2x2 matrix consisting of block matrices. A similar structure could also be applied to the underlying vector space, defining it as a direct sum of either vector spaces (leaf case) or further direct sums. Such a recursive direct-sum decomposition of the vector space immediately yields a corresponding recursive block-matrix representation of any linear map. Does this make sense?
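A minimal sketch of the recursive idea (invented here for illustration; nothing like this is claimed to exist in mathlib):

```lean
/-- A recursive partition of an index range: either a single leaf block of a
given size, or a split into two sub-partitions. Hypothetical sketch. -/
inductive BlockShape
  | leaf (n : Nat)
  | split (l r : BlockShape)

/-- The total dimension covered by a shape. -/
def BlockShape.size : BlockShape → Nat
  | .leaf n => n
  | .split l r => l.size + r.size

-- A block matrix over row shape `rs` and column shape `cs` could then be a
-- matrix indexed by `Fin rs.size` and `Fin cs.size`, with block operations
-- defined by recursion on the shapes.
```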

Eric Wieser (Jan 05 2026 at 07:50):

I suspect that is not going to be convenient in practice, and using sigma types is the better bet

Stephan Trenn (Jan 05 2026 at 08:11):

I haven't worked with sigma types yet, and it may indeed be easier to work with them. However, for the use cases I have in mind, it seems more natural to have a recursive structure. For example, one approach to obtaining the Jordan Canonical Form is to first split the underlying vector space into the generalized (invariant) eigenspaces. This splitting already provides a high-level block matrix (and already at that level you have a recursive structure: you pick one eigenvalue, define the generalized eigenspace, and decompose the vector space into the direct sum of this eigenspace and the rest). As a next refinement step, you further decompose the individual generalized eigenspaces, according to a filtration procedure, into the smaller invariant subspaces corresponding to individual Jordan blocks. Having a recursive block-matrix structure would make this process very easy to implement, I think.

Riccardo Brasca (Jan 05 2026 at 08:18):

I also think that sigma types are the way to go, but I may be wrong. The only way to know is to try and see what happens.

Desmond Coles (UT Austin) (Jan 07 2026 at 04:46):

Hello! I found this thread and I am glad to see it has recently been active. Does anyone know if anyone is working on RREF or row operations in Lean? Or if there is any available code for these kinds of constructions? I am working on another library for formally verifying software, and I need some way to solve a linear system in Lean (or determine that the system is inconsistent). It seems like it should be doable, though maybe cumbersome, to prove constructively that the RREF exists and then use this to solve the linear system.

Stephan Trenn (Jan 07 2026 at 08:52):

It seems there is no result concerning RREF in Mathlib directly, and the reason may be similar to why there is not yet a result about the Jordan Canonical Form: it is not yet clear what the "best" way is to formalize the structure defining the corresponding normal forms. As mentioned above, it seems to me that a recursive block structure would be easiest to implement (also for the proof). Specifically for RREF, one could try to come up with a recursive definition; at least for the non-reduced row echelon form (where there is not yet the requirement that there be zeros above the pivots) this should be straightforward, and the proof of existence is also straightforward based on this recursion.
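One way such a recursive definition could start (a hypothetical sketch invented for this discussion, not mathlib code; trailing all-zero rows would need a separate case, omitted here):

```lean
import Mathlib

/-- Number of leading zeros of a row (hypothetical helper). -/
def leadingZeros : List ℚ → ℕ
  | [] => 0
  | x :: xs => if x = 0 then leadingZeros xs + 1 else 0

/-- A list-of-rows matrix is in (non-reduced) row echelon form when the
leading-zero counts strictly increase down the rows. Sketch only: all-zero
rows at the bottom of the matrix are not handled. -/
def IsREF : List (List ℚ) → Prop
  | [] => True
  | [_] => True
  | r₁ :: r₂ :: rs => leadingZeros r₁ < leadingZeros r₂ ∧ IsREF (r₂ :: rs)
```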


Last updated: Feb 28 2026 at 14:05 UTC