Transvections #
Transvections are matrices of the form 1 + stdBasisMatrix i j c
, where stdBasisMatrix i j c
is the basic matrix with a c
at position (i, j)
. Multiplying by such a transvection on the left
(resp. on the right) amounts to adding c
times the j
-th row to the i
-th row
(resp c
times the i
-th column to the j
-th column). Therefore, they are useful to present
algorithms operating on rows and columns.
Transvections are a special case of elementary matrices (according to most references, these also contain the matrices exchanging rows, and the matrices multiplying a row by a constant).
We show that, over a field, any matrix can be written as L * D * L'
, where L
and L'
are
products of transvections and D
is diagonal. In other words, one can reduce a matrix to diagonal
form by operations on its rows and columns, a variant of Gauss' pivot algorithm.
Main definitions and results #
transvection i j c
is the matrix equal to1 + stdBasisMatrix i j c
.TransvectionStruct n R
is a structure containing the data ofi, j, c
and a proof thati ≠ j
. These are often easier to manipulate than straight matrices, especially in inductive arguments.exists_list_transvec_mul_diagonal_mul_list_transvec
states that any matrixM
over a field can be written in the formt_1 * ... * t_k * D * t'_1 * ... * t'_l
, whereD
is diagonal and thet_i
,t'_j
are transvections.diagonal_transvection_induction
shows that a property which is true for diagonal matrices and transvections, and invariant under product, is true for all matrices.diagonal_transvection_induction_of_det_ne_zero
is the same statement over invertible matrices.
Implementation details #
The proof of the reduction results is done inductively on the size of the matrices, reducing an
(r + 1) × (r + 1)
matrix to a matrix whose last row and column are zeroes, except possibly for
the last diagonal entry. This step is done as follows.
If all the coefficients on the last row and column are zero, there is nothing to do. Otherwise, one can put a nonzero coefficient in the last diagonal entry by a row or column operation, and then subtract this last diagonal entry from the other entries in the last row and column to make them vanish.
This step is done in the type Fin r ⊕ Unit
, where Fin r
is useful to choose arbitrarily some
order in which we cancel the coefficients, and the sum structure is useful to use the formalism of
block matrices.
To proceed with the induction, we reindex our matrices to reduce to the above situation.
The transvection matrix transvection i j c
is equal to the identity plus c
at position
(i, j)
. Multiplying by it on the left (as in transvection i j c * M
) corresponds to adding
c
times the j
-th row of M
to its i
-th row. Multiplying by it on the right corresponds
to adding c
times the i
-th column to the j
-th column.
Equations
- Matrix.transvection i j c = 1 + Matrix.stdBasisMatrix i j c
Instances For
A transvection matrix is obtained from the identity by adding c
times the j
-th row to
the i
-th row.
A structure containing all the information from which one can build a nontrivial transvection. This structure is easier to manipulate than transvections as one has a direct access to all the relevant fields.
- i : n
- j : n
- hij : self.i ≠ self.j
- c : R
Instances For
Associating to a transvection_struct
the corresponding transvection matrix.
Equations
- t.toMatrix = Matrix.transvection t.i t.j t.c
Instances For
The inverse of a TransvectionStruct
, designed so that t.inv.toMatrix
is the inverse of
t.toMatrix
.
Instances For
M
is a scalar matrix if it commutes with every nontrivial transvection (elementary matrix).
Given a TransvectionStruct
on n
, define the corresponding TransvectionStruct
on n ⊕ p
using the identity on p
.
Equations
- Matrix.TransvectionStruct.sumInl p t = { i := Sum.inl t.i, j := Sum.inl t.j, hij := ⋯, c := t.c }
Instances For
Given a TransvectionStruct
on n
and an equivalence between n
and p
, define the
corresponding TransvectionStruct
on p
.
Equations
- Matrix.TransvectionStruct.reindexEquiv e t = { i := e t.i, j := e t.j, hij := ⋯, c := t.c }
Instances For
Reducing matrices by left and right multiplication by transvections #
In this section, we show that any matrix can be reduced to diagonal form by left and right
multiplication by transvections (or, equivalently, by elementary operations on lines and columns).
The main step is to kill the last row and column of a matrix in Fin r ⊕ Unit
with nonzero last
coefficient, by subtracting this coefficient from the other ones. The list of these operations is
recorded in list_transvec_col M
and list_transvec_row M
. We have to analyze inductively how
these operations affect the coefficients in the last row and the last column to conclude that they
have the desired effect.
Once this is done, one concludes the reduction by induction on the size
of the matrices, through a suitable reindexing to identify any fintype with Fin r ⊕ Unit
.
Multiplying by all the matrices either in listTransvecCol M
and listTransvecRow M
kills
all the coefficients in the last row but the last one.
Multiplying by all the matrices either in listTransvecCol M
and listTransvecRow M
kills
all the coefficients in the last column but the last one.
Multiplying by all the matrices either in listTransvecCol M
and listTransvecRow M
turns
the matrix in block-diagonal form.
There exist two lists of TransvectionStruct
such that multiplying by them on the left and
on the right makes a matrix block-diagonal, when the last coefficient is nonzero.
Inductive step for the reduction: if one knows that any size r
matrix can be reduced to
diagonal form by elementary operations, then one deduces it for matrices over Fin r ⊕ Unit
.
Reduction to diagonal form by elementary operations is invariant under reindexing.
Any matrix can be reduced to diagonal form by elementary operations. Formulated here on Type 0
because we will make an induction using Fin r
.
See exists_list_transvec_mul_mul_list_transvec_eq_diagonal
for the general version (which follows
from this one and reindexing).
Any matrix can be reduced to diagonal form by elementary operations.
Any matrix can be written as the product of transvections, a diagonal matrix, and transvections.
Induction principle for matrices based on transvections: if a property is true for all diagonal matrices, all transvections, and is stable under product, then it is true for all matrices. This is the useful way to say that matrices are generated by diagonal matrices and transvections.
We state a slightly more general version: to prove a property for a matrix M
, it suffices to
assume that the diagonal matrices we consider have the same determinant as M
. This is useful to
obtain similar principles for SLₙ
or GLₙ
.
Induction principle for invertible matrices based on transvections: if a property is true for all invertible diagonal matrices, all transvections, and is stable under product of invertible matrices, then it is true for all invertible matrices. This is the useful way to say that invertible matrices are generated by invertible diagonal matrices and transvections.