Transvections #
THIS FILE IS SYNCHRONIZED WITH MATHLIB4. Any changes to this file require a corresponding PR to mathlib4.
Transvections are matrices of the form 1 + std_basis_matrix i j c
, where std_basis_matrix i j c
is the basic matrix with a c
at position (i, j)
. Multiplying by such a transvection on the left
(resp. on the right) amounts to adding c
times the j
-th row to to the i
-th row
(resp c
times the i
-th column to the j
-th column). Therefore, they are useful to present
algorithms operating on rows and columns.
Transvections are a special case of elementary matrices (according to most references, these also contain the matrices exchanging rows, and the matrices multiplying a row by a constant).
We show that, over a field, any matrix can be written as L ⬝ D ⬝ L'
, where L
and L'
are
products of transvections and D
is diagonal. In other words, one can reduce a matrix to diagonal
form by operations on its rows and columns, a variant of Gauss' pivot algorithm.
Main definitions and results #
-
transvection i j c
is the matrix equal to1 + std_basis_matrix i j c
. -
transvection_struct n R
is a structure containing the data ofi, j, c
and a proof thati ≠ j
. These are often easier to manipulate than straight matrices, especially in inductive arguments. -
exists_list_transvec_mul_diagonal_mul_list_transvec
states that any matrixM
over a field can be written in the formt_1 ⬝ ... ⬝ t_k ⬝ D ⬝ t'_1 ⬝ ... ⬝ t'_l
, whereD
is diagonal and thet_i
,t'_j
are transvections. -
diagonal_transvection_induction
shows that a property which is true for diagonal matrices and transvections, and invariant under product, is true for all matrices. -
diagonal_transvection_induction_of_det_ne_zero
is the same statement over invertible matrices.
Implementation details #
The proof of the reduction results is done inductively on the size of the matrices, reducing an
(r + 1) × (r + 1)
matrix to a matrix whose last row and column are zeroes, except possibly for
the last diagonal entry. This step is done as follows.
If all the coefficients on the last row and column are zero, there is nothing to do. Otherwise, one can put a nonzero coefficient in the last diagonal entry by a row or column operation, and then subtract this last diagonal entry from the other entries in the last row and column to make them vanish.
This step is done in the type fin r ⊕ unit
, where fin r
is useful to choose arbitrarily some
order in which we cancel the coefficients, and the sum structure is useful to use the formalism of
block matrices.
To proceed with the induction, we reindex our matrices to reduce to the above situation.
The transvection matrix transvection i j c
is equal to the identity plus c
at position
(i, j)
. Multiplying by it on the left (as in transvection i j c ⬝ M
) corresponds to adding
c
times the j
-th line of M
to its i
-th line. Multiplying by it on the right corresponds
to adding c
times the i
-th column to the j
-th column.
Equations
- matrix.transvection i j c = 1 + matrix.std_basis_matrix i j c
A transvection matrix is obtained from the identity by adding c
times the j
-th row to
the i
-th row.
A structure containing all the information from which one can build a nontrivial transvection. This structure is easier to manipulate than transvections as one has a direct access to all the relevant fields.
Instances for matrix.transvection_struct
- matrix.transvection_struct.has_sizeof_inst
- matrix.transvection_struct.nonempty
Associating to a transvection_struct
the corresponding transvection matrix.
The inverse of a transvection_struct
, designed so that t.inv.to_matrix
is the inverse of
t.to_matrix
.
Given a transvection_struct
on n
, define the corresponding transvection_struct
on n ⊕ p
using the identity on p
.
Given a transvection_struct
on n
and an equivalence between n
and p
, define the
corresponding transvection_struct
on p
.
Reducing matrices by left and right multiplication by transvections #
In this section, we show that any matrix can be reduced to diagonal form by left and right
multiplication by transvections (or, equivalently, by elementary operations on lines and columns).
The main step is to kill the last row and column of a matrix in fin r ⊕ unit
with nonzero last
coefficient, by subtracting this coefficient from the other ones. The list of these operations is
recorded in list_transvec_col M
and list_transvec_row M
. We have to analyze inductively how
these operations affect the coefficients in the last row and the last column to conclude that they
have the desired effect.
Once this is done, one concludes the reduction by induction on the size
of the matrices, through a suitable reindexing to identify any fintype with fin r ⊕ unit
.
A list of transvections such that multiplying on the left with these transvections will replace the last column with zeroes.
Equations
- matrix.pivot.list_transvec_col M = list.of_fn (λ (i : fin r), matrix.transvection (sum.inl i) (sum.inr unit.star()) (-M (sum.inl i) (sum.inr unit.star()) / M (sum.inr unit.star()) (sum.inr unit.star())))
A list of transvections such that multiplying on the right with these transvections will replace the last row with zeroes.
Equations
- matrix.pivot.list_transvec_row M = list.of_fn (λ (i : fin r), matrix.transvection (sum.inr unit.star()) (sum.inl i) (-M (sum.inr unit.star()) (sum.inl i) / M (sum.inr unit.star()) (sum.inr unit.star())))
Multiplying by some of the matrices in list_transvec_col M
does not change the last row.
Multiplying by all the matrices in list_transvec_col M
does not change the last row.
Multiplying by all the matrices in list_transvec_col M
kills all the coefficients in the
last column but the last one.
Multiplying by some of the matrices in list_transvec_row M
does not change the last column.
Multiplying by all the matrices in list_transvec_row M
does not change the last column.
Multiplying by all the matrices in list_transvec_row M
kills all the coefficients in the
last row but the last one.
Multiplying by all the matrices either in list_transvec_col M
and list_transvec_row M
kills
all the coefficients in the last row but the last one.
Multiplying by all the matrices either in list_transvec_col M
and list_transvec_row M
kills
all the coefficients in the last column but the last one.
Multiplying by all the matrices either in list_transvec_col M
and list_transvec_row M
turns
the matrix in block-diagonal form.
There exist two lists of transvection_struct
such that multiplying by them on the left and
on the right makes a matrix block-diagonal, when the last coefficient is nonzero.
There exist two lists of transvection_struct
such that multiplying by them on the left and
on the right makes a matrix block-diagonal.
Inductive step for the reduction: if one knows that any size r
matrix can be reduced to
diagonal form by elementary operations, then one deduces it for matrices over fin r ⊕ unit
.
Reduction to diagonal form by elementary operations is invariant under reindexing.
Any matrix can be reduced to diagonal form by elementary operations. Formulated here on Type 0
because we will make an induction using fin r
.
See exists_list_transvec_mul_mul_list_transvec_eq_diagonal
for the general version (which follows
from this one and reindexing).
Any matrix can be reduced to diagonal form by elementary operations.
Any matrix can be written as the product of transvections, a diagonal matrix, and transvections.
Induction principle for matrices based on transvections: if a property is true for all diagonal matrices, all transvections, and is stable under product, then it is true for all matrices. This is the useful way to say that matrices are generated by diagonal matrices and transvections.
We state a slightly more general version: to prove a property for a matrix M
, it suffices to
assume that the diagonal matrices we consider have the same determinant as M
. This is useful to
obtain similar principles for SLₙ
or GLₙ
.
Induction principle for invertible matrices based on transvections: if a property is true for all invertible diagonal matrices, all transvections, and is stable under product of invertible matrices, then it is true for all invertible matrices. This is the useful way to say that invertible matrices are generated by invertible diagonal matrices and transvections.