Zulip Chat Archive
Stream: Is there code for X?
Topic: Singular Value Decomposition
Niels Voss (Aug 04 2025 at 18:19):
Is there code for or a plan for singular value decomposition (of matrices or linear maps)? I found #6042 by @MohanadAhmed, but it seems that it's not general enough to be merged as-is. I was wondering if anything had replaced it yet, or if there was a plan to finish it. If not, what's the ideal generalization of SVD for mathlib (I assume there are ways to generalize SVD for finite-dimensional inner product spaces to maybe something that is not finite-dimensional)?
Jireh Loreaux (Aug 04 2025 at 23:45):
There is the Schmidt decomposition for compact operators on a Hilbert space, but that's different enough that we should have both, not try to shoehorn SVD to fit that mold.
Niels Voss (Aug 05 2025 at 00:09):
Is the Schmidt decomposition in mathlib, or are you just saying that it would be a candidate for a generalization of SVD?
Niels Voss (Aug 05 2025 at 00:30):
Can the Schmidt decomposition be generalized further to infinite dimensions (I only saw the finite-dimensional case on Wikipedia) or to larger tensor products than just two spaces, or is it as generalized as it will probably get?
Niels Voss (Aug 23 2025 at 19:01):
Would singular value decomposition of linear maps between finite-dimensional inner product spaces be on topic for mathlib? Or would it need to be generalized to the infinite-dimensional case? We only have the spectral theorem for linear maps between finite-dimensional spaces.
Niels Voss (Aug 23 2025 at 19:03):
I was considering something like
variable {𝕜 : Type*} [RCLike 𝕜]
variable {E : Type*} [NormedAddCommGroup E] [InnerProductSpace 𝕜 E]
variable {F : Type*} [NormedAddCommGroup F] [InnerProductSpace 𝕜 F]
variable {n : ℕ} (hn : Module.finrank 𝕜 E = n)
variable (T : E →ₗ[𝕜] F)
variable [FiniteDimensional 𝕜 E] [FiniteDimensional 𝕜 F]

noncomputable def LinearMap.singularValues (i : Fin n) : ℝ :=
  Real.sqrt ((LinearMap.isSymmetric_adjoint_comp_self T).eigenvalues hn i)
(where LinearMap.isSymmetric_adjoint_comp_self comes from #Is there code for X? > T*T is positive when domain and codomain are different) This definition has the nice property of preserving multiplicities of singular values, as opposed to something like a HasSingularValue predicate, which wouldn't.
YaΓ«l Dillies (Aug 23 2025 at 19:09):
Niels Voss said:
Would singular value decomposition of linear maps between finite-dimensional inner product spaces be on topic for mathlib? Or would it need to be generalized to the infinite-dimensional case?
Would generalising to the infinite-dimensional case involve throwing away (most of) the finite-dimensional proof? If not, then mathlib will be happy to have finite-dimensional SVD.
Niels Voss (Aug 23 2025 at 19:11):
To be honest, I don't personally know how to generalize SVD to the infinite-dimensional case. Do you know what the "best" generalization is?
Niels Voss (Aug 23 2025 at 20:36):
Also, for T : E -> F, when mathematicians say "list of singular values", does this list generally
- have length rank(T), so that all the singular values are positive
- have length min(dim(E), dim(F))
- have length dim(E) (as it does in LADR 4th edition)
- have countably infinite length, but all but finitely many of the singular values are zero?
I'm trying to decide whether indexing the list of singular values by Fin n or Nat is a better idea (and if the latter, whether or not I should use Finsupp)
Niels Voss (Aug 23 2025 at 20:56):
There's also this alternative formulation from https://en.wikipedia.org/wiki/Singular_value
image.png
Niels Voss (Aug 24 2025 at 00:00):
I changed it to this definition:
variable {𝕜 : Type*} [RCLike 𝕜]
variable {E : Type*} [NormedAddCommGroup E] [InnerProductSpace 𝕜 E]
variable {F : Type*} [NormedAddCommGroup F] [InnerProductSpace 𝕜 F]
variable (T : E →ₗ[𝕜] F)
variable [FiniteDimensional 𝕜 E] [FiniteDimensional 𝕜 F]

noncomputable def LinearMap.singularValues : ℕ →₀ ℝ :=
  Finsupp.embDomain Fin.valEmbedding <|
    Finsupp.ofSupportFinite
      (Real.sqrt ∘ (LinearMap.isSymmetric_adjoint_comp_self T).eigenvalues rfl)
      (Set.toFinite _)
with the characterization
theorem LinearMap.apply_singularValues_of_lt {n : ℕ} (h : n < Module.finrank 𝕜 E)
    : T.singularValues n
      = √((LinearMap.isSymmetric_adjoint_comp_self T).eigenvalues rfl ⟨n, h⟩) := sorry

theorem LinearMap.apply_singularValues_of_le {n : ℕ} (h : Module.finrank 𝕜 E ≤ n)
    : T.singularValues n = 0 := sorry

theorem LinearMap.apply_singularValues_fin {n : ℕ} (hn : Module.finrank 𝕜 E = n) (i : Fin n)
    : T.singularValues ↑i = √((LinearMap.isSymmetric_adjoint_comp_self T).eigenvalues hn i) := sorry
SΓ©bastien GouΓ«zel (Aug 24 2025 at 08:04):
I'd rather allow infinite dimension (which means that, in general, singular values don't form a finitely supported sequence). For instance, the singular values of the Sobolev injections are interesting objects, as they quantify "how much compact" the injection is. More generally, approximation numbers (the inf of the norm of T - L for finite rank L) are very much relevant in this setting, as they make sense for Sobolev spaces not based on L^2.
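In symbols, the n-th approximation number of a bounded operator T : X → Y is (up to indexing conventions)
$$ a_n(T) = \inf\{\, \|T - L\| : L : X \to Y \text{ continuous linear},\ \operatorname{rank} L \le n \,\}. $$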
Jireh Loreaux (Aug 24 2025 at 14:02):
I think defining approximation numbers would be great. The thing to prove, as far as I'm concerned, with respect to SVD would be that for a compact operator there are two orthonormal sequences such that the compact operator is the norm convergent sum of the rank-1 elementary tensors induced by pairing these vectors and multiplying by the approximation numbers.
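In symbols, the statement to aim for is roughly: for a compact operator T between Hilbert spaces there are orthonormal sequences $(e_n)$ and $(f_n)$ such that
$$ T x = \sum_{n} a_n(T)\, \langle x, e_n\rangle\, f_n, $$
with the series converging in operator norm.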
Niels Voss (Aug 24 2025 at 15:06):
Since I don't know the actual math yet: Are approximation numbers defined for continuous linear maps only or all linear maps? And can they be infinite? Should the codomain be Real, NNReal, or ENNReal?
SΓ©bastien GouΓ«zel (Aug 24 2025 at 16:04):
It's only defined for continuous linear maps (for non-continuous linear maps, all the approximation numbers would be infinite, so they wouldn't be interesting at all). Note that in finite dimension, all linear maps are continuous, so there's no loss in generality.
Niels Voss (Aug 24 2025 at 16:06):
To confirm, for non-continuous linear maps, every approximation number is infinite because subtracting a finite-rank linear map will never result in it becoming continuous, and all non-continuous maps have infinite norm, right?
Niels Voss (Aug 24 2025 at 16:09):
Defining it only for continuous linear maps would make it slightly more annoying to use, because the conversion between linear and continuous linear maps is not automatic even for finite dimensions. This is why we have LinearMap.adjoint in addition to ContinuousLinearMap.adjoint
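For concreteness, here is a minimal sketch of that friction, using only Mathlib's existing LinearMap.toContinuousLinearMap (nothing here is specific to singular values):
import Mathlib

-- In finite dimensions the conversion exists, but it has to be invoked explicitly;
-- `LinearMap.toContinuousLinearMap` is the equivalence Mathlib provides.
example {𝕜 E F : Type*} [RCLike 𝕜]
    [NormedAddCommGroup E] [NormedSpace 𝕜 E] [FiniteDimensional 𝕜 E]
    [NormedAddCommGroup F] [NormedSpace 𝕜 F]
    (T : E →ₗ[𝕜] F) : E →L[𝕜] F :=
  LinearMap.toContinuousLinearMap T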
SΓ©bastien GouΓ«zel (Aug 24 2025 at 16:25):
I would argue that it's more natural to define it only for continuous linear maps, even in finite dimension: it's an analytical notion, depending on the norm, so naturally the realm of continuous linear maps.
Jireh Loreaux (Aug 24 2025 at 16:56):
SΓ©bastien, I think that's only half true. People will certainly want SVD for matrices without caring much about the norm / topology.
Niels Voss (Aug 24 2025 at 16:58):
If the only cost is that people have to do T.toContinuousLinearMap.singularValues instead of T.singularValues it isn't that bad
SΓ©bastien GouΓ«zel (Aug 24 2025 at 17:17):
I think that for matrices we should have a shortcut indeed. Note that matrices are not the same as linear maps: to me, one should have them for continuous linear maps because that's the right conceptual framework, and then for matrices because it's used so often.
Niels Voss (Aug 24 2025 at 17:19):
And nothing for regular linear maps, right?
Jireh Loreaux (Aug 24 2025 at 20:35):
Great, sounds like we're on the same page then.
Kim Morrison (Aug 24 2025 at 22:56):
I love the plan, but I hope that if there is not an implementation plan (i.e. a person saying they are doing it in a time frame), this would not get in the way of someone contributing an implementation for matrices, for which there is demand and has been for a while. I think it should suffice for an implementation for matrices to include a module doc link to this thread, giving a pointer to the proposed upgrade path.
Niels Voss (Aug 26 2025 at 17:34):
Should the codomain of ContinuousLinearMap.singularValue be Real or NNReal?
Jireh Loreaux (Aug 26 2025 at 18:14):
I would be tempted to define it as ℕ → ℝ≥0 as subtraction plays a lesser role in the context of singular values. Nonnegativity is more important I think.
Niels Voss (Aug 26 2025 at 18:55):
Does this look like a reasonable definition then?
/--
When the vector spaces are not necessarily finite, these are called the approximation numbers.
Note: we use `≤ n` instead of `< n` because we start the singular values at 0, not 1, so if we did
`< n` the first singular value would be repeated.
-/
noncomputable def ContinuousLinearMap.singularValue {𝕜 : Type*} [NontriviallyNormedField 𝕜]
    {X : Type*} [SeminormedAddCommGroup X] [NormedSpace 𝕜 X]
    {Y : Type*} [SeminormedAddCommGroup Y] [NormedSpace 𝕜 Y]
    (T : X →L[𝕜] Y) (n : ℕ) : ℝ≥0 :=
  -- NOTE: We can't use `⨅ H ∈ {...}` because of
  -- https://leanprover-community.github.io/extras/pitfalls.html#accidental-double-iinf-or-isup
  ⨅ H : {S : X →L[𝕜] Y // S.rank ≤ n}, ‖T - H‖₊
Niels Voss (Aug 28 2025 at 05:58):
What way would you recommend to define singular values for matrices? For a matrix M, I can either say that
- The singular values of M are the singular values of the operator T produced by converting M to a linear map between EuclideanSpace k m and EuclideanSpace k n (probably with docs#Matrix.toEuclideanLin). This is basically the approach Matrix.IsHermitian.eigenvalues₀ uses. Requires M to be finite (or at least needs M to have a finite number of columns) and the field to be RCLike.
- The singular values of M are defined by inf {||M - N|| : rank(N) < n}, where we use the norm defined in Mathlib.Analysis.CStarAlgebra.Matrix. Requires M to be finite and the field to be RCLike.
- The singular values of M are defined by the square roots of the eigenvalues of T*T, produced by docs#Matrix.IsHermitian.eigenvalues₀. Requires M to be finite and the field to be RCLike.
All of these approaches only work for finite-dimensional Real and Complex. Between these three approaches, which is the best, and is there a different approach that can be used? If only finite matrices are supported, should the definition be in Finsupp, like above?
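For comparison, a hedged sketch of what the third option could look like, using only existing Mathlib declarations (Matrix.isHermitian_conjTranspose_mul_self and Matrix.IsHermitian.eigenvalues); the name singularValuesViaEigenvalues is made up, and indexing by m rather than by Fin _ is just one possible choice:
import Mathlib

-- Hypothetical name: singular values of M as square roots of the eigenvalues of Mᴴ * M.
noncomputable def Matrix.singularValuesViaEigenvalues {𝕜 n m : Type*} [RCLike 𝕜]
    [Fintype n] [Fintype m] [DecidableEq m] (M : Matrix n m 𝕜) (i : m) : ℝ :=
  Real.sqrt ((Matrix.isHermitian_conjTranspose_mul_self M).eigenvalues i)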
Niels Voss (Aug 28 2025 at 06:12):
The first option would probably be something like
open NNReal in
noncomputable def Matrix.singularValue {𝕜 : Type*} [RCLike 𝕜]
    {n m : Type*} [Fintype n] [Fintype m] [DecidableEq m] (M : Matrix n m 𝕜)
    : ℕ → ℝ≥0 := M.toEuclideanLin.toContinuousLinearMap.singularValue
or
open NNReal in
noncomputable def Matrix.singularValue {𝕜 : Type*} [RCLike 𝕜]
    {n m : Type*} [Fintype n] [Fintype m] [DecidableEq m] (M : Matrix n m 𝕜)
    : ℕ →₀ ℝ≥0 :=
  Finsupp.ofSupportFinite
    M.toEuclideanLin.toContinuousLinearMap.singularValue
    sorry
(I can probably get rid of the [DecidableEq m] since it is noncomputable anyways)
Niels Voss (Aug 30 2025 at 20:04):
Jireh Loreaux said:
I think defining approximation numbers would be great. The thing to prove, as far as I'm concerned, with respect to SVD would be that for a compact operator there are two orthonormal sequences such that the compact operator is the norm convergent sum of the rank-1 elementary tensors induced by pairing these vectors and multiplying by the approximation numbers.
Sorry if this is a stupid question, but since I don't know the field that well, can you confirm whether this holds true between arbitrary normed spaces, arbitrary Banach spaces, Banach spaces that satisfy the Approximation Property, or just Hilbert spaces? And would a proof for just Hilbert spaces be accepted, or would it need to be as general as possible? My understanding is that the standard proof for Hilbert spaces depends on the spectral theorem for Hilbert spaces, which does not yet exist in mathlib. Also, does this sum converge conditionally, unconditionally but not absolutely, or absolutely?
Jireh Loreaux (Aug 30 2025 at 20:42):
The fact about the existence of orthonormal sequences only holds in the context of Hilbert spaces. @FrΓ©dΓ©ric Dupuis and @Heather Macbeth at one point had the spectral theorem for compact (selfadjoint? probably not normal) operators, but I think it may be Lean 3 code and porting it was never finished. It would be nice to make that available. The sum converges in operator norm.
FrΓ©dΓ©ric Dupuis (Aug 30 2025 at 21:01):
Yes, I haven't forgotten about this! :-) Indeed it's Lean 3 code, and it was written before we had compact operators in mathlib, so it's based on a less general definition that had just enough API to get through this proof. Every once in a while I tell myself I should port it, and then I get distracted by other things.
Niels Voss (Aug 30 2025 at 21:04):
In that case, is SVD most likely blocked behind the spectral theorem? I don't mind postulating the spectral theorem for the SVD so that they can be worked on in parallel.
Niels Voss (Aug 31 2025 at 18:59):
I have the following extremely rough draft of the statement of SVD for matrices (assuming singular values have already been defined for linear maps as above). The names and which parameters are implicit will certainly need to change. But I want to know if I'm at all on the right track or not.
The idea is that if we are clever, we can make the standard SVD, thin SVD, compact SVD, and maybe even truncated SVD all the same definition. Given an m x n matrix M, we decompose M into M = UΣV* where
- U is m x a
- Σ is a x b
- V* is b x n
for some a and b.
- If a = m and b = n, then this gives the standard SVD.
- If a = b = min(m,n), then this gives the thin SVD.
- If a = b = rank(M), then this gives the compact SVD.
- If a = b < rank(M), then this gives the truncated SVD.
However, I wasn't sure how to handle the fact that m,n,a,b are arbitrary, not-necessarily ordered, finite types. To do this, I postulate injective functions f : a -> Nat and g : b -> Nat, which will normally be set to Fin.val in case a and b are Fin _. An alternative option would be to postulate r : Nat and f : Fin r -> a, g : Fin r -> b.
In any case, I would like to know if this is at all a reasonable approach or if something major needs to be changed.
open NNReal in
noncomputable def Matrix.singularValue {𝕜 : Type*} [RCLike 𝕜]
    {n m : Type*} [Fintype n] [Fintype m] [DecidableEq m] (M : Matrix n m 𝕜)
    : ℕ → ℝ≥0 :=
  M.toEuclideanLin.toContinuousLinearMap.singularValue

-- TODO: Fix names
noncomputable def Matrix.leftsing {𝕜 : Type*} [RCLike 𝕜]
    {m n : Type*} [Fintype m] [Fintype n] [DecidableEq n]
    (a : Type*) [Fintype a] (f : a → ℕ)
    (M : Matrix m n 𝕜)
    : Matrix m a 𝕜 := sorry

noncomputable def Matrix.rightsing {𝕜 : Type*} [RCLike 𝕜]
    {m n : Type*} [Fintype m] [Fintype n] [DecidableEq n]
    (b : Type*) [Fintype b] (g : b → ℕ)
    (M : Matrix m n 𝕜)
    : Matrix n b 𝕜 := sorry

noncomputable def Matrix.centersing {𝕜 : Type*} [RCLike 𝕜]
    {m n : Type*} [Fintype m] [Fintype n] [DecidableEq n]
    (M : Matrix m n 𝕜)
    (a : Type*) [Fintype a] (f : a → ℕ) (b : Type*) (g : b → ℕ)
    : Matrix a b 𝕜 := Matrix.of
  fun i j ↦ if f i = g j then RCLike.ofReal (M.singularValue (f i)).toReal else 0

theorem Matrix.SVD {𝕜 : Type*} [RCLike 𝕜]
    {m n : Type*} [Fintype m] [Fintype n] [DecidableEq n]
    (a : Type*) [Fintype a] (f : a → ℕ) (hf : f.Injective)
    (b : Type*) [Fintype b] (g : b → ℕ) (hg : g.Injective)
    (M : Matrix m n 𝕜)
    (hf₁ : M.singularValue.support ⊆ Set.range f) (hg₁ : M.singularValue.support ⊆ Set.range g)
    : M.leftsing a f * M.centersing a f b g * (M.rightsing b g).conjTranspose = M := sorry
Niels Voss (Aug 31 2025 at 19:14):
Also, most of the time when SVD is used in proofs, it seems to go like "Let UΞ£V* be a singular value decomposition, ...". Are there any proofs which go "Notice that UΞ£V* meets the criteria for a singular value decomposition of M, so it must satisfy property X"?
Niels Voss (Sep 02 2025 at 05:34):
@FrΓ©dΓ©ric Dupuis Is this the most recent version of your spectral theorem code? https://github.com/leanprover-community/mathlib3/tree/dupuisf/linear_map_spectral I just want to get a general sense of what the API will look like for the spectral theorem
FrΓ©dΓ©ric Dupuis (Sep 02 2025 at 12:57):
I think the latest one is https://github.com/leanprover-community/mathlib3/tree/semilinear-paper-update
Niels Voss (Sep 03 2025 at 21:41):
I've had a lot of difficulty finding what I consider "good" definitions for singular values and SVD. I think I've settled on a few definitions that I consider reasonably optimal, but would it be possible to request a review of these definitions and theorem statements before I commit a bunch of time to trying to prove them? Or would it be best to try proving them first, and then get review of the definitions during the PR review?
Ruben Van de Velde (Sep 03 2025 at 22:28):
You free to ask, but I suspect one of the questions will be "how easy is it to prove the first few dozen basic lemmas about the definition"
Li Jiale (Sep 04 2025 at 02:12):
(deleted)
Niels Voss (Sep 05 2025 at 18:50):
There is a more general version of SVD which is defined for arbitrary bounded operators between Hilbert spaces, as described in https://www.mdpi.com/2227-7390/8/8/1346. However, I don't think that mathlib has the necessary tools to prove this (in particular, we need spectral theorem for bounded self-adjoint operators between Hilbert spaces). Is it fine if I just work on the compact operator SVD for Hilbert spaces instead of trying to do the most general version?
Jireh Loreaux (Sep 05 2025 at 20:30):
- I think above I mentioned explicitly the compact case, so yes, go with that.
- The SVE mentioned in that paper I think is actually not particularly useful, which I argue is essentially because H →L[ℂ] H is type I as a von Neumann algebra, and so what we really want is atomic (i.e., a sequence), not some measurable function.
- In the setting of type II von Neumann algebras there is a function μ : Set.Ici 0 → ℝ≥0∞ (which is finite if the von Neumann algebra is finite) which plays the same role as the singular values. In the case when the algebra is infinite, μ (just like the singular values) is only interesting in the case when the operator is "compact" (i.e., is a norm limit of operators whose ranges have finite trace).
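(For reference, and loosely stated: that function is the Fack–Kosaki generalized singular value function; for a semifinite trace τ one can take
$$ \mu_t(T) = \inf\{\, \|T p\| : p \text{ a projection with } \tau(1 - p) \le t \,\}, \qquad t \ge 0, $$
which in the case of B(H) with the standard trace recovers the usual singular value sequence as a step function.)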
Jireh Loreaux (Sep 05 2025 at 20:31):
@Niels Voss where are your definitions exactly?
Niels Voss (Sep 05 2025 at 21:12):
I can post the formal definitions in a second but in the meantime here's an informal sketch that I wrote up in obsidian
image.png
Jireh Loreaux (Sep 05 2025 at 21:18):
seems reasonable.
Niels Voss (Sep 05 2025 at 21:29):
Here's what I have right now for formal definitions. (I'm not sure about the isometries in the "SVD" section, the idea is that maybe they could be used to prove a more familiar version of SVD. I'm considering deleting the ContinuousLinearMap.___SingularMatrix definitions because I think they are redundant with the Matrix.___SingularMatrix definitions.)
import Mathlib

open NNReal in
noncomputable def ContinuousLinearMap.singularValue {𝕜 : Type*} [NontriviallyNormedField 𝕜]
    {X : Type*} [SeminormedAddCommGroup X] [NormedSpace 𝕜 X]
    {Y : Type*} [SeminormedAddCommGroup Y] [NormedSpace 𝕜 Y]
    (T : X →L[𝕜] Y) (n : ℕ) : ℝ≥0 :=
  -- NOTE: We can't use `⨅ H ∈ {...}` because of
  -- https://leanprover-community.github.io/extras/pitfalls.html#accidental-double-iinf-or-isup
  -- Also, this involves a coercion Nat → Cardinal, so I might redo that part of the definition
  -- later.
  ⨅ H : {S : X →L[𝕜] Y // S.rank ≤ n}, ‖T - H‖₊

open NNReal in
noncomputable def Matrix.singularValue {𝕜 : Type*} [RCLike 𝕜]
    {n m : Type*} [Fintype n] [Fintype m] [DecidableEq m] (M : Matrix n m 𝕜)
    : ℕ → ℝ≥0 :=
  M.toEuclideanLin.toContinuousLinearMap.singularValue

section SVD

-- TODO: When I wrote up this section for the first time, I forgot that this only works for compact
-- linear maps, and not just continuous linear maps. So the following definitions will need an
-- extra hypothesis stating T is compact, and will probably live in the IsCompactOperator namespace
-- instead of the ContinuousLinearMap namespace.
variable {𝕜 : Type*} [RCLike 𝕜]
  {E : Type*} [NormedAddCommGroup E] [InnerProductSpace 𝕜 E] [CompleteSpace E]
  {F : Type*} [NormedAddCommGroup F] [InnerProductSpace 𝕜 F] [CompleteSpace F]
  (T : E →L[𝕜] F)

noncomputable def ContinuousLinearMap.rightSingularVector : ℕ → E :=
  have := T -- temporary trick to make sure that the definition includes T as a parameter
  sorry

noncomputable def ContinuousLinearMap.leftSingularVector : ℕ → F :=
  have := T
  sorry

-- TODO: Rename
open InnerProductSpace in
theorem ContinuousLinearMap.SVD (v : E)
    : T v = ∑' n : ℕ, (RCLike.ofReal ↑(T.singularValue n) : 𝕜) •
      ⟪v, T.rightSingularVector n⟫_𝕜 • T.leftSingularVector n := sorry

/--
The first `dim(E)` vectors of `ContinuousLinearMap.rightSingularVector` but bundled into an
orthonormal basis.
-/
noncomputable def ContinuousLinearMap.rightSingularOrthonormalBasis
    [FiniteDimensional 𝕜 E] {n : ℕ} (hn : Module.finrank 𝕜 E = n) : OrthonormalBasis (Fin n) 𝕜 E :=
  OrthonormalBasis.mk (v := T.rightSingularVector ∘ Fin.val) sorry sorry

/--
The first `dim(F)` vectors of `ContinuousLinearMap.leftSingularVector` but bundled into an
orthonormal basis.
-/
noncomputable def ContinuousLinearMap.leftSingularOrthonormalBasis
    [FiniteDimensional 𝕜 F] {m : ℕ} (hn : Module.finrank 𝕜 F = m) : OrthonormalBasis (Fin m) 𝕜 F :=
  OrthonormalBasis.mk (v := T.leftSingularVector ∘ Fin.val) sorry sorry

end SVD

section MatrixSVD

variable {𝕜 : Type*} [RCLike 𝕜]
  {m n : Type*} [Fintype m] [Fintype n] [DecidableEq n]
  -- a and b should be thought of as subtypes of ℕ.
  {a b : Type*} [Fintype a] [Fintype b]
  (M : Matrix m n 𝕜)

variable
  {E : Type*} [NormedAddCommGroup E] [InnerProductSpace 𝕜 E] [CompleteSpace E]
  {F : Type*} [NormedAddCommGroup F] [InnerProductSpace 𝕜 F] [CompleteSpace F]
  [FiniteDimensional 𝕜 E] [FiniteDimensional 𝕜 F]
  (T : E →L[𝕜] F)

noncomputable def ContinuousLinearMap.rightSingularMatrix
    (rb : Module.Basis n 𝕜 E) (g : b ↪ ℕ)
    : Matrix n b 𝕜 := Matrix.of fun i j ↦ rb.repr (T.rightSingularVector (g j)) i

noncomputable def ContinuousLinearMap.leftSingularMatrix
    (lb : Module.Basis m 𝕜 F) (f : a ↪ ℕ)
    : Matrix m a 𝕜 := Matrix.of fun i j ↦ lb.repr (T.leftSingularVector (f j)) i

noncomputable def ContinuousLinearMap.centerSingularMatrix
    (f : a ↪ ℕ) (g : b ↪ ℕ)
    : Matrix a b 𝕜 := Matrix.of fun i j ↦
  if f i = g j then RCLike.ofReal ↑(T.singularValue (f i)) else 0

theorem ContinuousLinearMap.matrixSVD
    (lb : Module.Basis m 𝕜 F) (rb : Module.Basis n 𝕜 E) (f : a ↪ ℕ) (g : b ↪ ℕ)
    (hf : T.singularValue.support ⊆ Set.range f) (hg : T.singularValue.support ⊆ Set.range g)
    : T.toMatrix rb lb
      = T.leftSingularMatrix lb f * T.centerSingularMatrix f g *
        (T.rightSingularMatrix rb g).conjTranspose := sorry

noncomputable def Matrix.rightSingularMatrix
    (g : b ↪ ℕ)
    : Matrix n b 𝕜 :=
  M.toEuclideanLin.toContinuousLinearMap.rightSingularMatrix (Pi.basisFun 𝕜 n) g

noncomputable def Matrix.leftSingularMatrix
    (f : a ↪ ℕ)
    : Matrix m a 𝕜 :=
  M.toEuclideanLin.toContinuousLinearMap.leftSingularMatrix (Pi.basisFun 𝕜 m) f

noncomputable def Matrix.centerSingularMatrix
    (f : a ↪ ℕ) (g : b ↪ ℕ)
    : Matrix a b 𝕜 :=
  M.toEuclideanLin.toContinuousLinearMap.centerSingularMatrix f g

theorem Matrix.SVD
    (f : a ↪ ℕ) (g : b ↪ ℕ)
    (hf : M.singularValue.support ⊆ Set.range f) (hg : M.singularValue.support ⊆ Set.range g)
    : M = M.leftSingularMatrix f * M.centerSingularMatrix f g *
      (M.rightSingularMatrix g).conjTranspose := sorry

end MatrixSVD
Niels Voss (Sep 05 2025 at 21:33):
The motivation for the nonstandard definition of the matrix SVD is that it is actually general enough to encompass the standard SVD, thin SVD, compact SVD, and truncated SVD at the same time.
Niels Voss (Sep 05 2025 at 21:39):
Here's the way it works:
Let M be an m×n matrix, and let a and b be natural numbers. Then, if rank(M) <= a and rank(M) <= b, then M = UΣV*, where:
- U is m×a (columns of U come from the left singular vectors f_1, f_2, ...)
- Σ is a×b (entries of Σ come from the singular values of M)
- V* is b×n (columns of V come from the right singular vectors e_1, e_2, ...)
We can recover some of the original forms of SVD as follows:
- Full SVD: Set a = m, b = n
- Thin SVD: Set a = b = min(m,n)
- Compact SVD: Set a = b = rank(M)
- Truncated SVD: Set a = b = t < rank(M), except M = UΣV* won't hold.
Niels Voss (Sep 15 2025 at 16:35):
Since mathlib seems to prefer smaller PRs, could we for instance PR just the definition of the singular values and basic theorems, or would the whole SVD need to be PR'd at once?
Jireh Loreaux (Sep 15 2025 at 16:52):
Surely the PR for singular values can be on its own.
Niels Voss (Sep 15 2025 at 16:54):
Should we have a theorem which states that for the finite dimensional case, the squares of the singular values are the LinearMap.IsSymmetric.eigenvalues of T*T, or should we wait until that gets generalized to infinite dimensions?
Jireh Loreaux (Sep 15 2025 at 17:07):
I think it's fine to add the finite dimensional version, because the infinite dimensional version will need a compactness assumption anyway.
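Spelled out, the finite-dimensional statement would be roughly
$$ s_n(T) = \sqrt{\lambda_n(T^*T)}, $$
with the eigenvalues of T*T listed with multiplicity (in a fixed order), i.e. the quantity whose square roots the draft definitions above take.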
Arnav Mehta (Oct 09 2025 at 20:44):
Hey, my name is Arnav. I am working with @Niels Voss, and we are wondering if we should get singular values for matrices in at the same time as we PR singular values for linear maps, and also where in the Mathlib folder structure this file should be placed.
Jireh Loreaux (Oct 10 2025 at 13:55):
I would request that the singular values for matrices comes as a separate PR, especially to reduce the review burden on the initial PR. Getting definitions right is always hard, so making that initial PR do as few things as possible is better.
Jireh Loreaux (Oct 10 2025 at 13:56):
As for where the file should be placed, I think Analysis.Normed.Operator is probably a good folder for this.
Niels Voss (Nov 23 2025 at 08:39):
Sorry for the delays in working on this. Just a few super minor (though hopefully not bike-shedding) questions about the definition of singular values because these are slightly more annoying to change later:
- Should the file be SingularValues.lean or SingularValue.lean or ApproximationNumbers.lean?
- Should it be called ContinuousLinearMap.singularValues or ContinuousLinearMap.singularValue or ContinuousLinearMap.approximationNumbers?
- It currently uses the SeminormedAddCommGroup typeclass. Should it use NormedAddCommGroup instead? (I chose to use the most general one, because that seems like the Mathlib convention)
- Right now it is defined as
public noncomputable def ContinuousLinearMap.singularValue {𝕜 : Type*} [NontriviallyNormedField 𝕜]
    {X : Type*} [SeminormedAddCommGroup X] [NormedSpace 𝕜 X]
    {Y : Type*} [SeminormedAddCommGroup Y] [NormedSpace 𝕜 Y]
    (T : X →L[𝕜] Y) (n : ℕ) : ℝ≥0 :=
  ⨅ H : {S : X →L[𝕜] Y // S.rank ≤ ↑n}, ‖T - H‖₊
where the ↑n casts n to a Cardinal. But in this topic: #mathlib4 > How to express a linear map has finite rank? some people criticized the use of Cardinal. Should I attempt to avoid Cardinal in the definition?
- Regarding the module system, should the definition be exposed or not? If not, is there an attribute that automatically generates equalities?
Kevin Buzzard (Nov 23 2025 at 14:27):
Because the rank can be infinite, if you want to avoid cardinals you'd have to say "S has finite rank and the finrank <= n" -- what you have looks more natural to me (and less of a mouthful). You might want to make some API to avoid having to think about cardinals e.g. a theorem saying "if S has finite rank and finrank <= n then singularValue <= ||T-H||_+" or maybe even the theorem saying that it's the inf of those things.
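In formula form, the suggested cardinal-free API would read roughly: for a continuous linear map S of finite rank,
$$ \operatorname{finrank}(\operatorname{range} S) \le n \;\Longrightarrow\; s_n(T) \le \|T - S\|, $$
together with a lemma saying that $s_n(T)$ is the infimum of $\|T - S\|$ over all such S.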
Jireh Loreaux (Nov 23 2025 at 18:41):
- Analysis/Normed/Operator/SingularValue.lean seems nice.
- I think singularValue is nice for discoverability. I've only heard approximation numbers in relation to operators on Banach spaces, not for matrices.
- Seminormed is the right choice.
- Cardinal is fine because we have no notion of erank at this point.
Niels Voss (Nov 23 2025 at 18:43):
And singularValue is preferred over singularValues? I know that there's something like LinearMap.IsSymmetric.eigenvalues, which is plural
Jireh Loreaux (Nov 23 2025 at 18:44):
That's a good point. I don't have a strong opinion about pluralization.
Levi G (Dec 07 2025 at 14:13):
Hello everyone. My name is Levi, a Statistics Student in 3rd year going to 4th year. Previously in another PR, I had attempted to formalize SVD with the help of AI but my code was not general enough and the PR did not meet the Mathlib quality standards.
Levi G (Dec 07 2025 at 14:15):
However, I have made significant progress. I believe I have something significantly better than I had before, and that I have addressed most of the main issues raised by @Jireh Loreaux in a different thread.
Levi G (Dec 07 2025 at 14:17):
According to what I was told by @Jireh Loreaux, @Niels Voss is the one working on SVD, and according to some feedback I got from others, the ideal way of contributing to Mathlib is to discuss the design of the code with others and come up with small snippets of code.
Levi G (Dec 07 2025 at 14:19):
I would like to share that progress and get your feedback especially on whether it is the right way of doing things.
Levi G (Dec 07 2025 at 14:23):
The code is quite large, which I had been told is not the ideal way of doing things. Anyway, here it is:
T17.1 - PR Prep SVD.lean
Niels Voss (Dec 07 2025 at 22:21):
Hi, for the reasons discussed in this thread, the best formalization of singular values are the approximation numbers defined in #32126. My current plan is to connect these approximation numbers with the square roots of eigenvalues of T*T for a compact operator T between Hilbert Spaces, but that proof is relatively complicated, which is why the PR is currently unfinished.
Levi G (Dec 08 2025 at 04:32):
Niels Voss said:
Hi, for the reasons discussed in this thread, the best formalization of singular values are the approximation numbers defined in #32126. My current plan is to connect these approximation numbers with the square roots of eigenvalues of T*T for a compact operator T between Hilbert Spaces, but that proof is relatively complicated, which is why the PR is currently unfinished.
Ok, understood. Thank you for the feedback. You are working on the infinite-dimensional case, which is much more difficult. But what do you think of my proof for the finite-dimensional case? (Is it at least directionally useful?) I can try to fix the defs, especially for singular values.
Levi G (Dec 09 2025 at 22:16):
@Niels Voss Here is a blueprint I have started working on that is in line with your approach (though it has quite a few sorrys). What do you think? Is it the right approach and structure?
Levi G (Dec 09 2025 at 22:45):
Levi G (Dec 10 2025 at 00:15):
The above :point_up: is work in progress, hence the many sorrys. Would love to hear your feedback.
Johan Commelin (Dec 10 2025 at 04:37):
What happened with your claim to drop this project?
Levi G said:
Johan Commelin Understood. It's just that I was hoping for some more feedback because this latest version is significantly better than anything that I had given before and also attempted to address their feedback. I hope you can at least take a look at this one. However, I have understood your point, it is a minefield so I will just stop pressing on this matter, hold-off completely and move on to other projects. Thank you for your feedback.
Snir Broshi (Dec 12 2025 at 02:07):
Another SVD implementation #mathlib4 > Matrix analysis
Levi G (Dec 12 2025 at 20:14):
Johan Commelin said:
What happened with your claim to drop this project?
Levi G said: Johan Commelin Understood. It's just that I was hoping for some more feedback because this latest version is significantly better than anything that I had given before and also attempted to address their feedback. I hope you can at least take a look at this one. However, I have understood your point, it is a minefield so I will just stop pressing on this matter, hold-off completely and move on to other projects. Thank you for your feedback.
Sorry @Johan Commelin. After coming up with the finite-dimensional SVD, I couldn't help myself. I wanted feedback on it and wanted to be of assistance to @Niels Voss, who is working on the more general and difficult case of SVD for compact operators on Hilbert spaces and its API. So I couldn't help myself. As of right now I have made significant progress on the general case and will soon be sharing that progress, if that's alright.
Levi G (Dec 12 2025 at 20:23):
The progress being:
- All s-number axioms proven (S2, S3 left/right)
- Rayleigh Quotient Theorem - infinite-dimensional case for compact self-adjoint operators
- Spectral Theorem for compact self-adjoint operators on Hilbert spaces
- Compactness characterization of operators
Levi G (Dec 12 2025 at 20:43):
It's a big file though.
Jireh Loreaux (Dec 12 2025 at 21:39):
@Levi G the fact that it's a big file is part of the problem. At this point, you don't have experience contributing to Mathlib, so a contribution of this magnitude is very hard to review. If you start with smaller projects and learn good design patterns from receiving review, then it gets easier because your code is higher quality. Let me highlight one example as evidence:
theorem IsCompactOperator.eigenvalue_tendsto_zero
    (hT : IsCompactOperator T) (_hSA : IsSelfAdjoint T)
    {μ : ℕ → ℝ} {e : ℕ → E}
    (he_orth : Orthonormal 𝕜 e)
    (he_eigen : ∀ n, T (e n) = (μ n : 𝕜) • (e n))
    (hμ_antitone : Antitone (fun n ↦ |μ n|)) :
    Tendsto (fun n ↦ |μ n|) atTop (𝓝 0) := by
This theorem has nothing to do with selfadjoint operators, or real eigenvalues, or Antitone in any fundamental way. That is, what's really going on here is that since e is orthonormal, it converges weakly to zero, and compact operators take weakly convergent sequences to norm convergent ones.
Again, I would love for you to become a productive member of the community, but please start with smaller projects. It may even be the case that there are pieces of your code worth salvaging, but I don't have the energy or the time to try to sift through 2000 lines of code looking for a diamond in the rough.
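In symbols: if $(e_n)$ is orthonormal then $\langle x, e_n\rangle \to 0$ for every $x$ by Bessel's inequality, so $e_n \rightharpoonup 0$ weakly, and
$$ e_n \rightharpoonup 0 \ \text{ and } \ T \text{ compact} \ \Longrightarrow\ \|T e_n\| \to 0, $$
which gives the displayed statement at once, since $\|T(e_n)\| = |\mu_n|\,\|e_n\| = |\mu_n|$.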
Levi G (Dec 12 2025 at 22:20):
@Jireh Loreaux Thank you for your feedback and time spent reviewing my code. Long, bad-quality code is hard to review - Understood. My new file is around 3k lines - too long.
Levi G (Dec 12 2025 at 22:24):
In light of the difficulty of reviewing long, low-quality code, I'll share what I think are the parts of interest in my new code: the Rayleigh Quotient Theorem and Spectral Theorem for Compact Self-Adjoint Operators on Hilbert Spaces. Then I'll hold off from the project (finally) and move on to smaller projects where I can improve.
Levi G (Dec 12 2025 at 22:51):
/-- Key Lemma 2: For a nonzero compact self-adjoint operator, `‖T‖` or `-‖T‖` is an eigenvalue.

The proof uses the Rayleigh quotient: for self-adjoint `T`, the operator norm is the
supremum of the absolute Rayleigh quotient on the unit sphere. For compact `T`, we
build a maximizing sequence on the unit sphere and use compactness to extract a
convergent subsequence, whose limit is an eigenvector.

This theorem is the compact self-adjoint case that Mathlib's `Rayleigh.lean` leaves
as a TODO; here we supply a standalone implementation using `hOpNorm_eq_sSup_absRayleigh`
(proven via the Polarization Identity) as the key spectral-theoretic input.
-/
theorem IsSelfAdjoint.exists_norm_eq_abs_eigenvalue (hSA : IsSelfAdjoint T)
    (hT : IsCompactOperator T) (hT_ne : T ≠ 0) :
    ∃ (μ : ℝ) (e : E), ‖e‖ = 1 ∧ T e = (μ : 𝕜) • e ∧ |μ| = ‖T‖ := by
  -- ───────────────────────────────────────────────────────────────────────────
  -- PROOF STRATEGY (completing the TODO in Mathlib's Rayleigh.lean):
  -- ───────────────────────────────────────────────────────────────────────────
  --
  -- For a nonzero compact self-adjoint operator T:
  -- 1. Either sup or inf of the Rayleigh quotient has absolute value = ‖T‖
  -- 2. Take a maximizing (or minimizing) sequence on the unit sphere
  -- 3. Use compactness to extract a convergent subsequence of T(xₙ)
  -- 4. Show the sequence xₙ itself converges using ‖Txₙ - μxₙ‖ → 0
  -- 5. The limit is an eigenvector with eigenvalue μ where |μ| = ‖T‖
  --
  -- Key insight: For self-adjoint T and maximizing sequence xₙ with ⟪Txₙ,xₙ⟫ → M:
  --   ‖Txₙ - M·xₙ‖² = ‖Txₙ‖² - 2M⟪Txₙ,xₙ⟫ + M²
  --                 = ⟪T²xₙ,xₙ⟫ - 2M⟪Txₙ,xₙ⟫ + M² → 0
  -- because ⟪T²xₙ,xₙ⟫ ≤ ‖T‖·⟪Txₙ,xₙ⟫ → ‖T‖·M and M = ‖T‖ or M = -‖T‖.
  -- ───────────────────────────────────────────────────────────────────────────
  -- We use hasEigenvector_of_isMaxOn/isMinOn from Rayleigh.lean
  -- The key is to show the Rayleigh quotient attains its extremum for compact T
  -- Step 1: Define the Rayleigh quotient supremum and infimum
  let M_sup := ⨆ x : { x : E // x ≠ 0 }, T.rayleighQuotient x
  let M_inf := ⨅ x : { x : E // x ≠ 0 }, T.rayleighQuotient x
  -- Step 2: For self-adjoint T, ‖T‖ = max(|M_sup|, |M_inf|)
  -- So either |M_sup| = ‖T‖ or |M_inf| = ‖T‖
  -- Step 3: The closure of T applied to the unit sphere is compact
  have hK : IsCompact (closure (T '' Metric.sphere (0 : E) 1)) := by
    have h1 : Metric.sphere (0 : E) 1 ⊆ Metric.closedBall 0 1 := Metric.sphere_subset_closedBall
    have h2 : T '' Metric.sphere (0 : E) 1 ⊆ T '' Metric.closedBall 0 1 := Set.image_mono h1
    have h3 : closure (T '' Metric.sphere (0 : E) 1) ⊆ closure (T '' Metric.closedBall 0 1) :=
      closure_mono h2
    exact (hT.isCompact_closure_image_closedBall 1).of_isClosed_subset isClosed_closure h3
  -- Step 4: Get a maximizing sequence and use compactness
  -- Since T ≠ 0, the sphere is nonempty and there exist nonzero vectors
  have hE_nontrivial : Nontrivial E := by
    by_contra h
    have : Subsingleton E := not_nontrivial_iff_subsingleton.mp h
    have hT_eq : T = 0 := by ext x; simp [Subsingleton.eq_zero x]
    exact hT_ne hT_eq
  obtain ⟨v₀, hv₀⟩ : ∃ v₀ : E, v₀ ≠ 0 := exists_ne 0
  have hSphere_nonempty : (Metric.sphere (0 : E) 1).Nonempty := by
    use (‖v₀‖⁻¹ : 𝕜) • v₀
    simp only [Metric.mem_sphere, dist_zero_right, norm_smul]
    rw [norm_inv, RCLike.norm_coe_norm, inv_mul_cancel₀]
    exact norm_ne_zero_iff.mpr hv₀
  -- The proof proceeds by finding where the Rayleigh quotient achieves its maximum or minimum
  -- Using sequential compactness from the compact operator
  -- For the full proof, we need to:
  -- 1. Take a sequence xₙ in the unit sphere with ⟪Txₙ, xₙ⟫ approaching sup (or inf)
  -- 2. Extract a convergent subsequence of Txₙ using compactness
  -- 3. Show xₙ converges using the self-adjoint property
  -- 4. Apply hasEigenvector_of_isMaxOn or hasEigenvector_of_isMinOn
  -- The Rayleigh quotient is continuous
  have hRayleigh_cont : Continuous T.reApplyInnerSelf := T.reApplyInnerSelf_continuous
  -- For compact operators, we use sequential compactness
  -- The image of unit sphere under T is relatively compact
  -- Key technical step: Show that a maximizing sequence converges
  -- This requires showing ‖Txₙ - μxₙ‖ → 0 for the limiting eigenvalue μ
  -- We show existence by cases: either the sup or the inf gives us ‖T‖
  -- Case analysis: we consider whether sup ≥ |inf| or inf ≤ -|sup|
  -- In either case, we get an eigenvalue with absolute value ‖T‖
  -- The detailed proof requires careful analysis of the maximizing sequence.
  -- Here we provide the complete argument:
  -- First, note that for any unit vector x: |⟪Tx, x⟫| ≤ ‖T‖
  have hRayleigh_bound : ∀ x : E, ‖x‖ = 1 → |RCLike.re ⟪T x, x⟫| ≤ ‖T‖ := by
    intro x hx
    have h1 : ‖⟪T x, x⟫‖ ≤ ‖T x‖ * ‖x‖ := norm_inner_le_norm (T x) x
    have h2 : ‖T x‖ ≤ ‖T‖ * ‖x‖ := T.le_opNorm x
    calc |RCLike.re ⟪T x, x⟫| ≤ ‖⟪T x, x⟫‖ := RCLike.abs_re_le_norm _
      _ ≤ ‖T x‖ * ‖x‖ := h1
      _ ≤ ‖T‖ * ‖x‖ * ‖x‖ := by nlinarith [norm_nonneg x, norm_nonneg (T x)]
      _ = ‖T‖ := by rw [hx]; ring
  -- Local spectral-norm characterization for self-adjoint operators:
  -- the operator norm equals the supremum of the absolute Rayleigh quotient
  -- over the unit sphere.
  -- This is a standard result in functional analysis (see e.g., Reed-Simon Vol 1).
  have hOpNorm_eq_sSup_absRayleigh
      (hSA : IsSelfAdjoint T) :
      sSup (Set.range (fun x : {x : E // ‖x‖ = 1} ↦ |RCLike.re ⟪T x, x⟫|)) = ‖T‖ := by
    classical
    let S := Set.range (fun x : {x : E // ‖x‖ = 1} ↦ |RCLike.re ⟪T x, x⟫|)
    -- Upper bound: |⟪Tx, x⟫| ≤ ‖T‖ for unit vectors
    have hS_upper : ∀ r ∈ S, r ≤ ‖T‖ := by
      rintro r ⟨x, rfl⟩
      calc |RCLike.re ⟪T x, x⟫| ≤ ‖⟪T x, x⟫‖ := RCLike.abs_re_le_norm _
        _ ≤ ‖T (x : E)‖ * ‖(x : E)‖ := norm_inner_le_norm _ _
        _ ≤ ‖T‖ * ‖(x : E)‖ * ‖(x : E)‖ := by gcongr; exact T.le_opNorm _
        _ = ‖T‖ := by rw [x.property]; ring
    have hS_bdd : BddAbove S := ⟨‖T‖, hS_upper⟩
    have hS_nonneg : ∀ r ∈ S, 0 ≤ r := by
      rintro r ⟨x, rfl⟩
      exact abs_nonneg _
    -- Nonemptiness: we need a unit vector to show S is nonempty
    have hS_nonempty : S.Nonempty := by
      obtain ⟨v, hv⟩ := hSphere_nonempty
      have hv_norm : ‖v‖ = 1 := mem_sphere_zero_iff_norm.mp hv
      let u : {x : E // ‖x‖ = 1} := ⟨v, hv_norm⟩
      refine ⟨|RCLike.re ⟪T u, u⟫|, u, rfl⟩
    apply le_antisymm
    · exact csSup_le hS_nonempty hS_upper
    -- Lower bound: ‖T‖ ≤ sSup S using the Polarization Identity
    -- For self-adjoint T: 4 Re⟪Tx, y⟫ = Re⟪T(x+y), x+y⟫ - Re⟪T(x-y), x-y⟫
    -- Combined with the parallelogram law, this gives |Re⟪Tu, v⟫| ≤ M for unit u, v
    · let M := sSup S
      -- Step 1: M is an upper bound for |Re⟪Tu, u⟫| when ‖u‖ = 1
      have hM_bound : ∀ z : E, ‖z‖ = 1 → |RCLike.re ⟪T z, z⟫| ≤ M := fun z hz ↦
        le_csSup hS_bdd ⟨⟨z, hz⟩, rfl⟩
      -- Step 2: Extend to |Re⟪Tv, v⟫| ≤ M * ‖v‖² for all v
      have h_bound_all : ∀ v : E, |RCLike.re ⟪T v, v⟫| ≤ M * ‖v‖^2 := by
        intro v
        by_cases hv : v = 0
        · simp [hv]
        have hv_pos : 0 < ‖v‖ := norm_pos_iff.mpr hv
        let u := (↑‖v‖⁻¹ : 𝕜) • v
        have hu : ‖u‖ = 1 := by
          simp only [u, norm_smul, RCLike.norm_ofReal, abs_of_pos (inv_pos.mpr hv_pos)]
          exact inv_mul_cancel₀ (ne_of_gt hv_pos)
        have h_le := hM_bound u hu
        -- v = ‖v‖ • u, so ⟪Tv, v⟫ = ‖v‖² * ⟪Tu, u⟫
        have hv_eq : v = (↑‖v‖ : 𝕜) • u := by
          simp only [u, smul_smul, RCLike.ofReal_inv]
          rw [mul_inv_cancel₀ (RCLike.ofReal_ne_zero.mpr (ne_of_gt hv_pos)), one_smul]
        have h_inner_scale : ⟪T v, v⟫ = (↑(‖v‖^2) : 𝕜) * ⟪T u, u⟫ := by
          conv_lhs => rw [hv_eq]
          simp only [map_smul, inner_smul_left, inner_smul_right, RCLike.conj_ofReal]
          rw [sq, RCLike.ofReal_mul]; ring
        rw [h_inner_scale]
        simp only [RCLike.mul_re, RCLike.ofReal_re, RCLike.ofReal_im, zero_mul, sub_zero]
        rw [abs_mul, abs_of_pos (sq_pos_of_pos hv_pos)]
        calc ‖v‖^2 * |RCLike.re ⟪T u, u⟫| ≤ ‖v‖^2 * M :=
            mul_le_mul_of_nonneg_left h_le (sq_nonneg _)
          _ = M * ‖v‖^2 := mul_comm _ _
      -- Step 3: Self-adjoint symmetry: Re⟪Tx, y⟫ = Re⟪Ty, x⟫
      have hSA_symm : ∀ a b : E, RCLike.re ⟪T a, b⟫ = RCLike.re ⟪T b, a⟫ := by
        intro a b
        -- For self-adjoint T: ⟪Ta, b⟫ = ⟪a, Tb⟫ = conj ⟪Tb, a⟫
        -- Taking Re of both sides: Re⟪Ta, b⟫ = Re(conj ⟪Tb, a⟫) = Re⟪Tb, a⟫
        have h1 : ⟪T a, b⟫ = ⟪a, T b⟫ := by
          -- adjoint_inner_right: ⟪b, (adjoint T) a⟫ = ⟪T b, a⟫
          have h := ContinuousLinearMap.adjoint_inner_right T b a
          rw [hSA.adjoint_eq] at h
          -- h : ⟪b, T a⟫ = ⟪T b, a⟫
          -- We need: ⟪T a, b⟫ = ⟪a, T b⟫
          -- inner_conj_symm x y : conj ⟪y, x⟫ = ⟪x, y⟫
          -- So inner_conj_symm (T a) b : conj ⟪b, T a⟫ = ⟪T a, b⟫
          -- Thus (inner_conj_symm (T a) b).symm : ⟪T a, b⟫ = conj ⟪b, T a⟫
          have h2 : ⟪T a, b⟫ = starRingEnd 𝕜 ⟪b, T a⟫ := (inner_conj_symm (T a) b).symm
          have h3 : ⟪a, T b⟫ = starRingEnd 𝕜 ⟪T b, a⟫ := (inner_conj_symm a (T b)).symm
          rw [h2, h, h3]
        calc RCLike.re ⟪T a, b⟫ = RCLike.re ⟪a, T b⟫ := by rw [h1]
          _ = RCLike.re (starRingEnd 𝕜 ⟪T b, a⟫) := by rw [inner_conj_symm]
          _ = RCLike.re ⟪T b, a⟫ := RCLike.conj_re _
      -- Step 4: Polarization identity for self-adjoint T
      have h_polarization : ∀ x y : E,
          4 * RCLike.re ⟪T x, y⟫ = RCLike.re ⟪T (x+y), x+y⟫ - RCLike.re ⟪T (x-y), x-y⟫ := by
        intro x y
        simp only [map_add, map_sub, inner_add_left, inner_add_right,
          inner_sub_left, inner_sub_right, map_add, map_sub]
        have h_sym := hSA_symm x y
        linarith
Levi G (Dec 12 2025 at 22:56):
      -- Step 5: For unit vectors u, v: |Re⟪Tu, v⟫| ≤ M
      have h_unit_bound : ∀ u v : E, ‖u‖ = 1 → ‖v‖ = 1 → |RCLike.re ⟪T u, v⟫| ≤ M := by
        intro u v hu hv
        have h_pol := h_polarization u v
        -- Parallelogram law: ‖u+v‖² + ‖u-v‖² = 2(‖u‖² + ‖v‖²)
        have h_par : ‖u + v‖ * ‖u + v‖ + ‖u - v‖ * ‖u - v‖ = 2 * (‖u‖ * ‖u‖ + ‖v‖ * ‖v‖) :=
          parallelogram_law_with_norm 𝕜 u v
        rw [hu, hv] at h_par
        simp only [one_mul] at h_par
        -- Convert to ^2 form
        have h_par' : ‖u + v‖^2 + ‖u - v‖^2 = 4 := by
          simp only [sq]; linarith
        -- |4 * Re⟪Tu, v⟫| ≤ |Re⟪T(u+v), u+v⟫| + |Re⟪T(u-v), u-v⟫|
        have h_abs_le : |4 * RCLike.re ⟪T u, v⟫| ≤
            |RCLike.re ⟪T (u+v), u+v⟫| + |RCLike.re ⟪T (u-v), u-v⟫| := by
          rw [h_pol]
          exact abs_sub _ _
        calc |RCLike.re ⟪T u, v⟫|
            = |4 * RCLike.re ⟪T u, v⟫| / 4 := by
              rw [abs_mul, abs_of_pos (by norm_num : (0:ℝ) < 4)]; ring
          _ ≤ (|RCLike.re ⟪T (u+v), u+v⟫| + |RCLike.re ⟪T (u-v), u-v⟫|) / 4 := by
              apply div_le_div_of_nonneg_right h_abs_le (by norm_num : (0:ℝ) ≤ 4)
          _ ≤ (M * ‖u+v‖^2 + M * ‖u-v‖^2) / 4 := by
              apply div_le_div_of_nonneg_right _ (by norm_num : (0:ℝ) ≤ 4)
              exact add_le_add (h_bound_all _) (h_bound_all _)
          _ = M * (‖u+v‖^2 + ‖u-v‖^2) / 4 := by ring
          _ = M * 4 / 4 := by rw [h_par']
          _ = M := by ring
      -- Step 6: Use opNorm_le_bound with the polarization bound
      apply ContinuousLinearMap.opNorm_le_bound _ (Real.sSup_nonneg hS_nonneg)
      intro x
      by_cases hx : x = 0
      · simp [hx]
      have hx_pos : 0 < ‖x‖ := norm_pos_iff.mpr hx
      -- Normalize x to unit vector
      let u := (↑‖x‖⁻¹ : 𝕜) • x
      have hu : ‖u‖ = 1 := by
        simp only [u, norm_smul, RCLike.norm_ofReal, abs_of_pos (inv_pos.mpr hx_pos)]
        exact inv_mul_cancel₀ (ne_of_gt hx_pos)
      by_cases hTu : T u = 0
      · -- If Tu = 0, then Tx = ‖x‖ • Tu = 0, so ‖Tx‖ = 0 ≤ M * ‖x‖
        have hTx : T x = (↑‖x‖ : 𝕜) • T u := by
          simp only [u, map_smul, smul_smul, RCLike.ofReal_inv]
          rw [mul_inv_cancel₀ (RCLike.ofReal_ne_zero.mpr (ne_of_gt hx_pos)), one_smul]
        simp only [hTx, hTu, smul_zero, norm_zero]
        exact mul_nonneg (Real.sSup_nonneg hS_nonneg) (norm_nonneg _)
      -- Use v = Tu / ‖Tu‖ as the test vector
      have hTu_pos : 0 < ‖T u‖ := norm_pos_iff.mpr hTu
      let v := (↑‖T u‖⁻¹ : 𝕜) • T u
      have hv : ‖v‖ = 1 := by
        simp only [v, norm_smul, RCLike.norm_ofReal, abs_of_pos (inv_pos.mpr hTu_pos)]
        exact inv_mul_cancel₀ (ne_of_gt hTu_pos)
      -- ⟪Tu, v⟫ = ⟪Tu, Tu/‖Tu‖⟫ = ‖Tu‖
      have h_inner : RCLike.re ⟪T u, v⟫ = ‖T u‖ := by
        simp only [v, inner_smul_right]
        rw [inner_self_eq_norm_sq_to_K]
        simp only [RCLike.mul_re, RCLike.ofReal_re, RCLike.ofReal_im, zero_mul, sub_zero]
        have h_re : RCLike.re ((↑‖T u‖ : 𝕜) ^ 2) = ‖T u‖^2 := by
          rw [← RCLike.ofReal_pow]; exact RCLike.ofReal_re _
        rw [h_re]
        field_simp [ne_of_gt hTu_pos]
      have h_Tu_le : ‖T u‖ ≤ M := by
        have := h_unit_bound u v hu hv
        rw [h_inner, abs_of_pos hTu_pos] at this
        exact this
      -- Tx = ‖x‖ • Tu, so ‖Tx‖ = ‖x‖ * ‖Tu‖ ≤ ‖x‖ * M = M * ‖x‖
      have hTx : T x = (↑‖x‖ : 𝕜) • T u := by
        simp only [u, map_smul, smul_smul, RCLike.ofReal_inv]
        rw [mul_inv_cancel₀ (RCLike.ofReal_ne_zero.mpr (ne_of_gt hx_pos)), one_smul]
      calc ‖T x‖ = ‖(↑‖x‖ : 𝕜) • T u‖ := by rw [hTx]
        _ = ‖(↑‖x‖ : 𝕜)‖ * ‖T u‖ := norm_smul _ _
        _ = ‖x‖ * ‖T u‖ := by simp only [RCLike.norm_ofReal, abs_of_pos hx_pos]
        _ ≤ ‖x‖ * M := by nlinarith [h_Tu_le, hx_pos]
        _ = M * ‖x‖ := mul_comm _ _
Levi G (Dec 12 2025 at 23:00):
  -- Above we proved `hOpNorm_eq_sSup_absRayleigh`: the operator norm of a
  -- self-adjoint operator equals the supremum of the absolute Rayleigh quotient
  -- over the unit sphere. The proof uses the Polarization Identity:
  --   4 Re⟪Tx, y⟫ = Re⟪T(x+y), x+y⟫ - Re⟪T(x-y), x-y⟫
  -- Combined with the Parallelogram Law, this yields |Re⟪Tu, v⟫| ≤ M for
  -- unit vectors u, v (where M = sSup S), from which ‖T‖ ≤ M follows.
  --
  -- Now we use this to find eigenvectors. For compact T, we use the sequential
  -- characterization: there exists a sequence xₙ with ‖xₙ‖ = 1 and
  -- |⟪Txₙ, xₙ⟫| → ‖T‖. By compactness, Txₙ has a convergent subsequence, and
  -- the self-adjoint property gives convergence of xₙ itself.
  --
  -- These technical steps (maximizing sequence, subsequence extraction, and
  -- passage to the limit) are carried out explicitly below in
  -- `exists_eigenvector_from_maximizing_seq`, rather than being delegated
  -- directly to the Rayleigh lemmas `hasEigenvector_of_isMaxOn` and
  -- `hasEigenvector_of_isMinOn`.
  -- Helper lemma: from a maximizing sequence for the Rayleigh quotient,
  -- extract an eigenvector whose eigenvalue has absolute value equal to the
  -- limit of the Rayleigh values.
  have exists_eigenvector_from_maximizing_seq
      (x : ℕ → E) (hx_norm : ∀ n, ‖x n‖ = 1)
      (L : ℝ) (hL_nonneg : 0 ≤ L) (hL_eq_norm : L = ‖T‖)
      (hx_lim : Tendsto (fun n ↦ |RCLike.re ⟪T (x n), x n⟫|) atTop (𝓝 L)) :
      ∃ (μ : ℝ) (e : E), ‖e‖ = 1 ∧ T e = (μ : 𝕜) • e ∧ |μ| = L := by
    -- ─────────────────────────────────────────────────────────────────────────
    -- PROOF STRUCTURE:
    -- Step 1: Extract subsequence with constant sign using tendsto_abs_extract_subseq
    -- Step 2: Extract convergent subsequence of T x using compactness
    -- Step 3: Show x itself converges using tendsto_norm_sub_smul_zero
    -- Step 4: Pass to limit to get eigenvector equation
    -- ─────────────────────────────────────────────────────────────────────────
    -- Step 1: Extract subsequence where Re ⟪T x_n, x_n⟫ → σ * L (with |σ| = 1)
    obtain ⟨φ₁, σ, hφ₁_mono, hσ, hφ₁_lim⟩ :=
      tendsto_abs_extract_subseq hL_nonneg hx_lim
    -- The subsequence x ∘ φ₁ still has unit norm
    have hx_φ₁_norm : ∀ n, ‖x (φ₁ n)‖ = 1 := fun n ↦ hx_norm (φ₁ n)
    -- The eigenvalue candidate is μ = σ * L
    let μ := σ * L
    -- |μ| = |σ| * L = 1 * L = L (since |σ| = 1 and L ≥ 0)
    have hμ_abs : |μ| = L := by
      calc |μ| = |σ * L| := rfl
        _ = |σ| * |L| := abs_mul σ L
        _ = 1 * |L| := by rw [hσ]
        _ = |L| := one_mul _
        _ = L := abs_of_nonneg hL_nonneg
    -- Step 2: Use compactness to extract convergent subsequence of T (x (φ₁ n))
    -- The sequence T (x (φ₁ n)) lies in closure (T '' sphere 0 1) which is compact
    have hT_in_compact : ∀ n, T (x (φ₁ n)) ∈ closure (T '' Metric.sphere (0 : E) 1) := by
      intro n
      apply subset_closure
      exact ⟨x (φ₁ n), Metric.mem_sphere.mpr (by simp [hx_φ₁_norm n]), rfl⟩
    obtain ⟨y, hy_mem, φ₂, hφ₂_mono, hTx_lim⟩ := hK.tendsto_subseq hT_in_compact
    -- Combined subsequence
    let φ := φ₁ ∘ φ₂
    have hφ_mono : StrictMono φ := hφ₁_mono.comp hφ₂_mono
    -- Properties of the combined subsequence
    have hx_φ_norm : ∀ n, ‖x (φ n)‖ = 1 := fun n ↦ hx_norm (φ n)
    have hTx_φ_lim : Tendsto (fun n ↦ T (x (φ n))) atTop (𝓝 y) := hTx_lim
    have hRayleigh_φ_lim : Tendsto (fun n ↦ RCLike.re ⟪T (x (φ n)), x (φ n)⟫) atTop (𝓝 μ) :=
      hφ₁_lim.comp hφ₂_mono.tendsto_atTop
    -- Step 3: Show ‖T x_φ(n) - μ x_φ(n)‖ → 0 using tendsto_norm_sub_smul_zero
    have hμ_eq_norm : |μ| = ‖T‖ := by rw [hμ_abs, hL_eq_norm]
    have h_almost_eig : Tendsto (fun n ↦ ‖T (x (φ n)) - (μ : 𝕜) • x (φ n)‖) atTop (𝓝 0) :=
      IsSelfAdjoint.tendsto_norm_sub_smul_zero (T := T) hSA hx_φ_norm hμ_eq_norm hRayleigh_φ_lim
    -- Step 4: Since T x_φ(n) → y and ‖T x_φ(n) - μ x_φ(n)‖ → 0, we have μ x_φ(n) → y
    -- So x_φ(n) → y/μ (if μ ≠ 0)
    have h_smul_lim : Tendsto (fun n ↦ (μ : 𝕜) • x (φ n)) atTop (𝓝 y) := by
      -- T x_φ(n) - μ x_φ(n) → 0, so μ x_φ(n) → y, just like T x_φ(n) → y
      have h1 : Tendsto (fun n ↦ T (x (φ n)) - (μ : 𝕜) • x (φ n)) atTop (𝓝 0) :=
        tendsto_zero_iff_norm_tendsto_zero.mpr h_almost_eig
      have h2 := hTx_φ_lim.sub h1
      simp only [sub_zero] at h2
      -- h2 : Tendsto (fun n ↦ T (x (φ n)) - (T (x (φ n)) - μ • x (φ n))) atTop (𝓝 y)
      -- which simplifies to Tendsto (fun n ↦ μ • x (φ n)) atTop (𝓝 y)
      convert h2 using 1
      ext n
      simp only [sub_sub_cancel]
    -- Case split: L = 0 or L ≠ 0
    by_cases hL_zero : L = 0
    · -- If L = 0, then μ = σ * 0 = 0 and T = 0 (since ‖T‖ = L = 0)
      -- But T ≠ 0, contradiction
      exfalso
      have hT_zero : ‖T‖ = 0 := hL_eq_norm ▸ hL_zero
      rw [opNorm_zero_iff] at hT_zero
      exact hT_ne hT_zero
    · -- If L ≠ 0, then μ ≠ 0 and we can divide
      have hμ_ne : (μ : 𝕜) ≠ 0 := by
        simp only [ne_eq, RCLike.ofReal_eq_zero]
        intro hμ_zero
        have : |μ| = 0 := abs_eq_zero.mpr hμ_zero
        rw [hμ_abs] at this
        exact hL_zero this
      -- x_φ(n) = (1/μ) • (μ • x_φ(n)) → (1/μ) • y
      have hx_lim : Tendsto (fun n ↦ x (φ n)) atTop (𝓝 ((μ : 𝕜)⁻¹ • y)) := by
        have h := h_smul_lim.const_smul (μ : 𝕜)⁻¹
        simp only [smul_smul, inv_mul_cancel₀ hμ_ne, one_smul] at h
        exact h
      -- Let e = (1/μ) • y
      let e := (μ : 𝕜)⁻¹ • y
      -- e has unit norm by limit_of_unit_seq_is_unit
      have he_norm : ‖e‖ = 1 := limit_of_unit_seq_is_unit hx_φ_norm hx_lim
      -- T e = μ e: pass to the limit in T x_φ(n) = μ x_φ(n) + error
      have hTe : T e = (μ : 𝕜) • e := by
        -- T is continuous, so T x_φ(n) → T e
        have h_T_cont := T.continuous.tendsto e
        have hT_of_lim : Tendsto (fun n ↦ T (x (φ n))) atTop (𝓝 (T e)) :=
          h_T_cont.comp hx_lim
        -- But we also have T x_φ(n) → y, so T e = y
        have hTe_eq_y : T e = y := tendsto_nhds_unique hT_of_lim hTx_φ_lim
        -- And μ e = μ • (1/μ • y) = y
        have hμe_eq_y : (μ : 𝕜) • e = y := by
          calc (μ : 𝕜) • e = (μ : 𝕜) • ((μ : 𝕜)⁻¹ • y) := rfl
            _ = ((μ : 𝕜) * (μ : 𝕜)⁻¹) • y := by rw [smul_smul]
            _ = (1 : 𝕜) • y := by rw [mul_inv_cancel₀ hμ_ne]
            _ = y := one_smul _ _
        rw [hTe_eq_y, hμe_eq_y]
      exact ⟨μ, e, he_norm, hTe, hμ_abs⟩
  -- Lemma C: existence of an eigenvector with eigenvalue of maximal absolute value.
  -- We need to construct a maximizing sequence where |Re ⟪T x_n, x_n⟫| → ‖T‖
  have h_limit_exists : ∃ (μ : ℝ) (e : E), ‖e‖ = 1 ∧ T e = (μ : 𝕜) • e ∧ |μ| = ‖T‖ := by
    -- For self-adjoint operators, ‖T‖ = sup_{‖x‖=1} |Re ⟪T x, x⟫|
    -- We can construct a sequence approaching this supremum
    -- The set of Rayleigh quotient values is bounded above by ‖T‖
    let S : Set ℝ := Set.range (fun x : {x : E // ‖x‖ = 1} ↦ |RCLike.re ⟪T x, x⟫|)
    have hS_nonempty : S.Nonempty := by
      obtain ⟨v, hv⟩ := hSphere_nonempty
      have hv_norm : ‖v‖ = 1 := by
        rw [Metric.mem_sphere, dist_zero_right] at hv
        exact hv
      exact ⟨|RCLike.re ⟪T v, v⟫|, ⟨⟨v, hv_norm⟩, rfl⟩⟩
    have hS_bdd : BddAbove S := by
      use ‖T‖
      intro r ⟨x, hx⟩
      rw [← hx]
      exact hRayleigh_bound x x.property
    -- There exists a sequence approaching sSup S
    obtain ⟨u, hu_mono, hu_tendsto, hu_mem⟩ := exists_seq_tendsto_sSup hS_nonempty hS_bdd
    -- Extract the actual unit vectors from the sequence
    have h_unit_seq : ∃ (x : ℕ → E), (∀ n, ‖x n‖ = 1) ∧
        Tendsto (fun n ↦ |RCLike.re ⟪T (x n), x n⟫|) atTop (𝓝 (sSup S)) := by
      choose f hf using hu_mem
      use fun n ↦ (f n : E)
      constructor
      · intro n; exact (f n).property
      · convert hu_tendsto using 1
        ext n
        exact hf n
    obtain ⟨x, hx_norm, hx_lim⟩ := h_unit_seq
    -- The key fact: sSup S = ‖T‖ for self-adjoint operators
    -- This is the "Rayleigh quotient characterizes the norm" result
    have hsSup_eq_norm : sSup S = ‖T‖ := by
      -- This is exactly the content of the local spectral-norm lemma above.
      -- The definition of S matches the set used there.
      have h := hOpNorm_eq_sSup_absRayleigh hSA
      simpa [S] using h
    -- Now apply the helper lemma with L = ‖T‖
    have hL_nonneg : 0 ≤ ‖T‖ := norm_nonneg T
    rw [hsSup_eq_norm] at hx_lim
    exact exists_eigenvector_from_maximizing_seq x hx_norm ‖T‖ hL_nonneg rfl hx_lim
  exact h_limit_exists
Aaron Liu (Dec 13 2025 at 02:18):
This feels like way too long for a mathlib proof
Levi G (Dec 13 2025 at 07:02):
Aaron Liu said:
This feels like way too long for a mathlib proof
Thank you for taking the time to look at my proof of the Rayleigh quotient theorem for self-adjoint compact operators, and for your concern. I understand that it is too long for a mathlib proof; however, it plays a critical role in my spectral theorem proof. It does the heavy lifting there: I use it as a lemma in a proof by induction, first to get the base case, and then in the induction step, combined with the axiom of dependent choice, to obtain a sequence of eigenvectors and eigenvalues.
Yan Yablonovskiy πΊπ¦ (Dec 13 2025 at 08:43):
Levi G said:
Aaron Liu said:
This feels like way too long for a mathlib proof
Thank you for taking the time to look at my proof of the Rayleigh quotient theorem for self-adjoint compact operators, and for your concern. I understand that it is too long for a mathlib proof; however, it plays a critical role in my spectral theorem proof. It does the heavy lifting there: I use it as a lemma in a proof by induction, first to get the base case, and then in the induction step, combined with the axiom of dependent choice, to obtain a sequence of eigenvectors and eigenvalues.
I think what everyone might like is for you to at least try to 'golf' the proof: split it up and shorten it massively before sharing (which requires some manual effort and Lean comprehension). You can look at mathlib conventions and at existing code adjacent to the theory.
Reviewing ungolfed and unidiomatic code is a high cognitive load. If you really want someone to help you, and since reviewers are volunteering their time, you have to put in a bit more work to make it easier for them.
Levi G (Dec 13 2025 at 09:25):
@Yan Yablonovskiy πΊπ¦ Noted. Thank you for the feedback.
Kevin Buzzard (Dec 13 2025 at 13:11):
Alternatively, given that someone else is also working on this and they are writing idiomatic code because they know something about Lean, it might be best just to wait for their work (as has been pointed out already) rather than continuing to generate unidiomatic AI code (which kind of feels like a waste of time to me, given the situation, as it will never be reviewed or merged).
Niels Voss (Dec 14 2025 at 01:36):
Sorry that my collaborators and I have not had much time to work on this recently; we are currently studying for finals and trying to complete another project by January. We are, however, willing to discuss design decisions in the meantime. The current status of the PR is that #32126 defines the singular values; however, I will not personally consider this PR complete until we have proven that, for finite-dimensional inner product spaces, the singular values as defined by the approximation numbers are equal to the singular values as defined by the square roots of the eigenvalues of T*T. We need to prove this before completing the PR, because otherwise we risk the definition being incorrect.
We haven't had time to look into it much, but we currently believe the proper progression to this proof is: the Von Neumann minimax theorem (#31560) implies the min-max theorem, which implies the min-max principle for singular values, which in turn implies the theorem we want.
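For concreteness, the bridging statement might be phrased roughly along the following lines. This is only a sketch: approxSingularValue and eigenSingularValue are hypothetical placeholder names (the real definition lives in #32126 and is not reproduced here), the choice of continuous linear maps rather than plain linear maps is not meant to be prescriptive, and the sorrys mark everything that still has to be supplied.
import Mathlib
variable {𝕜 E F : Type*} [RCLike 𝕜]
  [NormedAddCommGroup E] [InnerProductSpace 𝕜 E] [FiniteDimensional 𝕜 E]
  [NormedAddCommGroup F] [InnerProductSpace 𝕜 F] [FiniteDimensional 𝕜 F]
/-- Hypothetical placeholder for the approximation-number definition of the
`k`-th singular value (standing in for the actual definition from #32126). -/
noncomputable def approxSingularValue (T : E →L[𝕜] F) (k : ℕ) : ℝ := sorry
/-- Hypothetical placeholder for the definition of the `k`-th singular value as the
square root of the `k`-th eigenvalue of the composition of the adjoint of `T` with `T`. -/
noncomputable def eigenSingularValue (T : E →L[𝕜] F) (k : ℕ) : ℝ := sorry
/-- The bridging lemma the PR is waiting on: in finite dimension the two
characterizations of the singular values agree. Statement only; the intended proof
route is Von Neumann minimax → min-max theorem → min-max principle for singular
values → this equality. -/
theorem approxSingularValue_eq_eigenSingularValue (T : E →L[𝕜] F) (k : ℕ) :
    approxSingularValue T k = eigenSingularValue T k := sorry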
Also, I think Kim Morrison's comment above makes sense:
I love the plan, but I hope that if there is not an implementation plan (i.e. a person saying they are doing it in a time frame) that this would not get in the way of someone contributing an implementation for matrices, which there is demand for and has been for a while. I think it should suffice in an implementation for matrices to include a module doc link to this thread, giving a pointer to the proposed upgrade path.
So I don't want to discourage people from working on the matrix SVD. However, I have very specific ideas for how the matrix SVD should be set up to maximize generality (see for instance my comment about merging the Full, Thin, Compact, and Truncated SVDs into a single definition). In particular, SVD is unusually difficult even to state idiomatically in Lean, and previous SVD formalizations have been rejected simply because of unidiomatic definitions (e.g. #6042). Translating a proof that uses unidiomatic definitions into one that uses idiomatic definitions can actually be harder than starting from idiomatic definitions in the first place, so a proof built on unidiomatic definitions might not count as any progress at all towards getting something into mathlib.
So I would request that anyone interested in working on the matrix SVD and getting it into mathlib discuss the theorem statement they intend to formalize, along with all new definitions, before attempting to work on the proof.
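To make the "single definition" idea slightly more concrete, here is one purely illustrative shape it could take. HasSVD is a made-up name, this is not a proposed mathlib statement, and the point is only the parametrization: the factorization is stated over an arbitrary finite inner index type r, so that different choices of r recover the compact, thin, full, or truncated variants.
import Mathlib
open Matrix
variable {𝕜 : Type*} [RCLike 𝕜] {m n r : Type*}
  [Fintype m] [Fintype n] [Fintype r] [DecidableEq r]
/-- Sketch only: `A` factors as `U * Matrix.diagonal σ * Vᴴ` where `U` and `V` have
orthonormal columns indexed by an arbitrary finite type `r`. Taking `r` to be
`Fin A.rank` would correspond to the compact SVD, while other choices of `r`
recover the thin, full, or truncated variants. The name `HasSVD` is hypothetical. -/
structure HasSVD (A : Matrix m n 𝕜) (U : Matrix m r 𝕜) (σ : r → ℝ)
    (V : Matrix n r 𝕜) : Prop where
  /-- The columns of `U` are orthonormal. -/
  left_orthonormal : Uᴴ * U = 1
  /-- The columns of `V` are orthonormal. -/
  right_orthonormal : Vᴴ * V = 1
  /-- The singular values are nonnegative. -/
  nonneg : ∀ i, 0 ≤ σ i
  /-- The factorization itself. -/
  factorization : A = U * Matrix.diagonal (fun i ↦ (σ i : 𝕜)) * Vᴴ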
Levi G (Dec 14 2025 at 15:34):
Jireh Loreaux said:
Levi G the fact that it's a big file is part of the problem. At this point, you don't have experience contributing to Mathlib, so a contribution of this magnitude is very hard to review. If you start with smaller projects and learn good design patterns from receiving review, then it gets easier because your code is higher quality. Let me highlight one example as evidence:
theorem IsCompactOperator.eigenvalue_tendsto_zero (hT : IsCompactOperator T) (_hSA : IsSelfAdjoint T) {ΞΌ : β β β} {e : β β E} (he_orth : Orthonormal π e) (he_eigen : β n, T (e n) = (ΞΌ n : π) β’ (e n)) (hΞΌ_antitone : Antitone (fun n β¦ |ΞΌ n|)) : Tendsto (fun n β¦ |ΞΌ n|) atTop (π 0) := by
This theorem has nothing to do with self-adjoint operators, or real eigenvalues, or Antitone in any fundamental way. That is, what's really going on here is that since e is orthonormal, it converges weakly to zero, and compact operators take weakly convergent sequences to norm convergent ones.
Again, I would love for you to become a productive member of the community, but please start with smaller projects. It may even be the case that there are pieces of your code worth salvaging, but I don't have the energy or the time to try to sift through 2000 lines of code looking for a diamond in the rough.
Does this address the issue in that example? It's part of my new code that I'll post later.
import Mathlib.Analysis.InnerProductSpace.Basic
import Mathlib.Analysis.Normed.Operator.Compact
import Mathlib.Topology.Algebra.InfiniteSum.Basic
import Mathlib.Analysis.InnerProductSpace.Adjoint
/-!
# Weak Convergence and Compact Operators
This file proves that eigenvalues of compact operators corresponding to orthonormal
sequences of eigenvectors tend to zero.
## Main results
* `Orthonormal.tendsto_inner_zero`: Orthonormal sequences converge weakly to zero.
* `IsCompactOperator.tendsto_of_tendsto_weakly_zero`: Compact operators map weakly
convergent sequences to norm convergent sequences.
* `IsCompactOperator.eigenvalue_tendsto_zero`: The sequence of eigenvalues of a compact
operator on a Hilbert space tends to zero (for orthonormal eigenvectors).
-/
open Filter Topology
variable {π : Type*} [RCLike π]
variable {E : Type*} [NormedAddCommGroup E] [InnerProductSpace π E]
local notation "βͺ" x ", " y "β«" => @inner π E _ x y
/-- **Orthonormal Sequences Converge Weakly to Zero** (Bessel's Inequality Consequence)
For any orthonormal sequence `(eβ)` and any vector `x`, we have `βͺx, eββ« β 0` as `n β β`.
This follows from Bessel's inequality: `β |βͺx, eββ«|Β² β€ βxβΒ²`, which implies the terms
of the sum must tend to zero.
-/
theorem Orthonormal.tendsto_inner_zero {e : β β E} (he : Orthonormal π e) (x : E) :
Tendsto (fun n β¦ βͺx, e nβ«) atTop (π 0) := by
have h_summable : Summable (fun n β¦ ββͺe n, xβ«β^2) := he.inner_products_summable x
have h_sq_tendsto := h_summable.tendsto_atTop_zero
have h_norm_tendsto : Tendsto (fun n β¦ ββͺe n, xβ«β) atTop (π 0) := by
have h := h_sq_tendsto.sqrt
simp only [Real.sqrt_sq (norm_nonneg _), Real.sqrt_zero] at h
exact h
have h_norm_swap : (fun n β¦ ββͺx, e nβ«β) = (fun n β¦ ββͺe n, xβ«β) := by
funext n
simp [norm_inner_symm]
have h_norm_tendsto' : Tendsto (fun n β¦ ββͺx, e nβ«β) atTop (π 0) :=
h_norm_swap βΈ h_norm_tendsto
exact tendsto_zero_iff_norm_tendsto_zero.mpr h_norm_tendsto'
variable [CompleteSpace E]
/-- **Compact Operators are Weak-to-Norm Continuous on Bounded Sets**
If `T` is a compact operator and `(xβ)` is a bounded sequence converging weakly to `0`,
then `T(xβ) β 0` in norm.
-/
theorem IsCompactOperator.tendsto_of_tendsto_weakly_zero
{T : E βL[π] E} (hT : IsCompactOperator T) {x : β β E}
(hx_bdd : β M : β, β n, βx nβ β€ M)
(hx_weak : β y : E, Tendsto (fun n β¦ βͺy, x nβ«) atTop (π 0)) :
Tendsto (fun n β¦ T (x n)) atTop (π 0) := by
obtain β¨M, hMβ© := hx_bdd
rcases hT.image_closedBall_subset_compact (f := (T : E βββ[RingHom.id π] E)) (M + 1) with
β¨K, hK_compact, hTKβ©
have h_in_K : β n, T (x n) β K := fun n β¦ by
apply hTK
refine β¨x n, ?_, rflβ©
simp only [Metric.mem_closedBall, dist_zero_right]
calc βx nβ β€ M := hM n
_ β€ M + 1 := by linarith
rw [Metric.tendsto_atTop]
intro Ξ΅ hΞ΅
classical
by_contra h_not_eventually
have h_not_ev_lt : Β¬ βαΆ n in atTop, βT (x n)β < Ξ΅ := by
intro h_ev
rcases (eventually_atTop.1 h_ev) with β¨N, hNβ©
exact h_not_eventually β¨N, by intro n hn; simpa [dist_zero_right] using hN n hnβ©
have h_freq : βαΆ n in atTop, Ξ΅ β€ βT (x n)β :=
(not_eventually.1 h_not_ev_lt).mono (by intro n hn; exact le_of_not_gt hn)
obtain β¨Ο, hΟ_mono, hΟ_propβ© := extraction_of_frequently_atTop h_freq
have h_in_K' : β n, T (x (Ο n)) β K := fun n β¦ h_in_K (Ο n)
obtain β¨y, _, Ο, hΟ_mono, hTx_limβ© := hK_compact.tendsto_subseq h_in_K'
have hy_zero : y = 0 := by
have h_weak_y : β z : E, βͺz, yβ« = 0 := fun z β¦ by
have h1 : Tendsto (fun n β¦ βͺz, T (x (Ο (Ο n)))β«) atTop (π βͺz, yβ«) :=
(Filter.Tendsto.inner tendsto_const_nhds hTx_lim)
have h2 : Tendsto (fun n β¦ βͺz, T (x (Ο (Ο n)))β«) atTop (π 0) := by
have h_adj : Tendsto (fun n β¦ βͺ(ContinuousLinearMap.adjoint T) z, x (Ο (Ο n))β«) atTop (π 0) :=
(hx_weak ((ContinuousLinearMap.adjoint T) z)).comp ((hΟ_mono.comp hΟ_mono).tendsto_atTop)
simpa [ContinuousLinearMap.adjoint_inner_left] using h_adj
exact tendsto_nhds_unique h1 h2
exact inner_self_eq_zero.mp (h_weak_y y)
have h_norm_lim : Tendsto (fun n β¦ βT (x (Ο (Ο n)))β) atTop (π 0) := by
have hTx_zero : Tendsto (fun n β¦ T (x (Ο (Ο n)))) atTop (π 0) := by simpa [hy_zero] using hTx_lim
simpa [Function.comp, norm_zero] using (continuous_norm.tendsto (0 : E)).comp hTx_zero
have h_ge_Ξ΅ : β n, Ξ΅ β€ βT (x (Ο (Ο n)))β := fun n β¦ hΟ_prop (Ο n)
have h_eventually_small : βαΆ n in atTop, βT (x (Ο (Ο n)))β < Ξ΅ := by
rw [Metric.tendsto_atTop] at h_norm_lim
obtain β¨N, hNβ© := h_norm_lim Ξ΅ hΞ΅
filter_upwards [Filter.Ici_mem_atTop N] with n hn
simpa [dist_zero_right] using hN n hn
obtain β¨n, hnβ© := h_eventually_small.exists
exact (lt_of_lt_of_le hn (h_ge_Ξ΅ n)).false
/-- Eigenvalues of a compact operator decay to zero along an orthonormal eigenbasis. -/
theorem IsCompactOperator.eigenvalue_tendsto_zero
{T : E βL[π] E} (hT : IsCompactOperator T)
{ΞΌ : β β π} {e : β β E}
(he_orth : Orthonormal π e)
(he_eigen : β n, T (e n) = ΞΌ n β’ e n) :
Tendsto (fun n β¦ βΞΌ nβ) atTop (π 0) := by
have h_weak : β y : E, Tendsto (fun n β¦ βͺy, e nβ«) atTop (π 0) :=
fun y β¦ he_orth.tendsto_inner_zero y
have h_bdd : β M : β, β n, βe nβ β€ M := β¨1, fun n β¦ le_of_eq (he_orth.norm_eq_one n)β©
have h_Te_zero : Tendsto (fun n β¦ T (e n)) atTop (π 0) :=
hT.tendsto_of_tendsto_weakly_zero h_bdd h_weak
have h_norm_eq : β n, βΞΌ nβ = βT (e n)β := fun n β¦ by
rw [he_eigen n, norm_smul, he_orth.norm_eq_one, mul_one]
simpa [h_norm_eq] using (continuous_norm.tendsto (0 : E)).comp h_Te_zero
Michael Rothgang (Dec 14 2025 at 16:29):
@Levi G In the interest of using my time well, I will not be reviewing your code on this matter, for all the reasons already mentioned up-thread. (Other maintainers may or may not do the same.) Make of that what you will, but I think it's fair for you to know.
Last updated: Dec 20 2025 at 21:32 UTC