Zulip Chat Archive

Stream: Is there code for X?

Topic: Singular Value Decomposition


Niels Voss (Aug 04 2025 at 18:19):

Is there code for or a plan for singular value decomposition (of matrices or linear maps)? I found #6042 by @MohanadAhmed, but it seems that it's not general enough to be merged as-is. I was wondering if anything had replaced it yet, or if there was a plan to finish it. If not, what's the ideal generalization of SVD for mathlib (I assume there are ways to generalize SVD for finite-dimensional inner product spaces to maybe something that is not finite-dimensional)?

Jireh Loreaux (Aug 04 2025 at 23:45):

There is the Schmidt decomposition for compact operators on a Hilbert space, but that's different enough that we should have both, not try to shoehorn SVD to fit that mold.

Niels Voss (Aug 05 2025 at 00:09):

Is the Schmidt decomposition in mathlib, or are you just saying that it would be a candidate for a generalization of SVD?

Niels Voss (Aug 05 2025 at 00:30):

Can the Schmidt decomposition be generalized further to infinite dimensions (I only saw the finite-dimensional case on Wikipedia) or to larger tensor products than just two spaces, or is it as generalized as it will probably get?

Niels Voss (Aug 23 2025 at 19:01):

Would singular value decomposition of linear maps between finite-dimensional inner product spaces be on topic for mathlib? Or would it need to be generalized to the infinite-dimensional case? We only have the spectral theorem for linear maps between finite-dimensional spaces.

Niels Voss (Aug 23 2025 at 19:03):

I was considering something like

variable {π•œ : Type*} [RCLike π•œ]
variable {E : Type*} [NormedAddCommGroup E] [InnerProductSpace π•œ E]
variable {F : Type*} [NormedAddCommGroup F] [InnerProductSpace π•œ F]
variable {n : β„•} (hn : Module.finrank π•œ E = n)
variable (T : E β†’β‚—[π•œ] F)
variable [FiniteDimensional π•œ E] [FiniteDimensional π•œ F]

noncomputable def LinearMap.singularValues (i : Fin n) : ℝ :=
  Real.sqrt ((LinearMap.isSymmetric_adjoint_comp_self T).eigenvalues hn i)

(where LinearMap.isSymmetric_adjoint_comp_self comes from #Is there code for X? > T*T is positive when domain and codomain are different) This definition has the nice property of preserving multiplicities of singular values, as opposed to something like a HasSingularValue predicate, which wouldn't.

YaΓ«l Dillies (Aug 23 2025 at 19:09):

Niels Voss said:

Would singular value decomposition of linear maps between finite-dimensional inner product spaces be on topic for mathlib? Or would it need to be generalized to the infinite-dimensional case?

Would generalising to the infinite-dimensional case involve throwing away (most of) the finite-dimensional proof? If not, then mathlib will be happy to have finite-dimensional SVD.

Niels Voss (Aug 23 2025 at 19:11):

To be honest, I don't personally know how to generalize SVD to the infinite-dimensional case. Do you know what the "best" generalization is?

Niels Voss (Aug 23 2025 at 20:36):

Also, for T : E -> F, when mathematicians say "list of singular values", does this list generally

  • have length rank(T), so that all the singular values are positive
  • have length min(dim(E), dim(F))
  • have length dim(E) (as it does in LADR 4th edition)
  • have countably infinite length, but all but finitely many of the singular values are zero?

I'm trying to decide whether indexing the list of singular values by Fin n or Nat is a better idea (and if the latter, whether or not I should use a Finsupp)

Niels Voss (Aug 23 2025 at 20:56):

There's also this alternative formulation from https://en.wikipedia.org/wiki/Singular_value
image.png

Niels Voss (Aug 24 2025 at 00:00):

I changed it to this definition:

variable {π•œ : Type*} [RCLike π•œ]
variable {E : Type*} [NormedAddCommGroup E] [InnerProductSpace π•œ E]
variable {F : Type*} [NormedAddCommGroup F] [InnerProductSpace π•œ F]
variable (T : E β†’β‚—[π•œ] F)
variable [FiniteDimensional π•œ E] [FiniteDimensional π•œ F]

noncomputable def LinearMap.singularValues : β„• β†’β‚€ ℝ :=
  Finsupp.embDomain Fin.valEmbedding <|
    Finsupp.ofSupportFinite
      (Real.sqrt ∘ (LinearMap.isSymmetric_adjoint_comp_self T).eigenvalues rfl)
      (Set.toFinite _)

with the characterization

theorem LinearMap.apply_singularValues_of_lt {n : β„•} (h : n < Module.finrank π•œ E)
  : T.singularValues n
    = √((LinearMap.isSymmetric_adjoint_comp_self T).eigenvalues rfl ⟨n, h⟩) := sorry

theorem LinearMap.apply_singularValues_of_le {n : β„•} (h : Module.finrank π•œ E ≀ n)
  : T.singularValues n = 0 := sorry

theorem LinearMap.apply_singularValues_fin {n : β„•} (hn : Module.finrank π•œ E = n) (i : Fin n)
  : T.singularValues ↑i = √((LinearMap.isSymmetric_adjoint_comp_self T).eigenvalues hn i) := sorry

SΓ©bastien GouΓ«zel (Aug 24 2025 at 08:04):

I'd rather allow infinite dimension (which means that, in general, singular values don't form a finitely supported sequence). For instance, the singular values of the Sobolev injections are interesting objects, as they quantify "how much compact" the injection is. More generally, approximation numbers (the inf of the norm of T - L for finite rank L) are very much relevant in this setting, as they make sense for Sobolev spaces not based on L^2.
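
In symbols, roughly (just a sketch; indexing conventions differ by a shift between authors):

$$a_n(T) = \inf \{\, \|T - L\| \ :\ L \text{ of rank} \le n \,\}.$$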

Jireh Loreaux (Aug 24 2025 at 14:02):

I think defining approximation numbers would be great. The thing to prove, as far as I'm concerned, with respect to SVD would be that for a compact operator there are two orthonormal sequences such that the compact operator is the norm convergent sum of the rank-1 elementary tensors induced by pairing these vectors and multiplying by the approximation numbers.
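
Schematically, the target statement has the shape (hedged; $(e_n)$ and $(f_n)$ are the orthonormal sequences and $a_n(T)$ the approximation numbers):

$$T x = \sum_{n} a_n(T)\, \langle x, e_n \rangle\, f_n ,$$

with the partial sums (which are finite-rank operators) converging to $T$ in operator norm.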

Niels Voss (Aug 24 2025 at 15:06):

Since I don't know the actual math yet: Are approximation numbers defined for continuous linear maps only or all linear maps? And can they be infinite? Should the codomain be Real, NNReal, or ENNReal?

SΓ©bastien GouΓ«zel (Aug 24 2025 at 16:04):

It's only defined for continuous linear maps (for non-continuous linear maps, all the approximation numbers would be infinite, so they wouldn't be interesting at all). Note that in finite dimension, all linear maps are continuous, so there's no loss in generality.

Niels Voss (Aug 24 2025 at 16:06):

To confirm, for non-continuous linear maps, every approximation number is infinite because subtracting a finite-rank linear map will never result in it becoming continuous, and all non-continuous maps have infinite norm, right?

Niels Voss (Aug 24 2025 at 16:09):

Defining it only for continuous linear maps would make it slightly more annoying to use, because the conversion between linear and continuous linear maps is not automatic even for finite dimensions. This is why we have LinearMap.adjoint in addition to ContinuousLinearMap.adjoint

SΓ©bastien GouΓ«zel (Aug 24 2025 at 16:25):

I would argue that it's more natural to define it only for continuous linear maps, even in finite dimension: it's an analytical notion, depending on the norm, so naturally the realm of continuous linear maps.

Jireh Loreaux (Aug 24 2025 at 16:56):

SΓ©bastien, I think that's only half true. People will certainly want SVD for matrices without caring much about the norm / topology.

Niels Voss (Aug 24 2025 at 16:58):

If the only cost is that people have to do T.toContinuousLinearMap.singularValues instead of T.singularValues it isn't that bad

SΓ©bastien GouΓ«zel (Aug 24 2025 at 17:17):

I think that for matrices we should have a shortcut indeed. Note that matrices are not the same as linear maps: to me, one should have them for continuous linear maps because that's the right conceptual framework, and then for matrices because it's used so often.

Niels Voss (Aug 24 2025 at 17:19):

And nothing for regular linear maps, right?

Jireh Loreaux (Aug 24 2025 at 20:35):

Great, sounds like we're on the same page then.

Kim Morrison (Aug 24 2025 at 22:56):

I love the plan, but I hope that if there is no implementation plan (i.e. a person saying they are doing it in a time frame), this would not get in the way of someone contributing an implementation for matrices, for which there is and has been demand for a while. I think it should suffice for an implementation for matrices to include a module doc link to this thread, giving a pointer to the proposed upgrade path.

Niels Voss (Aug 26 2025 at 17:34):

Should the codomain of ContinuousLinearMap.singularValue be Real or NNReal?

Jireh Loreaux (Aug 26 2025 at 18:14):

I would be tempted to define it as β„• β†’ ℝβ‰₯0 as subtraction plays a lesser role in the context of singular values. Nonnegativity is more important I think.

Niels Voss (Aug 26 2025 at 18:55):

Does this look like a reasonable definition then?

/--
When the vector spaces are not necessarily finite, these are called the approximation numbers.
Note: we use ≀ n instead of < n because we start the singular values at 0, not 1, so if we did
< n the first singular value would be repeated.
-/
noncomputable def ContinuousLinearMap.singularValue {π•œ : Type*} [NontriviallyNormedField π•œ]
  {X : Type*} [SeminormedAddCommGroup X] [NormedSpace π•œ X]
  {Y : Type*} [SeminormedAddCommGroup Y] [NormedSpace π•œ Y]
  (T : X β†’L[π•œ] Y) (n : β„•) : ℝβ‰₯0 :=
  -- NOTE: We can't use `β¨… H ∈ {...}` because of
  -- https://leanprover-community.github.io/extras/pitfalls.html#accidental-double-iinf-or-isup
  β¨… H : {S : X β†’L[π•œ] Y // S.rank ≀ n}, β€–T - Hβ€–β‚Š

Niels Voss (Aug 28 2025 at 05:58):

What way would you recommend to define singular values for matrices? For a matrix M, I can either say that

  • The singular values of M are the singular values of the operator T produced by converting M to a linear map between EuclideanSpace k m and EuclideanSpace k n (probably with docs#Matrix.toEuclideanLin). This basically the approach Matrix.IsHermitian.eigenvaluesβ‚€ uses. Requires M to be finite (or at least needs M to have a finite number of columns) and the field to be RCLike.
  • The singular values of M are defined by inf {||M - N|| : rank(N) < n} where we use the norm defined in Mathlib.Analysis.CStarAlgebra.Matrix. Requires M to be finite and the field to be RCLike.
  • The singular values of M are defined by the square roots of the eigenvalues of T*T, produced by docs#Matrix.IsHermitian.eigenvaluesβ‚€. Requires M to be finite and the field to be RCLike.

All of these approaches only work for finite-dimensional Real and Complex. Between these three approaches, which is the best, and is there a different approach that can be used? If only finite matrices are supported, should the definition use a Finsupp, like above?

Niels Voss (Aug 28 2025 at 06:12):

The first option would probably be something like

open NNReal in
noncomputable def Matrix.singularValue {π•œ : Type*} [RCLike π•œ]
  {n m : Type*} [Fintype n] [Fintype m] [DecidableEq m] (M : Matrix n m π•œ)
  : β„• β†’ ℝβ‰₯0 := M.toEuclideanLin.toContinuousLinearMap.singularValue

or

open NNReal in
noncomputable def Matrix.singularValue {π•œ : Type*} [RCLike π•œ]
  {n m : Type*} [Fintype n] [Fintype m] [DecidableEq m] (M : Matrix n m π•œ)
  : β„• β†’β‚€ ℝβ‰₯0 :=
  Finsupp.ofSupportFinite
    M.toEuclideanLin.toContinuousLinearMap.singularValue
    sorry

(I can probably get rid of the [DecidableEq m] since it is noncomputable anyways)

Niels Voss (Aug 30 2025 at 20:04):

Jireh Loreaux said:

I think defining approximation numbers would be great. The thing to prove, as far as I'm concerned, with respect to SVD would be that for a compact operator there are two orthonormal sequences such that the compact operator is the norm convergent sum of the rank-1 elementary tensors induced by pairing these vectors and multiplying by the approximation numbers.

Sorry if this is a stupid question, but since I don't know the field that well, can you confirm whether this holds true between arbitrary normed spaces, arbitrary Banach spaces, Banach spaces that satisfy the Approximation Property, or just Hilbert spaces? And would a proof for just Hilbert spaces be accepted, or would it need to be as general as possible? My understanding is that the standard proof for Hilbert spaces depends on the spectral theorem for Hilbert spaces, which does not yet exist in mathlib. Also, does this sum converge conditionally, unconditionally but not absolutely, or absolutely?

Jireh Loreaux (Aug 30 2025 at 20:42):

The fact about the existence of orthonormal sequences only holds in the context of Hilbert spaces. @FrΓ©dΓ©ric Dupuis and @Heather Macbeth at one point had the spectral theorem for compact (selfadjoint? probably not normal) operators, but I think it may be Lean 3 code and porting it was never finished. It would be nice to make that available. The sum converges in operator norm.

FrΓ©dΓ©ric Dupuis (Aug 30 2025 at 21:01):

Yes, I haven't forgotten about this! :-) Indeed it's Lean 3 code, and it was written before we had compact operators in mathlib, so it's based on a less general definition that had just enough API to get through this proof. Every once in a while I tell myself I should port it, and then I get distracted by other things.

Niels Voss (Aug 30 2025 at 21:04):

In that case, is SVD most likely blocked behind the spectral theorem? I don't mind postulating the spectral theorem for the SVD so that they can be worked on in parallel.

Niels Voss (Aug 31 2025 at 18:59):

I have the following extremely rough draft of the statement of SVD for matrices (assuming singular values have already been defined for linear maps as above). The names and which parameters are implicit will certainly need to change. But I want to know if I'm at all on the right track or not.

The idea is that if we are clever, we can make the standard SVD, thin SVD, compact SVD, and maybe even truncated SVD all the same definition. Given an m x n matrix M, we decompose M into M = UΞ£V*, where

  • U is m x a
  • Ξ£ is a x b
  • V* is b x n

for some a and b.

  • If a = m and b = n, then this gives the standard SVD.
  • If a = b = min(m,n), then this gives the thin SVD.
  • If a = b = rank(M), then this gives the compact SVD.
  • If a = b < rank(M), then this gives the truncated SVD.

However, I wasn't sure how to handle the fact that m,n,a,b are arbitrary, not-necessarily ordered, finite types. To do this, I postulate injective functions f : a -> Nat and g : b -> Nat, which will normally be set to Fin.val in case a and b are Fin _. An alternative option would be to postulate r : Nat and f : Fin r -> a, g : Fin r -> b.

In any case, I would like to know if this is at all a reasonable approach or if something major needs to be changed.

open NNReal in
noncomputable def Matrix.singularValue {π•œ : Type*} [RCLike π•œ]
  {n m : Type*} [Fintype n] [Fintype m] [DecidableEq m] (M : Matrix n m π•œ)
  : β„• β†’ ℝβ‰₯0 :=
  M.toEuclideanLin.toContinuousLinearMap.singularValue

-- TODO: Fix names
noncomputable def Matrix.leftsing {π•œ : Type*} [RCLike π•œ]
  {m n : Type*} [Fintype m] [Fintype n] [DecidableEq n]
  (a : Type*) [Fintype a] (f : a β†’ β„•)
  (M : Matrix m n π•œ)
  : Matrix m a π•œ := sorry

noncomputable def Matrix.rightsing {π•œ : Type*} [RCLike π•œ]
  {m n : Type*} [Fintype m] [Fintype n] [DecidableEq n]
  (b : Type*) [Fintype b] (g : b β†’ β„•)
  (M : Matrix m n π•œ)
  : Matrix n b π•œ := sorry

noncomputable def Matrix.centersing {π•œ : Type*} [RCLike π•œ]
  {m n : Type*} [Fintype m] [Fintype n] [DecidableEq n]
  (M : Matrix m n π•œ)
  (a : Type*) [Fintype a] (f : a β†’ β„•) (b : Type*) (g : b β†’ β„•)
  : Matrix a b π•œ := Matrix.of
    fun i j ↦ if f i = g j then RCLike.ofReal (M.singularValue (f i)).toReal else 0

theorem Matrix.SVD {π•œ : Type*} [RCLike π•œ]
  {m n : Type*} [Fintype m] [Fintype n] [DecidableEq n]
  (a : Type*) [Fintype a] (f : a β†’ β„•) (hf : f.Injective)
  (b : Type*) [Fintype b] (g : b β†’ β„•) (hg : g.Injective)
  (M : Matrix m n π•œ)
  (hfβ‚‚ : M.singularValue.support βŠ† Set.range f) (hgβ‚‚ : M.singularValue.support βŠ† Set.range g)
  : M.leftsing a f * M.centersing a f b g * (M.rightsing b g).conjTranspose = M := sorry

Niels Voss (Aug 31 2025 at 19:14):

Also, most of the time when SVD is used in proofs, it seems to go like "Let UΞ£V* be a singular value decomposition, ...". Are there any proofs which go "Notice that UΞ£V* meets the criteria for a singular value decomposition of M, so it must satisfy property X"?

Niels Voss (Sep 02 2025 at 05:34):

@FrΓ©dΓ©ric Dupuis Is this the most recent version of your spectral theorem code? https://github.com/leanprover-community/mathlib3/tree/dupuisf/linear_map_spectral I just want to get a general sense of what the API will look like for the spectral theorem

FrΓ©dΓ©ric Dupuis (Sep 02 2025 at 12:57):

I think the latest one is https://github.com/leanprover-community/mathlib3/tree/semilinear-paper-update

Niels Voss (Sep 03 2025 at 21:41):

I've had a lot of difficulty finding what I consider "good" definitions for singular values and SVD. I think I've settled on a few definitions that I consider reasonably optimal, but would it be possible to request a review of these definitions and theorem statements before I commit a bunch of time to trying to prove them? Or would it be best to try proving them first, and then get review of the definitions during the PR review?

Ruben Van de Velde (Sep 03 2025 at 22:28):

You're free to ask, but I suspect one of the questions will be "how easy is it to prove the first few dozen basic lemmas about the definition"

Niels Voss (Sep 05 2025 at 18:50):

There is a more general version of SVD which is defined for arbitrary bounded operators between Hilbert spaces, as described in https://www.mdpi.com/2227-7390/8/8/1346. However, I don't think that mathlib has the necessary tools to prove this (in particular, we need the spectral theorem for bounded self-adjoint operators on Hilbert spaces). Is it fine if I just work on the compact operator SVD for Hilbert spaces instead of trying to do the most general version?

Jireh Loreaux (Sep 05 2025 at 20:30):

  1. I think above I mentioned explicitly the compact case, so yes, go with that.
  2. The SVE mentioned in that paper I think is actually not particularly useful, which I argue is essentially because H β†’L[β„‚] H is type I as a von Neumann algebra, and so what we really want is atomic (i.e., a sequence), not some measurable function.
  3. In the setting of type II von Neumann algebras there is a function ΞΌ : Set.Ici 0 β†’ ℝβ‰₯0∞ (which is finite if the von Neumann algebra is finite) which plays the same role as the singular values. In the case when the algebra is infinite, ΞΌ (just like the singular values) is only interesting in the case when the operator is "compact" (i.e., is a norm limit of operators whose ranges have finite trace).

Jireh Loreaux (Sep 05 2025 at 20:31):

@Niels Voss where are your definitions exactly?

Niels Voss (Sep 05 2025 at 21:12):

I can post the formal definitions in a second, but in the meantime here's an informal sketch that I wrote up in Obsidian
image.png

Jireh Loreaux (Sep 05 2025 at 21:18):

seems reasonable.

Niels Voss (Sep 05 2025 at 21:29):

Here's what I have right now for formal definitions. (I'm not sure about the isometries in the "SVD" section, the idea is that maybe they could be used to prove a more familiar version of SVD. I'm considering deleting the ContinuousLinearMap.___SingularMatrix definitions because I think they are redundant with the Matrix.___SingularMatrix definitions.)

import Mathlib

open NNReal in
noncomputable def ContinuousLinearMap.singularValue {π•œ : Type*} [NontriviallyNormedField π•œ]
  {X : Type*} [SeminormedAddCommGroup X] [NormedSpace π•œ X]
  {Y : Type*} [SeminormedAddCommGroup Y] [NormedSpace π•œ Y]
  (T : X β†’L[π•œ] Y) (n : β„•) : ℝβ‰₯0 :=
  -- NOTE: We can't use `β¨… H ∈ {...}` because of
  -- https://leanprover-community.github.io/extras/pitfalls.html#accidental-double-iinf-or-isup
  -- Also, this involves a coercion Nat β†’ Cardinal, so I might redo that part of the definition
  -- later.
  β¨… H : {S : X β†’L[π•œ] Y // S.rank ≀ n}, β€–T - Hβ€–β‚Š

open NNReal in
noncomputable def Matrix.singularValue {π•œ : Type*} [RCLike π•œ]
  {n m : Type*} [Fintype n] [Fintype m] [DecidableEq m] (M : Matrix n m π•œ)
  : β„• β†’ ℝβ‰₯0 :=
  M.toEuclideanLin.toContinuousLinearMap.singularValue

section SVD

-- TODO: When I wrote up this section for the first time, I forgot that this only works for compact
-- linear maps, and not just continuous linear maps. So the following definitions will need an
-- extra hypothesis stating T is compact, and will probably live in the IsCompactOperator namespace
-- instead of the ContinuousLinearMap namespace.

variable {π•œ : Type*} [RCLike π•œ]
  {E : Type*} [NormedAddCommGroup E] [InnerProductSpace π•œ E] [CompleteSpace E]
  {F : Type*} [NormedAddCommGroup F] [InnerProductSpace π•œ F] [CompleteSpace F]
  (T : E β†’L[π•œ] F)

noncomputable def ContinuousLinearMap.rightSingularVector : β„• β†’ E :=
  have := T -- temporary trick to make sure that the definition includes T as a parameter
  sorry

noncomputable def ContinuousLinearMap.leftSingularVector : β„• β†’ F :=
  have := T
  sorry

-- TODO: Rename
open InnerProductSpace in
theorem ContinuousLinearMap.SVD (v : E)
  : T v = βˆ‘' n : β„•, (RCLike.ofReal ↑(T.singularValue n) : π•œ) β€’
      βŸͺv, T.rightSingularVector n⟫_π•œ β€’ T.leftSingularVector n := sorry

/--
The first `dim(E)` vectors of `ContinuousLinearMap.rightSingularVector` but bundled into an
orthonormal basis.
-/
noncomputable def ContinuousLinearMap.rightSingularOrthonormalBasis
  [FiniteDimensional π•œ E] {n : β„•} (hn : Module.finrank π•œ E = n) : OrthonormalBasis (Fin n) π•œ E :=
  OrthonormalBasis.mk (v := T.rightSingularVector ∘ Fin.val) sorry sorry

/--
The first `dim(F)` vectors of `ContinuousLinearMap.leftSingularVector` but bundled into an
orthonormal basis.
-/
noncomputable def ContinuousLinearMap.leftSingularOrthonormalBasis
  [FiniteDimensional π•œ F] {m : β„•} (hn : Module.finrank π•œ F = m) : OrthonormalBasis (Fin m) π•œ F :=
  OrthonormalBasis.mk (v := T.leftSingularVector ∘ Fin.val) sorry sorry

end SVD

section MatrixSVD

variable {π•œ : Type*} [RCLike π•œ]
  {m n : Type*} [Fintype m] [Fintype n] [DecidableEq n]
  -- a and b should be thought of as subtypes of β„•.
  {a b : Type*} [Fintype a] [Fintype b]
  (M : Matrix m n π•œ)

variable
  {E : Type*} [NormedAddCommGroup E] [InnerProductSpace π•œ E] [CompleteSpace E]
  {F : Type*} [NormedAddCommGroup F] [InnerProductSpace π•œ F] [CompleteSpace F]
  [FiniteDimensional π•œ E] [FiniteDimensional π•œ F]
  (T : E β†’L[π•œ] F)

noncomputable def ContinuousLinearMap.rightSingularMatrix
  (rb : Module.Basis n π•œ E) (g : b β†ͺ β„•)
  : Matrix n b π•œ := Matrix.of fun i j ↦ rb.repr (T.rightSingularVector (g j)) i

noncomputable def ContinuousLinearMap.leftSingularMatrix
  (lb : Module.Basis m π•œ F) (f : a β†ͺ β„•)
  : Matrix m a π•œ := Matrix.of fun i j ↦ lb.repr (T.leftSingularVector (f j)) i

noncomputable def ContinuousLinearMap.centerSingularMatrix
  (f : a β†ͺ β„•) (g : b β†ͺ β„•)
  : Matrix a b π•œ := Matrix.of fun i j ↦
    if f i = g j then RCLike.ofReal ↑(T.singularValue (f i)) else 0

theorem ContinuousLinearMap.matrixSVD
  (lb : Module.Basis m π•œ F) (rb : Module.Basis n π•œ E) (f : a β†ͺ β„•) (g : b β†ͺ β„•)
  (hf : T.singularValue.support βŠ† Set.range f) (hg : T.singularValue.support βŠ† Set.range g)
  : T.toMatrix rb lb
    = T.leftSingularMatrix lb f * T.centerSingularMatrix f g *
      (T.rightSingularMatrix rb g).conjTranspose := sorry

noncomputable def Matrix.rightSingularMatrix
  (g : b β†ͺ β„•)
  : Matrix n b π•œ :=
  M.toEuclideanLin.toContinuousLinearMap.rightSingularMatrix (Pi.basisFun π•œ n) g

noncomputable def Matrix.leftSingularMatrix
  (f : a β†ͺ β„•)
  : Matrix m a π•œ :=
  M.toEuclideanLin.toContinuousLinearMap.leftSingularMatrix (Pi.basisFun π•œ m) f

noncomputable def Matrix.centerSingularMatrix
  (f : a β†ͺ β„•) (g : b β†ͺ β„•)
  : Matrix a b π•œ :=
  M.toEuclideanLin.toContinuousLinearMap.centerSingularMatrix f g

theorem Matrix.SVD
  (f : a β†ͺ β„•) (g : b β†ͺ β„•)
  (hf : M.singularValue.support βŠ† Set.range f) (hg : M.singularValue.support βŠ† Set.range g)
  : M = M.leftSingularMatrix f * M.centerSingularMatrix f g *
    (M.rightSingularMatrix g).conjTranspose := sorry

end MatrixSVD

Niels Voss (Sep 05 2025 at 21:33):

The motivation for the nonstandard definition of the matrix SVD is that it is actually general enough to encompass the standard SVD, thin SVD, compact SVD, and truncated SVD at the same time.

Niels Voss (Sep 05 2025 at 21:39):

Here's the way it works:

Let M be an mΓ—n matrix, and let a and b be natural numbers. Then, if rank(M) <= a and rank(M) <= b, we have M = UΞ£V*, where:

  • U is mΓ—a (columns of U come from the left singular vectors f_1,f_2,...)
  • Ξ£ is aΓ—b (entries of Ξ£ come from the singular values of M)
  • V* is bΓ—n (columns of V come from the right singular vectors e_1,e_2,...)

We can recover some of the original forms of SVD as follows:

  • Full SVD: Set a = m, b = n
  • Thin SVD: Set a = b = min(m,n)
  • Compact SVD: Set a = b = rank(M)
  • Truncated SVD: Set a = b = t < rank(M), except M = UΞ£V* won’t hold.

Niels Voss (Sep 15 2025 at 16:35):

Since mathlib seems to prefer smaller PRs, could we for instance PR just the definition of the singular values and basic theorems, or would the whole SVD need to be PR'd at once?

Jireh Loreaux (Sep 15 2025 at 16:52):

Surely the PR for singular values can be on its own.

Niels Voss (Sep 15 2025 at 16:54):

Should we have a theorem which states that for the finite dimensional case, the squares of the singular values are the LinearMap.IsSymmetric.eigenvalues of T*T, or should we wait until that gets generalized to infinite dimensions?
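
Concretely I'm imagining something like this (rough statement only; isSymmetric_adjoint_comp_self is from the other thread and isn't in Mathlib yet, and the coercions may need tweaking):

example {π•œ E F : Type*} [RCLike π•œ]
    [NormedAddCommGroup E] [InnerProductSpace π•œ E] [FiniteDimensional π•œ E]
    [NormedAddCommGroup F] [InnerProductSpace π•œ F] [FiniteDimensional π•œ F]
    (T : E β†’L[π•œ] F) {n : β„•} (hn : Module.finrank π•œ E = n) (i : Fin n) :
    (T.singularValue i : ℝ) ^ 2
      = (LinearMap.isSymmetric_adjoint_comp_self (T : E β†’β‚—[π•œ] F)).eigenvalues hn i := sorry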

Jireh Loreaux (Sep 15 2025 at 17:07):

I think it's fine to add the finite dimensional version, because the infinite dimensional version will need a compactness assumption anyway.

Arnav Mehta (Oct 09 2025 at 20:44):

Hey, my name is Arnav. I am working with @Niels Voss and we are wondering if we should get singular values for matrices in at the same time as we PR singular values for linear maps, and also where in the Mathlib folder structure this file should be placed.

Jireh Loreaux (Oct 10 2025 at 13:55):

I would request that the singular values for matrices comes as a separate PR, especially to reduce the review burden on the initial PR. Getting definitions right is always hard, so making that initial PR do as few things as possible is better.

Jireh Loreaux (Oct 10 2025 at 13:56):

As for where the file should be placed, I think Analysis.Normed.Operator is probably a good folder for this.

Niels Voss (Nov 23 2025 at 08:39):

Sorry for the delays in working on this. Just a few super minor (though hopefully not bike-shedding) questions about the definition of singular values because these are slightly more annoying to change later:

  • Should the file be SingularValues.lean or SingularValue.lean or ApproximationNumbers.lean?
  • Should it be called ContinuousLinearMap.singularValues or ContinuousLinearMap.singularValue or ContinuousLinearMap.approximationNumbers?
  • It currently uses the SeminormedAddCommGroup typeclass. Should it use NormedAddCommGroup instead? (I chose to use the most general one, because that seems like the Mathlib convention)
  • Right now it is defined as
public noncomputable def ContinuousLinearMap.singularValue {π•œ : Type*} [NontriviallyNormedField π•œ]
  {X : Type*} [SeminormedAddCommGroup X] [NormedSpace π•œ X]
  {Y : Type*} [SeminormedAddCommGroup Y] [NormedSpace π•œ Y]
  (T : X β†’L[π•œ] Y) (n : β„•) : ℝβ‰₯0 :=
  β¨… H : {S : X β†’L[π•œ] Y // S.rank ≀ ↑n}, β€–T - Hβ€–β‚Š

where the ↑n casts n to a Cardinal. But in this topic: #mathlib4 > How to express a linear map has finite rank? some people criticized the use of Cardinal. Should I attempt to avoid Cardinal in the definition?

  • Regarding the module system, should the definition be exposed or not? If not, is there an attribute that automatically generates equalities?

Kevin Buzzard (Nov 23 2025 at 14:27):

Because the rank can be infinite, if you want to avoid cardinals you'd have to say "S has finite rank and the finrank <= n" -- what you have looks more natural to me (and less of a mouthful). You might want to make some API to avoid having to think about cardinals, e.g. a theorem saying "if S has finite rank and finrank <= n then singularValue T n <= ||T - S||_+" or maybe even the theorem saying that it's the inf of those things.
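
Something like this, say (a very rough sketch, with made-up names, against the definition above):

example {π•œ : Type*} [NontriviallyNormedField π•œ]
    {X : Type*} [SeminormedAddCommGroup X] [NormedSpace π•œ X]
    {Y : Type*} [SeminormedAddCommGroup Y] [NormedSpace π•œ Y]
    (T S : X β†’L[π•œ] Y) {n : β„•}
    [FiniteDimensional π•œ (LinearMap.range (S : X β†’β‚—[π•œ] Y))]
    (h : Module.finrank π•œ (LinearMap.range (S : X β†’β‚—[π•œ] Y)) ≀ n) :
    -- no cardinals in sight: finite rank is expressed via `FiniteDimensional` + `finrank`
    T.singularValue n ≀ β€–T - Sβ€–β‚Š := sorry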

Jireh Loreaux (Nov 23 2025 at 18:41):

  • Analysis/Normed/Operator/SingularValue.lean seems nice.
  • I think singularValue is nice for discoverability. I've only heard approximation numbers in relation to operators on Banach spaces, not for matrices.
  • Seminormed is the right choice.
  • Cardinal is fine because we have no notion of erank at this point.

Niels Voss (Nov 23 2025 at 18:43):

And singularValue is preferred over singularValues? I know that there's something like LinearMap.IsSymmetric.eigenvalues, which is plural

Jireh Loreaux (Nov 23 2025 at 18:44):

That's a good point. I don't have a strong opinion about pluralization.

Levi G (Dec 07 2025 at 14:13):

Hello everyone. My name is Levi, a statistics student in my 3rd year going into 4th year. Previously, in another PR, I attempted to formalize SVD with the help of AI, but my code was not general enough and the PR did not meet the Mathlib quality standards.

Levi G (Dec 07 2025 at 14:15):

However, I have made significant progress. I believe I have something significantly better than before, and that I have addressed most of the main issues raised by @Jireh Loreaux in a different thread.

Levi G (Dec 07 2025 at 14:17):

According to what I was told by @Jireh Loreaux, @Niels Voss was the one working on SVD, and according to some feedback I got from others, the ideal way of contributing to Mathlib is by discussing the design of the code with others and coming up with small snippets of code.

Levi G (Dec 07 2025 at 14:19):

I would like to share that progress and get your feedback especially on whether it is the right way of doing things.

Levi G (Dec 07 2025 at 14:23):

The code is quite large, which I have been told is not the ideal way of doing things. Anyway, here it is:
T17.1 - PR Prep SVD.lean

Niels Voss (Dec 07 2025 at 22:21):

Hi, for the reasons discussed in this thread, the best formalization of singular values is the approximation numbers defined in #32126. My current plan is to connect these approximation numbers with the square roots of the eigenvalues of T*T for a compact operator T between Hilbert spaces, but that proof is relatively complicated, which is why the PR is currently unfinished.

Levi G (Dec 08 2025 at 04:32):

Niels Voss said:

Hi, for the reasons discussed in this thread, the best formalization of singular values is the approximation numbers defined in #32126. My current plan is to connect these approximation numbers with the square roots of the eigenvalues of T*T for a compact operator T between Hilbert spaces, but that proof is relatively complicated, which is why the PR is currently unfinished.

Ok, understood. Thank you for the feedback. You are working on the infinite-dimensional case, which is much more difficult. But what do you think of my proof for the finite-dimensional case? (Is it at least directionally useful?) I can try to fix the defs, especially for singular values.

Levi G (Dec 09 2025 at 22:16):

@Niels Voss Here is a blueprint I have started working on that is in line with your approach (though it has quite a few sorrys). What do you think? Is it the right approach and structure?

Levi G (Dec 09 2025 at 22:45):

CompactOperatorSVD.lean

Levi G (Dec 10 2025 at 00:15):

The above :point_up: is work in progress, hence the many sorrys. Would love to hear your feedback.

Johan Commelin (Dec 10 2025 at 04:37):

What happened with your claim to drop this project?
Levi G said:

Johan Commelin Understood. It's just that I was hoping for some more feedback because this latest version is significantly better than anything that I had given before and also attempted to address their feedback. I hope you can at least take a look at this one. However, I have understood your point, it is a minefield so I will just stop pressing on this matter, hold-off completely and move on to other projects. Thank you for your feedback.

Snir Broshi (Dec 12 2025 at 02:07):

Another SVD implementation #mathlib4 > Matrix analysis

Levi G (Dec 12 2025 at 20:14):

Johan Commelin said:

What happened with your claim to drop this project?
Levi G said:

Johan Commelin Understood. It's just that I was hoping for some more feedback because this latest version is significantly better than anything that I had given before and also attempted to address their feedback. I hope you can at least take a look at this one. However, I have understood your point, it is a minefield so I will just stop pressing on this matter, hold-off completely and move on to other projects. Thank you for your feedback.

Sorry @Johan Commelin. After coming up with the finite-dimensional SVD, I couldn't help myself: I wanted feedback on it and wanted to be of assistance to @Niels Voss, who was working on the more general and difficult case of SVD for compact operators on Hilbert spaces and its API. As of right now I have made significant progress on the general case and will soon be sharing that progress, if that's alright.

Levi G (Dec 12 2025 at 20:23):

The progress being:

  β€’ All s-number axioms proven (S2, S3 left/right)
  β€’ Rayleigh Quotient Theorem - infinite-dimensional case for compact self-adjoint operators
  β€’ Spectral Theorem for Compact Self-Adjoint Operators on Hilbert Spaces
  β€’ Compactness characterization of operators

Levi G (Dec 12 2025 at 20:43):

It's a big file though.

Jireh Loreaux (Dec 12 2025 at 21:39):

@Levi G the fact that it's a big file is part of the problem. At this point, you don't have experience contributing to Mathlib, so a contribution of this magnitude is very hard to review. If you start with smaller projects and learn good design patterns from receiving review, then it gets easier because your code is higher quality. Let me highlight one example as evidence:

theorem IsCompactOperator.eigenvalue_tendsto_zero
    (hT : IsCompactOperator T) (_hSA : IsSelfAdjoint T)
    {ΞΌ : β„• β†’ ℝ} {e : β„• β†’ E}
    (he_orth : Orthonormal π•œ e)
    (he_eigen : βˆ€ n, T (e n) = (ΞΌ n : π•œ) β€’ (e n))
    (hΞΌ_antitone : Antitone (fun n ↦ |ΞΌ n|)) :
    Tendsto (fun n ↦ |ΞΌ n|) atTop (𝓝 0) := by

This theorem has nothing to do with selfadjoint operators, or real eigenvalues, or Antitone in any fundamental way. That is, what's really going on here is that since e is orthonormal, it converges weakly to zero, and compact operators take weakly convergent sequences to norm convergent ones.
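
Schematically, the more fundamental statement is something like this (names made up, not checked):

-- orthonormal sequences tend weakly to zero, and compact operators turn that into norm convergence
example {π•œ E F : Type*} [RCLike π•œ]
    [NormedAddCommGroup E] [InnerProductSpace π•œ E]
    [NormedAddCommGroup F] [NormedSpace π•œ F]
    (T : E β†’L[π•œ] F) (hT : IsCompactOperator T)
    {e : β„• β†’ E} (he : Orthonormal π•œ e) :
    Filter.Tendsto (fun n ↦ β€–T (e n)β€–) Filter.atTop (nhds 0) := sorry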

Again, I would love for you to become a productive member of the community, but please start with smaller projects. It may even be the case that there are pieces of your code worth salvaging, but I don't have the energy or the time to try to sift through 2000 lines of code looking for a diamond in the rough.

Levi G (Dec 12 2025 at 22:20):

@Jireh Loreaux Thank you for your feedback and time spent reviewing my code. Long, bad-quality code is hard to review - Understood. My new file is around 3k lines - too long.

Levi G (Dec 12 2025 at 22:24):

In light of the difficulty of reviewing long, low-quality code, I'll share what I think are the parts of interest in my new code: the Rayleigh Quotient Theorem and the Spectral Theorem for Compact Self-Adjoint Operators on Hilbert Spaces. Then I'll hold off from the project (finally) and move on to smaller projects where I can improve.

Levi G (Dec 12 2025 at 22:51):

/-- Key Lemma 2: For a nonzero compact self-adjoint operator, β€–Tβ€– or -β€–Tβ€– is an eigenvalue.

The proof uses the Rayleigh quotient: for self-adjoint `T`, the operator norm is the
supremum of the absolute Rayleigh quotient on the unit sphere. For compact `T`, we
build a maximizing sequence on the unit sphere and use compactness to extract a
convergent subsequence, whose limit is an eigenvector.

This theorem is the compact self-adjoint case that Mathlib's `Rayleigh.lean` leaves
as a TODO; here we supply a standalone implementation using `hOpNorm_eq_sSup_absRayleigh`
(proven via the Polarization Identity) as the key spectral-theoretic input.
-/
theorem IsSelfAdjoint.exists_norm_eq_abs_eigenvalue (hSA : IsSelfAdjoint T)
    (hT : IsCompactOperator T) (hT_ne : T β‰  0) :
    βˆƒ (ΞΌ : ℝ) (e : E), β€–eβ€– = 1 ∧ T e = (ΞΌ : π•œ) β€’ e ∧ |ΞΌ| = β€–Tβ€– := by
  -- ═══════════════════════════════════════════════════════════════════════════
  -- PROOF STRATEGY (completing the TODO in Mathlib's Rayleigh.lean):
  -- ═══════════════════════════════════════════════════════════════════════════
  --
  -- For a nonzero compact self-adjoint operator T:
  -- 1. Either sup or inf of the Rayleigh quotient has absolute value = β€–Tβ€–
  -- 2. Take a maximizing (or minimizing) sequence on the unit sphere
  -- 3. Use compactness to extract a convergent subsequence of T(xβ‚™)
  -- 4. Show the sequence xβ‚™ itself converges using β€–Txβ‚™ - ΞΌxβ‚™β€– β†’ 0
  -- 5. The limit is an eigenvector with eigenvalue ΞΌ where |ΞΌ| = β€–Tβ€–
  --
  -- Key insight: For self-adjoint T and maximizing sequence xβ‚™ with βŸͺTxβ‚™,xβ‚™βŸ« β†’ M:
  --   β€–Txβ‚™ - MΒ·xβ‚™β€–Β² = β€–Txβ‚™β€–Β² - 2MβŸͺTxβ‚™,xβ‚™βŸ« + MΒ²
  --                 = βŸͺTΒ²xβ‚™,xβ‚™βŸ« - 2MβŸͺTxβ‚™,xβ‚™βŸ« + MΒ² β†’ 0
  -- because βŸͺTΒ²xβ‚™,xβ‚™βŸ« ≀ β€–Tβ€–Β·βŸͺTxβ‚™,xβ‚™βŸ« β†’ β€–Tβ€–Β·M and M = β€–Tβ€– or M = -β€–Tβ€–.
  -- ═══════════════════════════════════════════════════════════════════════════

  -- We use hasEigenvector_of_isMaxOn/isMinOn from Rayleigh.lean
  -- The key is to show the Rayleigh quotient attains its extremum for compact T

  -- Step 1: Define the Rayleigh quotient supremum and infimum
  let M_sup := ⨆ x : { x : E // x β‰  0 }, T.rayleighQuotient x
  let M_inf := β¨… x : { x : E // x β‰  0 }, T.rayleighQuotient x

  -- Step 2: For self-adjoint T, β€–Tβ€– = max(|M_sup|, |M_inf|)
  -- So either |M_sup| = β€–Tβ€– or |M_inf| = β€–Tβ€–

  -- Step 3: The closure of T applied to the unit sphere is compact
  have hK : IsCompact (closure (T '' Metric.sphere (0 : E) 1)) := by
    have h1 : Metric.sphere (0 : E) 1 βŠ† Metric.closedBall 0 1 := Metric.sphere_subset_closedBall
    have h2 : T '' Metric.sphere (0 : E) 1 βŠ† T '' Metric.closedBall 0 1 := Set.image_mono h1
    have h3 : closure (T '' Metric.sphere (0 : E) 1) βŠ† closure (T '' Metric.closedBall 0 1) :=
      closure_mono h2
    exact (hT.isCompact_closure_image_closedBall 1).of_isClosed_subset isClosed_closure h3

  -- Step 4: Get a maximizing sequence and use compactness
  -- Since T β‰  0, the sphere is nonempty and there exist nonzero vectors
  have hE_nontrivial : Nontrivial E := by
    by_contra h
    have : Subsingleton E := not_nontrivial_iff_subsingleton.mp h
    have hT_eq : T = 0 := by ext x; simp [Subsingleton.eq_zero x]
    exact hT_ne hT_eq

  obtain ⟨vβ‚€, hvβ‚€βŸ© : βˆƒ vβ‚€ : E, vβ‚€ β‰  0 := exists_ne 0
  have hSphere_nonempty : (Metric.sphere (0 : E) 1).Nonempty := by
    use (β€–v₀‖⁻¹ : π•œ) β€’ vβ‚€
    simp only [Metric.mem_sphere, dist_zero_right, norm_smul]
    rw [norm_inv, RCLike.norm_coe_norm, inv_mul_cancelβ‚€]
    exact norm_ne_zero_iff.mpr hvβ‚€

  -- The proof proceeds by finding where the Rayleigh quotient achieves its maximum or minimum
  -- Using sequential compactness from the compact operator

  -- For the full proof, we need to:
  -- 1. Take a sequence xβ‚™ in the unit sphere with βŸͺTxβ‚™, xβ‚™βŸ« approaching sup (or inf)
  -- 2. Extract a convergent subsequence of Txβ‚™ using compactness
  -- 3. Show xβ‚™ converges using the self-adjoint property
  -- 4. Apply hasEigenvector_of_isMaxOn or hasEigenvector_of_isMinOn

  -- The Rayleigh quotient is continuous
  have hRayleigh_cont : Continuous T.reApplyInnerSelf := T.reApplyInnerSelf_continuous

  -- For compact operators, we use sequential compactness
  -- The image of unit sphere under T is relatively compact

  -- Key technical step: Show that a maximizing sequence converges
  -- This requires showing β€–Txβ‚™ - ΞΌxβ‚™β€– β†’ 0 for the limiting eigenvalue ΞΌ

  -- We show existence by cases: either the sup or the inf gives us β€–Tβ€–

  -- Case analysis: we consider whether sup β‰₯ |inf| or inf ≀ -|sup|
  -- In either case, we get an eigenvalue with absolute value β€–Tβ€–

  -- The detailed proof requires careful analysis of the maximizing sequence.
  -- Here we provide the complete argument:

  -- First, note that for any unit vector x: |βŸͺTx, x⟫| ≀ β€–Tβ€–
  have hRayleigh_bound : βˆ€ x : E, β€–xβ€– = 1 β†’ |RCLike.re βŸͺT x, x⟫| ≀ β€–Tβ€– := by
    intro x hx
    have h1 : β€–βŸͺT x, xβŸ«β€– ≀ β€–T xβ€– * β€–xβ€– := norm_inner_le_norm (T x) x
    have h2 : β€–T xβ€– ≀ β€–Tβ€– * β€–xβ€– := T.le_opNorm x
    calc |RCLike.re βŸͺT x, x⟫| ≀ β€–βŸͺT x, xβŸ«β€– := RCLike.abs_re_le_norm _
      _ ≀ β€–T xβ€– * β€–xβ€– := h1
      _ ≀ β€–Tβ€– * β€–xβ€– * β€–xβ€– := by nlinarith [norm_nonneg x, norm_nonneg (T x)]
      _ = β€–Tβ€– := by rw [hx]; ring

  -- Local spectral-norm characterization for self-adjoint operators:
  -- the operator norm equals the supremum of the absolute Rayleigh quotient
  -- over the unit sphere.
  -- This is a standard result in functional analysis (see e.g., Reed-Simon Vol 1).
  have hOpNorm_eq_sSup_absRayleigh
      (hSA : IsSelfAdjoint T) :
      sSup (Set.range (fun x : {x : E // β€–xβ€– = 1} ↦ |RCLike.re βŸͺT x, x⟫|)) = β€–Tβ€– := by
    classical
    let S := Set.range (fun x : {x : E // β€–xβ€– = 1} ↦ |RCLike.re βŸͺT x, x⟫|)
    -- Upper bound: |βŸͺTx, x⟫| ≀ β€–Tβ€– for unit vectors
    have hS_upper : βˆ€ r ∈ S, r ≀ β€–Tβ€– := by
      rintro r ⟨x, rfl⟩
      calc |RCLike.re βŸͺT x, x⟫| ≀ β€–βŸͺT x, xβŸ«β€– := RCLike.abs_re_le_norm _
        _ ≀ β€–T (x : E)β€– * β€–(x : E)β€– := norm_inner_le_norm _ _
        _ ≀ β€–Tβ€– * β€–(x : E)β€– * β€–(x : E)β€– := by gcongr; exact T.le_opNorm _
        _ = β€–Tβ€– := by rw [x.property]; ring
    have hS_bdd : BddAbove S := βŸ¨β€–Tβ€–, hS_upper⟩
    have hS_nonneg : βˆ€ r ∈ S, 0 ≀ r := by
      rintro r ⟨x, rfl⟩
      exact abs_nonneg _
    -- Nonemptiness: we need a unit vector to show S is nonempty
    have hS_nonempty : S.Nonempty := by
      obtain ⟨v, hv⟩ := hSphere_nonempty
      have hv_norm : β€–vβ€– = 1 := mem_sphere_zero_iff_norm.mp hv
      let u : {x : E // β€–xβ€– = 1} := ⟨v, hv_norm⟩
      refine ⟨|RCLike.re βŸͺT u, u⟫|, u, rfl⟩
    apply le_antisymm
    Β· exact csSup_le hS_nonempty hS_upper
    -- Lower bound: β€–Tβ€– ≀ sSup S using the Polarization Identity
    -- For self-adjoint T: 4 ReβŸͺTx, y⟫ = ReβŸͺT(x+y), x+y⟫ - ReβŸͺT(x-y), x-y⟫
    -- Combined with the parallelogram law, this gives |ReβŸͺTu, v⟫| ≀ M for unit u, v
    Β· let M := sSup S
      -- Step 1: M is an upper bound for |ReβŸͺTu, u⟫| when β€–uβ€– = 1
      have hM_bound : βˆ€ z : E, β€–zβ€– = 1 β†’ |RCLike.re βŸͺT z, z⟫| ≀ M := fun z hz ↦
        le_csSup hS_bdd ⟨⟨z, hz⟩, rfl⟩
      -- Step 2: Extend to |ReβŸͺTv, v⟫| ≀ M * β€–vβ€–Β² for all v
      have h_bound_all : βˆ€ v : E, |RCLike.re βŸͺT v, v⟫| ≀ M * β€–vβ€–^2 := by
        intro v
        by_cases hv : v = 0
        Β· simp [hv]
        have hv_pos : 0 < β€–vβ€– := norm_pos_iff.mpr hv
        let u := (↑‖v‖⁻¹ : π•œ) β€’ v
        have hu : β€–uβ€– = 1 := by
          simp only [u, norm_smul, RCLike.norm_ofReal, abs_of_pos (inv_pos.mpr hv_pos)]
          exact inv_mul_cancelβ‚€ (ne_of_gt hv_pos)
        have h_le := hM_bound u hu
        -- v = β€–vβ€– β€’ u, so βŸͺTv, v⟫ = β€–vβ€–Β² * βŸͺTu, u⟫
        have hv_eq : v = (↑‖vβ€– : π•œ) β€’ u := by
          simp only [u, smul_smul, RCLike.ofReal_inv]
          rw [mul_inv_cancelβ‚€ (RCLike.ofReal_ne_zero.mpr (ne_of_gt hv_pos)), one_smul]
        have h_inner_scale : βŸͺT v, v⟫ = (↑(β€–vβ€–^2) : π•œ) * βŸͺT u, u⟫ := by
          conv_lhs => rw [hv_eq]
          simp only [map_smul, inner_smul_left, inner_smul_right, RCLike.conj_ofReal]
          rw [sq, RCLike.ofReal_mul]; ring
        rw [h_inner_scale]
        simp only [RCLike.mul_re, RCLike.ofReal_re, RCLike.ofReal_im, zero_mul, sub_zero]
        rw [abs_mul, abs_of_pos (sq_pos_of_pos hv_pos)]
        calc β€–vβ€–^2 * |RCLike.re βŸͺT u, u⟫| ≀ β€–vβ€–^2 * M :=
               mul_le_mul_of_nonneg_left h_le (sq_nonneg _)
          _ = M * β€–vβ€–^2 := mul_comm _ _
      -- Step 3: Self-adjoint symmetry: ReβŸͺTx, y⟫ = ReβŸͺTy, x⟫
      have hSA_symm : βˆ€ a b : E, RCLike.re βŸͺT a, b⟫ = RCLike.re βŸͺT b, a⟫ := by
        intro a b
        -- For self-adjoint T: βŸͺTa, b⟫ = βŸͺa, Tb⟫ = conj βŸͺTb, a⟫
        -- Taking Re of both sides: ReβŸͺTa, b⟫ = Re(conj βŸͺTb, a⟫) = ReβŸͺTb, a⟫
        have h1 : βŸͺT a, b⟫ = βŸͺa, T b⟫ := by
          -- adjoint_inner_right: βŸͺb, (adjoint T) a⟫ = βŸͺT b, a⟫
          have h := ContinuousLinearMap.adjoint_inner_right T b a
          rw [hSA.adjoint_eq] at h
          -- h : βŸͺb, T a⟫ = βŸͺT b, a⟫
          -- We need: βŸͺT a, b⟫ = βŸͺa, T b⟫
          -- inner_conj_symm x y : conj βŸͺy, x⟫ = βŸͺx, y⟫
          -- So inner_conj_symm (T a) b : conj βŸͺb, T a⟫ = βŸͺT a, b⟫
          -- Thus (inner_conj_symm (T a) b).symm : βŸͺT a, b⟫ = conj βŸͺb, T a⟫
          have h2 : βŸͺT a, b⟫ = starRingEnd π•œ βŸͺb, T a⟫ := (inner_conj_symm (T a) b).symm
          have h3 : βŸͺa, T b⟫ = starRingEnd π•œ βŸͺT b, a⟫ := (inner_conj_symm a (T b)).symm
          rw [h2, h, h3]
        calc RCLike.re βŸͺT a, b⟫ = RCLike.re βŸͺa, T b⟫ := by rw [h1]
          _ = RCLike.re (starRingEnd π•œ βŸͺT b, a⟫) := by rw [inner_conj_symm]
          _ = RCLike.re βŸͺT b, a⟫ := RCLike.conj_re _
      -- Step 4: Polarization identity for self-adjoint T
      have h_polarization : βˆ€ x y : E,
          4 * RCLike.re βŸͺT x, y⟫ = RCLike.re βŸͺT (x+y), x+y⟫ - RCLike.re βŸͺT (x-y), x-y⟫ := by
        intro x y
        simp only [map_add, map_sub, inner_add_left, inner_add_right,
          inner_sub_left, inner_sub_right, map_add, map_sub]
        have h_sym := hSA_symm x y
        linarith

Levi G (Dec 12 2025 at 22:56):

      -- Step 5: For unit vectors u, v: |ReβŸͺTu, v⟫| ≀ M
      have h_unit_bound : βˆ€ u v : E, β€–uβ€– = 1 β†’ β€–vβ€– = 1 β†’ |RCLike.re βŸͺT u, v⟫| ≀ M := by
        intro u v hu hv
        have h_pol := h_polarization u v
        -- Parallelogram law: β€–u+vβ€–Β² + β€–u-vβ€–Β² = 2(β€–uβ€–Β² + β€–vβ€–Β²)
        have h_par : β€–u + vβ€– * β€–u + vβ€– + β€–u - vβ€– * β€–u - vβ€– = 2 * (β€–uβ€– * β€–uβ€– + β€–vβ€– * β€–vβ€–) :=
          parallelogram_law_with_norm π•œ u v
        rw [hu, hv] at h_par
        simp only [one_mul] at h_par
        -- Convert to ^2 form
        have h_par' : β€–u + vβ€–^2 + β€–u - vβ€–^2 = 4 := by
          simp only [sq]; linarith
        -- |4 * ReβŸͺTu, v⟫| ≀ |ReβŸͺT(u+v), u+v⟫| + |ReβŸͺT(u-v), u-v⟫|
        have h_abs_le : |4 * RCLike.re βŸͺT u, v⟫| ≀
            |RCLike.re βŸͺT (u+v), u+v⟫| + |RCLike.re βŸͺT (u-v), u-v⟫| := by
          rw [h_pol]
          exact abs_sub _ _
        calc |RCLike.re βŸͺT u, v⟫|
          = |4 * RCLike.re βŸͺT u, v⟫| / 4 := by
              rw [abs_mul, abs_of_pos (by norm_num : (0:ℝ) < 4)]; ring
          _ ≀ (|RCLike.re βŸͺT (u+v), u+v⟫| + |RCLike.re βŸͺT (u-v), u-v⟫|) / 4 := by
              apply div_le_div_of_nonneg_right h_abs_le (by norm_num : (0:ℝ) ≀ 4)
          _ ≀ (M * β€–u+vβ€–^2 + M * β€–u-vβ€–^2) / 4 := by
              apply div_le_div_of_nonneg_right _ (by norm_num : (0:ℝ) ≀ 4)
              exact add_le_add (h_bound_all _) (h_bound_all _)
          _ = M * (β€–u+vβ€–^2 + β€–u-vβ€–^2) / 4 := by ring
          _ = M * 4 / 4 := by rw [h_par']
          _ = M := by ring
      -- Step 6: Use opNorm_le_bound with the polarization bound
      apply ContinuousLinearMap.opNorm_le_bound _ (Real.sSup_nonneg hS_nonneg)
      intro x
      by_cases hx : x = 0
      Β· simp [hx]
      have hx_pos : 0 < β€–xβ€– := norm_pos_iff.mpr hx
      -- Normalize x to unit vector
      let u := (↑‖x‖⁻¹ : π•œ) β€’ x
      have hu : β€–uβ€– = 1 := by
        simp only [u, norm_smul, RCLike.norm_ofReal, abs_of_pos (inv_pos.mpr hx_pos)]
        exact inv_mul_cancelβ‚€ (ne_of_gt hx_pos)
      by_cases hTu : T u = 0
      Β· -- If Tu = 0, then Tx = β€–xβ€– β€’ Tu = 0, so β€–Txβ€– = 0 ≀ M * β€–xβ€–
        have hTx : T x = (↑‖xβ€– : π•œ) β€’ T u := by
          simp only [u, map_smul, smul_smul, RCLike.ofReal_inv]
          rw [mul_inv_cancelβ‚€ (RCLike.ofReal_ne_zero.mpr (ne_of_gt hx_pos)), one_smul]
        simp only [hTx, hTu, smul_zero, norm_zero]
        exact mul_nonneg (Real.sSup_nonneg hS_nonneg) (norm_nonneg _)
      -- Use v = Tu / β€–Tuβ€– as the test vector
      have hTu_pos : 0 < β€–T uβ€– := norm_pos_iff.mpr hTu
      let v := (↑‖T u‖⁻¹ : π•œ) β€’ T u
      have hv : β€–vβ€– = 1 := by
        simp only [v, norm_smul, RCLike.norm_ofReal, abs_of_pos (inv_pos.mpr hTu_pos)]
        exact inv_mul_cancelβ‚€ (ne_of_gt hTu_pos)
      -- βŸͺTu, v⟫ = βŸͺTu, Tu/β€–Tuβ€–βŸ« = β€–Tuβ€–
      have h_inner : RCLike.re βŸͺT u, v⟫ = β€–T uβ€– := by
        simp only [v, inner_smul_right]
        rw [inner_self_eq_norm_sq_to_K]
        simp only [RCLike.mul_re, RCLike.ofReal_re, RCLike.ofReal_im, zero_mul, sub_zero]
        have h_re : RCLike.re ((↑‖T uβ€– : π•œ) ^ 2) = β€–T uβ€–^2 := by
          rw [← RCLike.ofReal_pow]; exact RCLike.ofReal_re _
        rw [h_re]
        field_simp [ne_of_gt hTu_pos]
      have h_Tu_le : β€–T uβ€– ≀ M := by
        have := h_unit_bound u v hu hv
        rw [h_inner, abs_of_pos hTu_pos] at this
        exact this
      -- Tx = β€–xβ€– β€’ Tu, so β€–Txβ€– = β€–xβ€– * β€–Tuβ€– ≀ β€–xβ€– * M = M * β€–xβ€–
      have hTx : T x = (↑‖xβ€– : π•œ) β€’ T u := by
        simp only [u, map_smul, smul_smul, RCLike.ofReal_inv]
        rw [mul_inv_cancelβ‚€ (RCLike.ofReal_ne_zero.mpr (ne_of_gt hx_pos)), one_smul]
      calc β€–T xβ€– = β€–(↑‖xβ€– : π•œ) β€’ T uβ€– := by rw [hTx]
        _ = β€–(↑‖xβ€– : π•œ)β€– * β€–T uβ€– := norm_smul _ _
        _ = β€–xβ€– * β€–T uβ€– := by simp only [RCLike.norm_ofReal, abs_of_pos hx_pos]
        _ ≀ β€–xβ€– * M := by nlinarith [h_Tu_le, hx_pos]
        _ = M * β€–xβ€– := mul_comm _ _

Levi G (Dec 12 2025 at 23:00):

 -- Above we proved `hOpNorm_eq_sSup_absRayleigh`: the operator norm of a

Β  -- self-adjoint operator equals the supremum of the absolute Rayleigh quotient

Β  -- over the unit sphere. The proof uses the Polarization Identity:

Β  -- Β  4 ReβŸͺTx, y⟫ = ReβŸͺT(x+y), x+y⟫ - ReβŸͺT(x-y), x-y⟫

Β  -- Combined with the Parallelogram Law, this yields |ReβŸͺTu, v⟫| ≀ M for

Β  -- unit vectors u, v (where M = sSup S), from which β€–Tβ€– ≀ M follows.

Β  --
  -- Now we use this to find eigenvectors. For compact T, we use the sequential
  -- characterization: there exists a sequence xβ‚™ with β€–xβ‚™β€– = 1 and
  -- |βŸͺTxβ‚™, xβ‚™βŸ«| β†’ β€–Tβ€–. By compactness, Txβ‚™ has a convergent subsequence, and
  -- the self-adjoint property gives convergence of xβ‚™ itself.
  --
  -- These technical steps (maximizing sequence, subsequence extraction, and
  -- passage to the limit) are carried out explicitly below in
  -- `exists_eigenvector_from_maximizing_seq`, rather than being delegated
  -- directly to the Rayleigh lemmas `hasEigenvector_of_isMaxOn` and
  -- `hasEigenvector_of_isMinOn`.
  -- Helper lemma: from a maximizing sequence for the Rayleigh quotient,
  -- extract an eigenvector whose eigenvalue has absolute value equal to the
  -- limit of the Rayleigh values.
  have exists_eigenvector_from_maximizing_seq
      (x : β„• β†’ E) (hx_norm : βˆ€ n, β€–x nβ€– = 1)
      (L : ℝ) (hL_nonneg : 0 ≀ L) (hL_eq_norm : L = β€–Tβ€–)
      (hx_lim : Tendsto (fun n ↦ |RCLike.re βŸͺT (x n), x n⟫|) atTop (𝓝 L)) :
      βˆƒ (ΞΌ : ℝ) (e : E), β€–eβ€– = 1 ∧ T e = (ΞΌ : π•œ) β€’ e ∧ |ΞΌ| = L := by
    -- ═══════════════════════════════════════════════════════════════════════
    -- PROOF STRUCTURE:
    -- Step 1: Extract subsequence with constant sign using tendsto_abs_extract_subseq
    -- Step 2: Extract convergent subsequence of T x using compactness
    -- Step 3: Show x itself converges using tendsto_norm_sub_smul_zero
    -- Step 4: Pass to limit to get eigenvector equation
    -- ═══════════════════════════════════════════════════════════════════════
    -- Step 1: Extract subsequence where Re βŸͺT x_n, x_n⟫ β†’ Οƒ * L (with |Οƒ| = 1)
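    -- (Why such a subsequence exists: since |Re βŸͺT (x n), x n⟫| β†’ L, at least one
    -- sign occurs infinitely often, and along a subsequence of constant sign the
    -- values themselves converge to L or to -L, i.e. to Οƒ * L with |Οƒ| = 1.)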
    obtain βŸ¨Ο†β‚, Οƒ, hφ₁_mono, hΟƒ, hφ₁_lim⟩ :=
      tendsto_abs_extract_subseq hL_nonneg hx_lim
    -- The subsequence x ∘ φ₁ still has unit norm
    have hx_φ₁_norm : βˆ€ n, β€–x (φ₁ n)β€– = 1 := fun n ↦ hx_norm (φ₁ n)
    -- The eigenvalue candidate is ΞΌ = Οƒ * L
    let ΞΌ := Οƒ * L
    -- |ΞΌ| = |Οƒ| * L = 1 * L = L (since |Οƒ| = 1 and L β‰₯ 0)
    have hΞΌ_abs : |ΞΌ| = L := by
      calc |ΞΌ| = |Οƒ * L| := rfl
        _ = |Οƒ| * |L| := abs_mul Οƒ L
        _ = 1 * |L| := by rw [hΟƒ]
        _ = |L| := one_mul _
        _ = L := abs_of_nonneg hL_nonneg
    -- Step 2: Use compactness to extract convergent subsequence of T (x (φ₁ n))
    -- The sequence T (x (φ₁ n)) lies in closure (T '' sphere 0 1) which is compact
    have hT_in_compact : βˆ€ n, T (x (φ₁ n)) ∈ closure (T '' Metric.sphere (0 : E) 1) := by
      intro n
      apply subset_closure
      exact ⟨x (φ₁ n), Metric.mem_sphere.mpr (by simp [hx_φ₁_norm n]), rfl⟩
    obtain ⟨y, hy_mem, Ο†β‚‚, hΟ†β‚‚_mono, hTx_lim⟩ := hK.tendsto_subseq hT_in_compact
    -- Combined subsequence
    let Ο† := φ₁ ∘ Ο†β‚‚
    have hΟ†_mono : StrictMono Ο† := hφ₁_mono.comp hΟ†β‚‚_mono
    -- Properties of the combined subsequence
    have hx_Ο†_norm : βˆ€ n, β€–x (Ο† n)β€– = 1 := fun n ↦ hx_norm (Ο† n)
    have hTx_Ο†_lim : Tendsto (fun n ↦ T (x (Ο† n))) atTop (𝓝 y) := hTx_lim
    have hRayleigh_Ο†_lim : Tendsto (fun n ↦ RCLike.re βŸͺT (x (Ο† n)), x (Ο† n)⟫) atTop (𝓝 ΞΌ) :=
      hφ₁_lim.comp hΟ†β‚‚_mono.tendsto_atTop
    -- Step 3: Show β€–T x_Ο†(n) - ΞΌ x_Ο†(n)β€– β†’ 0 using tendsto_norm_sub_smul_zero
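    -- (The computation behind this step, informally: for a unit vector z,
    --   β€–T z - ΞΌ β€’ zβ€–Β² = β€–T zβ€–Β² - 2 * ΞΌ * ReβŸͺT z, z⟫ + ΞΌΒ² ≀ β€–Tβ€–Β² - 2 * ΞΌ * ReβŸͺT z, z⟫ + ΞΌΒ²,
    -- and since |ΞΌ| = β€–Tβ€– and ReβŸͺT (x (Ο† n)), x (Ο† n)⟫ β†’ ΞΌ, the bound tends to
    -- ΞΌΒ² - 2 * ΞΌΒ² + ΞΌΒ² = 0.)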
    have hΞΌ_eq_norm : |ΞΌ| = β€–Tβ€– := by rw [hΞΌ_abs, hL_eq_norm]
    have h_almost_eig : Tendsto (fun n ↦ β€–T (x (Ο† n)) - (ΞΌ : π•œ) β€’ x (Ο† n)β€–) atTop (𝓝 0) :=
      IsSelfAdjoint.tendsto_norm_sub_smul_zero (T := T) hSA hx_Ο†_norm hΞΌ_eq_norm hRayleigh_Ο†_lim
    -- Step 4: Since T x_Ο†(n) β†’ y and β€–T x_Ο†(n) - ΞΌ x_Ο†(n)β€– β†’ 0, we have ΞΌ x_Ο†(n) β†’ y
    -- So x_Ο†(n) β†’ y/ΞΌ (if ΞΌ β‰  0)
    have h_smul_lim : Tendsto (fun n ↦ (ΞΌ : π•œ) β€’ x (Ο† n)) atTop (𝓝 y) := by
      -- T x_Ο†(n) - ΞΌ x_Ο†(n) β†’ 0, so ΞΌ x_Ο†(n) β†’ T x_Ο†(n) β†’ y
      have h1 : Tendsto (fun n ↦ T (x (Ο† n)) - (ΞΌ : π•œ) β€’ x (Ο† n)) atTop (𝓝 0) :=
        tendsto_zero_iff_norm_tendsto_zero.mpr h_almost_eig
      have h2 := hTx_Ο†_lim.sub h1
      simp only [sub_zero] at h2
      -- h2 : Tendsto (fun n ↦ T (x (Ο† n)) - (T (x (Ο† n)) - ΞΌ β€’ x (Ο† n))) atTop (𝓝 y)
      -- which simplifies to Tendsto (fun n ↦ ΞΌ β€’ x (Ο† n)) atTop (𝓝 y)
      convert h2 using 1
      ext n
      simp only [sub_sub_cancel]
    -- Case split: L = 0 or L β‰  0
    by_cases hL_zero : L = 0
    Β· -- If L = 0, then ΞΌ = Οƒ * 0 = 0 and T = 0 (since β€–Tβ€– = L = 0)
      -- But T β‰  0, contradiction
      exfalso
      have hT_zero : β€–Tβ€– = 0 := hL_eq_norm β–Έ hL_zero
      rw [opNorm_zero_iff] at hT_zero
      exact hT_ne hT_zero
    Β· -- If L β‰  0, then ΞΌ β‰  0 and we can divide
      have hΞΌ_ne : (ΞΌ : π•œ) β‰  0 := by
        simp only [ne_eq, RCLike.ofReal_eq_zero]
        intro hΞΌ_zero
        have : |ΞΌ| = 0 := abs_eq_zero.mpr hΞΌ_zero
        rw [hΞΌ_abs] at this
        exact hL_zero this
      -- x_Ο†(n) = (1/ΞΌ) β€’ (ΞΌ β€’ x_Ο†(n)) β†’ (1/ΞΌ) β€’ y
      have hx_lim : Tendsto (fun n ↦ x (Ο† n)) atTop (𝓝 ((ΞΌ : π•œ)⁻¹ β€’ y)) := by
        have h := h_smul_lim.const_smul (ΞΌ : π•œ)⁻¹
        simp only [smul_smul, inv_mul_cancelβ‚€ hΞΌ_ne, one_smul] at h
        exact h
      -- Let e = (1/ΞΌ) β€’ y
      let e := (ΞΌ : π•œ)⁻¹ β€’ y
      -- e has unit norm by limit_of_unit_seq_is_unit
      have he_norm : β€–eβ€– = 1 := limit_of_unit_seq_is_unit hx_Ο†_norm hx_lim
      -- T e = ΞΌ e: pass to the limit in T x_Ο†(n) = ΞΌ x_Ο†(n) + error
      have hTe : T e = (ΞΌ : π•œ) β€’ e := by
        -- T is continuous, so T x_Ο†(n) β†’ T e
        have h_T_cont := T.continuous.tendsto e
        have hT_of_lim : Tendsto (fun n ↦ T (x (Ο† n))) atTop (𝓝 (T e)) :=
          h_T_cont.comp hx_lim
        -- But we also have T x_Ο†(n) β†’ y, so T e = y
        have hTe_eq_y : T e = y := tendsto_nhds_unique hT_of_lim hTx_Ο†_lim
        -- And ΞΌ e = ΞΌ β€’ (1/ΞΌ β€’ y) = y
        have hΞΌe_eq_y : (ΞΌ : π•œ) β€’ e = y := by
          calc (ΞΌ : π•œ) β€’ e = (ΞΌ : π•œ) β€’ ((ΞΌ : π•œ)⁻¹ β€’ y) := rfl
            _ = ((ΞΌ : π•œ) * (ΞΌ : π•œ)⁻¹) β€’ y := by rw [smul_smul]
            _ = (1 : π•œ) β€’ y := by rw [mul_inv_cancelβ‚€ hΞΌ_ne]
            _ = y := one_smul _ _
        rw [hTe_eq_y, hΞΌe_eq_y]
      exact ⟨μ, e, he_norm, hTe, hΞΌ_abs⟩
  -- Lemma C: existence of an eigenvector with eigenvalue of maximal absolute value.
  -- We need to construct a maximizing sequence where |Re βŸͺT x_n, x_n⟫| β†’ β€–Tβ€–
  have h_limit_exists : βˆƒ (ΞΌ : ℝ) (e : E), β€–eβ€– = 1 ∧ T e = (ΞΌ : π•œ) β€’ e ∧ |ΞΌ| = β€–Tβ€– := by
    -- For self-adjoint operators, β€–Tβ€– = sup_{β€–xβ€–=1} |Re βŸͺT x, x⟫|
    -- We can construct a sequence approaching this supremum
    -- The set of Rayleigh quotient values is bounded above by β€–Tβ€–
    let S : Set ℝ := Set.range (fun x : {x : E // β€–xβ€– = 1} ↦ |RCLike.re βŸͺT x, x⟫|)
    have hS_nonempty : S.Nonempty := by
      obtain ⟨v, hv⟩ := hSphere_nonempty
      have hv_norm : β€–vβ€– = 1 := by
        rw [Metric.mem_sphere, dist_zero_right] at hv
        exact hv
      exact ⟨|RCLike.re βŸͺT v, v⟫|, ⟨⟨v, hv_norm⟩, rfl⟩⟩
    have hS_bdd : BddAbove S := by
      use β€–Tβ€–
      intro r ⟨x, hx⟩
      rw [← hx]
      exact hRayleigh_bound x x.property
    -- There exists a sequence approaching sSup S
    obtain ⟨u, hu_mono, hu_tendsto, hu_mem⟩ := exists_seq_tendsto_sSup hS_nonempty hS_bdd
    -- Extract the actual unit vectors from the sequence
    have h_unit_seq : βˆƒ (x : β„• β†’ E), (βˆ€ n, β€–x nβ€– = 1) ∧
        Tendsto (fun n ↦ |RCLike.re βŸͺT (x n), x n⟫|) atTop (𝓝 (sSup S)) := by
      choose f hf using hu_mem
      use fun n ↦ (f n : E)
      constructor
      Β· intro n; exact (f n).property
      Β· convert hu_tendsto using 1
        ext n
        exact hf n
    obtain ⟨x, hx_norm, hx_lim⟩ := h_unit_seq
    -- The key fact: sSup S = β€–Tβ€– for self-adjoint operators
    -- This is the "Rayleigh quotient characterizes the norm" result
    have hsSup_eq_norm : sSup S = β€–Tβ€– := by
      -- This is exactly the content of the local spectral-norm lemma above.
      -- The definition of S matches the set used there.
      have h := hOpNorm_eq_sSup_absRayleigh hSA
      simpa [S] using h
    -- Now apply the helper lemma with L = β€–Tβ€–
    have hL_nonneg : 0 ≀ β€–Tβ€– := norm_nonneg T
    rw [hsSup_eq_norm] at hx_lim
    exact exists_eigenvector_from_maximizing_seq x hx_norm β€–Tβ€– hL_nonneg rfl hx_lim
  exact h_limit_exists

Aaron Liu (Dec 13 2025 at 02:18):

This feels like way too long for a mathlib proof

Levi G (Dec 13 2025 at 07:02):

Aaron Liu said:

This feels like way too long for a mathlib proof

Thank you for taking the time to look at my proof of the Rayleigh quotient theorem for self-adjoint compact operators, and for your concern. I understand that it is too long for a Mathlib proof, but it plays a critical role in my spectral theorem proof: it does the heavy lifting there. I use it as a lemma in a proof by induction, first to get the base case, and then in the induction step, combined with the axiom of dependent choice, to obtain a sequence of eigenvectors and eigenvalues.

Yan Yablonovskiy πŸ‡ΊπŸ‡¦ (Dec 13 2025 at 08:43):

Levi G said:

Aaron Liu said:

This feels like way too long for a mathlib proof

Thank you for taking the time to look at my proof of the Rayleigh quotient theorem for self-adjoint compact operators, and for your concern. I understand that it is too long for a Mathlib proof, but it plays a critical role in my spectral theorem proof: it does the heavy lifting there. I use it as a lemma in a proof by induction, first to get the base case, and then in the induction step, combined with the axiom of dependent choice, to obtain a sequence of eigenvectors and eigenvalues.

I think what everyone might like is for you to at least try to 'golf' the proof, split it up, and shorten it massively before sharing (which requires some manual effort and Lean comprehension). You can look at Mathlib conventions and at current code adjacent to the theory.

It is a high cognitive load to review ungolfed and unidiomatic code. If you really want someone to help you, then since it's volunteered time, you have to put in a bit more work to make it easier.

Levi G (Dec 13 2025 at 09:25):

@Yan Yablonovskiy πŸ‡ΊπŸ‡¦ Noted. Thank you for the feedback.

Kevin Buzzard (Dec 13 2025 at 13:11):

Alternatively, given that someone else is also working on this and they are writing idiomatic code because they know something about Lean, it might be best just to wait for their work (as has been pointed out already) rather than just continuing to generate unidiomatic AI code (which kind of feels like a waste of time to me, given the situation, as it will never be reviewed or merged).

Niels Voss (Dec 14 2025 at 01:36):

Sorry that my collaborators and I have not had much time to work on this recently; we are currently studying for finals and trying to complete another project by January. We are, however, willing to discuss design decisions in the meantime. The current status of the PR is that #32126 defines the singular values; however, I will not personally consider the PR complete until we have proven that, for finite-dimensional inner product spaces, the singular values defined via approximation numbers are equal to the singular values defined as the square roots of the eigenvalues of T*T. We need to prove this before completing the PR, because otherwise we risk the definition being incorrect.

We haven't had time to look into it too deeply, but we currently believe the proper progression to this proof is: the von Neumann minimax theorem (#31560) implies the min-max theorem, which implies the min-max principle for singular values, which in turn implies the theorem we want.
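To make that concrete, and modulo the indexing and ordering conventions that are exactly the design decisions still to be settled, the chain is aiming at the standard characterizations, stated informally here for a linear map $T$ between finite-dimensional inner product spaces, with $V$ ranging over subspaces of the domain:

$$a_k(T) \;=\; \inf\{\|T - S\| : \operatorname{rank} S \le k\} \;=\; \min_{\operatorname{codim} V \le k} \ \max_{x \in V,\ \|x\| = 1} \|Tx\| \;=\; \sqrt{\lambda_k(T^*T)},$$

where $a_k$ is the $k$-th approximation number and $\lambda_0 \ge \lambda_1 \ge \dotsb$ are the eigenvalues of $T^*T$ in decreasing order, counted with multiplicity.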

Also, I think Kim Morrison's comment above makes sense:

I love the plan, but I hope that if there is not an implementation plan (i.e. a person saying they are doing it in a time frame) that this would not get in the way of someone contributing an implementation for matrices, which there is demand for and has been for a while. I think it should suffice in an implementation for matrices to include a module doc link to this thread, giving a pointer to the proposed upgrade path.

So I don't want to discourage people from working on the matrix SVD. However, I have very specific ideas about how the matrix SVD should be set up to maximize generality (see for instance my comment about merging the Full, Thin, Compact, and Truncated SVDs into a single definition). In particular, even stating the SVD theorem idiomatically in Lean is unusually difficult, and previous SVD formalizations have been rejected simply because of unidiomatic definitions (e.g. #6042). Translating a proof that uses unidiomatic definitions into one that uses idiomatic definitions can actually be harder than starting from idiomatic definitions in the first place, so a proof built on unidiomatic definitions might not count as any progress at all towards getting something into mathlib.

So I would request that anyone interested in working on the matrix SVD and getting it into mathlib discuss the theorem statement they intend to formalize along with all new definitions before attempting to work on the proof.

Levi G (Dec 14 2025 at 15:34):

Jireh Loreaux said:

Levi G, the fact that it's a big file is part of the problem. At this point, you don't have experience contributing to Mathlib, so a contribution of this magnitude is very hard to review. If you start with smaller projects and learn good design patterns from receiving review, then it gets easier because your code is higher quality. Let me highlight one example as evidence:

theorem IsCompactOperator.eigenvalue_tendsto_zero
    (hT : IsCompactOperator T) (_hSA : IsSelfAdjoint T)
    {ΞΌ : β„• β†’ ℝ} {e : β„• β†’ E}
    (he_orth : Orthonormal π•œ e)
    (he_eigen : βˆ€ n, T (e n) = (ΞΌ n : π•œ) β€’ (e n))
    (hΞΌ_antitone : Antitone (fun n ↦ |ΞΌ n|)) :
    Tendsto (fun n ↦ |ΞΌ n|) atTop (𝓝 0) := by

This theorem has nothing to do with self-adjoint operators, or real eigenvalues, or Antitone in any fundamental way. That is, what's really going on here is that since e is orthonormal, it converges weakly to zero, and compact operators take weakly convergent sequences to norm convergent ones.

Again, I would love for you to become a productive member of the community, but please start with smaller projects. It may even be the case that there are pieces of your code worth salvaging, but I don't have the energy or the time to try to sift through 2000 lines of code looking for a diamond in the rough.

Does this address the issue in that example? It's part of my new code that I'll post later.

import Mathlib.Analysis.InnerProductSpace.Basic
import Mathlib.Analysis.Normed.Operator.Compact
import Mathlib.Topology.Algebra.InfiniteSum.Basic
import Mathlib.Analysis.InnerProductSpace.Adjoint

/-!
# Weak Convergence and Compact Operators

This file proves that eigenvalues of compact operators corresponding to orthonormal
sequences of eigenvectors tend to zero.

## Main results

* `Orthonormal.tendsto_inner_zero`: Orthonormal sequences converge weakly to zero.
* `IsCompactOperator.tendsto_of_tendsto_weakly_zero`: Compact operators map weakly
  convergent sequences to norm convergent sequences.
* `IsCompactOperator.eigenvalue_tendsto_zero`: The sequence of eigenvalues of a compact
  operator on a Hilbert space tends to zero (for orthonormal eigenvectors).
-/

open Filter Topology

variable {π•œ : Type*} [RCLike π•œ]
variable {E : Type*} [NormedAddCommGroup E] [InnerProductSpace π•œ E]

local notation "βŸͺ" x ", " y "⟫" => @inner π•œ E _ x y

/-- **Orthonormal Sequences Converge Weakly to Zero** (Bessel's Inequality Consequence)

For any orthonormal sequence `(eβ‚™)` and any vector `x`, we have `βŸͺx, eβ‚™βŸ« β†’ 0` as `n β†’ ∞`.
This follows from Bessel's inequality: `βˆ‘ |βŸͺx, eβ‚™βŸ«|Β² ≀ β€–xβ€–Β²`, which implies the terms
of the sum must tend to zero.
-/
theorem Orthonormal.tendsto_inner_zero {e : β„• β†’ E} (he : Orthonormal π•œ e) (x : E) :
    Tendsto (fun n ↦ βŸͺx, e n⟫) atTop (𝓝 0) := by
  have h_summable : Summable (fun n ↦ β€–βŸͺe n, xβŸ«β€–^2) := he.inner_products_summable x
  have h_sq_tendsto := h_summable.tendsto_atTop_zero
  have h_norm_tendsto : Tendsto (fun n ↦ β€–βŸͺe n, xβŸ«β€–) atTop (𝓝 0) := by
    have h := h_sq_tendsto.sqrt
    simp only [Real.sqrt_sq (norm_nonneg _), Real.sqrt_zero] at h
    exact h
  have h_norm_swap : (fun n ↦ β€–βŸͺx, e nβŸ«β€–) = (fun n ↦ β€–βŸͺe n, xβŸ«β€–) := by
    funext n
    simp [norm_inner_symm]
  have h_norm_tendsto' : Tendsto (fun n ↦ β€–βŸͺx, e nβŸ«β€–) atTop (𝓝 0) :=
    h_norm_swap β–Έ h_norm_tendsto
  exact tendsto_zero_iff_norm_tendsto_zero.mpr h_norm_tendsto'

variable [CompleteSpace E]

/-- **Compact Operators are Weak-to-Norm Continuous on Bounded Sets**

If `T` is a compact operator and `(xβ‚™)` is a bounded sequence converging weakly to `0`,
then `T(xβ‚™) β†’ 0` in norm.
-/
theorem IsCompactOperator.tendsto_of_tendsto_weakly_zero
    {T : E β†’L[π•œ] E} (hT : IsCompactOperator T) {x : β„• β†’ E}
    (hx_bdd : βˆƒ M : ℝ, βˆ€ n, β€–x nβ€– ≀ M)
    (hx_weak : βˆ€ y : E, Tendsto (fun n ↦ βŸͺy, x n⟫) atTop (𝓝 0)) :
    Tendsto (fun n ↦ T (x n)) atTop (𝓝 0) := by
  obtain ⟨M, hM⟩ := hx_bdd
  rcases hT.image_closedBall_subset_compact (f := (T : E β†’β‚›β‚—[RingHom.id π•œ] E)) (M + 1) with
    ⟨K, hK_compact, hTK⟩
  have h_in_K : βˆ€ n, T (x n) ∈ K := fun n ↦ by
    apply hTK
    refine ⟨x n, ?_, rfl⟩
    simp only [Metric.mem_closedBall, dist_zero_right]
    calc β€–x nβ€– ≀ M := hM n
      _ ≀ M + 1 := by linarith
  rw [Metric.tendsto_atTop]
  intro Ξ΅ hΞ΅
  classical
  by_contra h_not_eventually
  have h_not_ev_lt : Β¬ βˆ€αΆ  n in atTop, β€–T (x n)β€– < Ξ΅ := by
    intro h_ev
    rcases (eventually_atTop.1 h_ev) with ⟨N, hN⟩
    exact h_not_eventually ⟨N, by intro n hn; simpa [dist_zero_right] using hN n hn⟩
  have h_freq : βˆƒαΆ  n in atTop, Ξ΅ ≀ β€–T (x n)β€– :=
    (not_eventually.1 h_not_ev_lt).mono (by intro n hn; exact le_of_not_gt hn)
  obtain βŸ¨Ο†, hΟ†_mono, hΟ†_prop⟩ := extraction_of_frequently_atTop h_freq
  have h_in_K' : βˆ€ n, T (x (Ο† n)) ∈ K := fun n ↦ h_in_K (Ο† n)
  obtain ⟨y, _, ψ, hψ_mono, hTx_lim⟩ := hK_compact.tendsto_subseq h_in_K'
  have hy_zero : y = 0 := by
    have h_weak_y : βˆ€ z : E, βŸͺz, y⟫ = 0 := fun z ↦ by
      have h1 : Tendsto (fun n ↦ βŸͺz, T (x (Ο† (ψ n)))⟫) atTop (𝓝 βŸͺz, y⟫) :=
        (Filter.Tendsto.inner tendsto_const_nhds hTx_lim)
      have h2 : Tendsto (fun n ↦ βŸͺz, T (x (Ο† (ψ n)))⟫) atTop (𝓝 0) := by
        have h_adj : Tendsto (fun n ↦ βŸͺ(ContinuousLinearMap.adjoint T) z, x (Ο† (ψ n))⟫) atTop (𝓝 0) :=
          (hx_weak ((ContinuousLinearMap.adjoint T) z)).comp ((hΟ†_mono.comp hψ_mono).tendsto_atTop)
        simpa [ContinuousLinearMap.adjoint_inner_left] using h_adj
      exact tendsto_nhds_unique h1 h2
    exact inner_self_eq_zero.mp (h_weak_y y)
  have h_norm_lim : Tendsto (fun n ↦ β€–T (x (Ο† (ψ n)))β€–) atTop (𝓝 0) := by
    have hTx_zero : Tendsto (fun n ↦ T (x (Ο† (ψ n)))) atTop (𝓝 0) := by simpa [hy_zero] using hTx_lim
    simpa [Function.comp, norm_zero] using (continuous_norm.tendsto (0 : E)).comp hTx_zero
  have h_ge_Ξ΅ : βˆ€ n, Ξ΅ ≀ β€–T (x (Ο† (ψ n)))β€– := fun n ↦ hΟ†_prop (ψ n)
  have h_eventually_small : βˆ€αΆ  n in atTop, β€–T (x (Ο† (ψ n)))β€– < Ξ΅ := by
    rw [Metric.tendsto_atTop] at h_norm_lim
    obtain ⟨N, hN⟩ := h_norm_lim Ρ hΡ
    filter_upwards [Filter.Ici_mem_atTop N] with n hn
    simpa [dist_zero_right] using hN n hn
  obtain ⟨n, hn⟩ := h_eventually_small.exists
  exact (lt_of_lt_of_le hn (h_ge_Ξ΅ n)).false

/-- Eigenvalues of a compact operator decay to zero along an orthonormal eigenbasis. -/
theorem IsCompactOperator.eigenvalue_tendsto_zero
    {T : E β†’L[π•œ] E} (hT : IsCompactOperator T)
    {ΞΌ : β„• β†’ π•œ} {e : β„• β†’ E}
    (he_orth : Orthonormal π•œ e)
    (he_eigen : βˆ€ n, T (e n) = ΞΌ n β€’ e n) :
    Tendsto (fun n ↦ β€–ΞΌ nβ€–) atTop (𝓝 0) := by
  have h_weak : βˆ€ y : E, Tendsto (fun n ↦ βŸͺy, e n⟫) atTop (𝓝 0) :=
    fun y ↦ he_orth.tendsto_inner_zero y
  have h_bdd : βˆƒ M : ℝ, βˆ€ n, β€–e nβ€– ≀ M := ⟨1, fun n ↦ le_of_eq (he_orth.norm_eq_one n)⟩
  have h_Te_zero : Tendsto (fun n ↦ T (e n)) atTop (𝓝 0) :=
    hT.tendsto_of_tendsto_weakly_zero h_bdd h_weak
  have h_norm_eq : βˆ€ n, β€–ΞΌ nβ€– = β€–T (e n)β€– := fun n ↦ by
    rw [he_eigen n, norm_smul, he_orth.norm_eq_one, mul_one]
  simpa [h_norm_eq] using (continuous_norm.tendsto (0 : E)).comp h_Te_zero

Michael Rothgang (Dec 14 2025 at 16:29):

@Levi G In the interest of using my time well, I will not be reviewing your code on this matter, for all the reasons already mentioned up-thread. (Other maintainers may or may not do the same.) Make of that what you will, but I think it's fair for you to know.


Last updated: Dec 20 2025 at 21:32 UTC