Zulip Chat Archive

Stream: PR reviews

Topic: PR #6042 Singular Value Decomposition

MohanadAhmed (Jul 21 2023 at 21:19):

PR #6042 contains definitions and proofs for Singular Value Decompositon. PR #6042

MohanadAhmed (Sep 28 2023 at 12:19):

Hello Mathlib Reviewer Team,

I received this comment

We've been discussing this PR with the mathlib reviewer team, and I'm afraid that we concluded that this result still needs quite some work. Namely, we find it important to get the decomposition also for homomorphisms (specifically, elements of Module.End), and it seems better to first prove SVD for homomorphisms, and then derive the matrix result from that. Basically, the same approach as we already have for eigenvalues in mathlib.

I do not understand what is required? Or whether this is a rejection of the PR ? Any thoughts are welcome

Jireh Loreaux (Sep 28 2023 at 15:05):

It is saying that the PR is not being accepted as is. This is not because it is poorly written or anything of that nature, but rather that we feel it doesn't match the design choices of mathlib. In this case, that means the result should first be proven in the basis-free setting of linear maps (note: Anne was slightly incorrect here, it's not specialized to Module.End), and then specialize to matrices. The same approach is taken for eigenvalues (where it really is Module.End).

So, what is required, as a first step, is that you prove SVD for linear maps (not using the matrix version you've already established). One statement of this would be: Given a T : V →ₗ[𝕜] W linear map between finite dimensional IsRorC 𝕜-inner product spaces, there exists an antitone map σ : Fin (rank T) → ℝ≥0 and docs#Orthonormal collections v : Fin (rank T) → V, w : Fin (rank T) → W such that for all i : Fin (rank T), T (v i) = (σ i : 𝕜) • (w i), and for any v' : V, if for all i : FIn (rank T), ⟪v i, v'⟫ = 0, then T v' = 0. Alternatively, this second condition could be phrased in a slightly different but provably equivalent way, like: (Submodule.span 𝕜 (Set.range v)).orthogonal ≤ T.ker

Jireh Loreaux (Sep 28 2023 at 15:06):

I don't claim that the suggested version above is necessarily the best basis-free version, but its the kind of thing we're looking for.

Jireh Loreaux (Sep 28 2023 at 15:22):

Here is a formalized statement (again, not necessarily the best; probably there should be discussion in this thread about what exactly is the best statement to formalize):

import Mathlib.LinearAlgebra.Matrix.PosDef

variable {𝕜 V W : Type*} [IsROrC 𝕜]
variable [NormedAddCommGroup V] [InnerProductSpace 𝕜 V] [FiniteDimensional 𝕜 V]
variable [NormedAddCommGroup W] [InnerProductSpace 𝕜 W] [FiniteDimensional 𝕜 W]

open LinearMap Submodule NNReal

theorem LinearMap.singular_value_decomposition (T : V →ₗ[𝕜] W) :
    ∃ r : ℕ, r = T.rank ∧ ∃ σ : Fin r → ℝ≥0, Antitone σ ∧ ∃ v : Fin r → V, ∃ w : Fin r → W,
      Orthonormal 𝕜 v ∧ Orthonormal 𝕜 w ∧ (∀ i, T (v i) = ((σ i : ℝ) : 𝕜) • (w i)) ∧
      (span 𝕜 (Set.range v)).orthogonal ≤ ker T := sorry

Jireh Loreaux (Sep 28 2023 at 15:24):

@Heather Macbeth , @Eric Wieser, @Anne Baanen :up: What do you think is the correct version of SVD to state for linear maps? I took a stab at it above, but we should probably come up with the precise version of what we want for Monahad.

Alex J. Best (Sep 28 2023 at 16:14):

Jireh Loreaux said:

So, what is required, as a first step, is that you prove SVD for linear maps (not using the matrix version you've already established).

what's wrong with using the matrix version to prove the ~~basis-free~~ linear map one? Are there particular intermediate results you think are also useful in a basis free setting?

Eric Wieser (Sep 28 2023 at 16:23):

My reading of MohanadAhmed's matrix version is that they would prefer there to be a structural distinction between the zero and non-zero entries of σ, and the corresponding entries of your v/ w (which I think would be less confusing as u/v)

Eric Wieser (Sep 28 2023 at 16:24):

Ah, looking again; it's not that your version doesn't make this distinction, it's that it doesn't extend the orthonormal basis to the whole space

Jireh Loreaux (Sep 28 2023 at 16:26):

Right, you can always extend, but this also gives you the reduced SVD for free (I can't remember what that's called off the top of my head).

Eric Wieser (Sep 28 2023 at 16:33):

Wikipedia suggests "compact SVD"

Eric Wieser (Sep 28 2023 at 16:37):

Certainly deciding between the compact or "full" SVD is an important decision in either the matrix or linear map case, and I'm not sure there was any discussion about it.

Heather Macbeth (Sep 28 2023 at 18:42):

How about: a linear map $V\to W$ can be composed as an isometry $\mathbb{R}^\iota\to W$ , the adjoint of an isometry $\mathbb{R}^\iota\to V$ , and a strictly positive diagonal map on $\mathbb{R}^\iota$ .

Note: like Jireh I stated it for IsROrC 𝕜, but I don't know whether it's true for ℂ.

import Mathlib.Analysis.InnerProductSpace.Spectrum

variable {𝕜 V W : Type*} [IsROrC 𝕜]
variable [NormedAddCommGroup V] [InnerProductSpace 𝕜 V] [FiniteDimensional 𝕜 V]
variable [NormedAddCommGroup W] [InnerProductSpace 𝕜 W] [FiniteDimensional 𝕜 W]

open LinearMap Matrix NNReal

-- once this exists, some of `OrthonormalBasis` should be re-routed through it
def Orthonormal.toIsometry {ι : Type*} [Fintype ι] (v : ι → V) (hv : Orthonormal 𝕜 v) :
    EuclideanSpace 𝕜 ι →ₗ[𝕜] V :=
  sorry

theorem LinearMap.singular_value_decomposition (T : V →ₗ[𝕜] W) :
    ∃ (ι : Type*) (_ : Fintype ι) (_ : DecidableEq ι) (σ : ι → ℝ) (hσ : ∀ i, σ i > 0)
    (v : ι → V) (w : ι → W) (hv : Orthonormal 𝕜 v) (hw : Orthonormal 𝕜 w),
    T = hw.toIsometry ∘ₗ toLin' (diagonal ((↑) ∘ σ)) ∘ₗ adjoint hv.toIsometry :=
  sorry

Sebastien Gouezel (Sep 28 2023 at 18:45):

Singular values should definitely be ordered from the largest one to the smallest one.

Heather Macbeth (Sep 28 2023 at 18:45):

Indeed, in @Jireh Loreaux's version there also appears the fact that ι can be taken to be Fin k and σ then to be antitone, but this is secondary, in my opinion, and could appear as a separate lemma proved by docs#Tuple.sort.

Heather Macbeth (Sep 28 2023 at 18:48):

So that version would look like

theorem LinearMap.singular_value_decomposition (T : V →ₗ[𝕜] W) :
    ∃ (r : ℕ) (σ : Fin r → ℝ) (hσ : ∀ i, σ i > 0) (v : Fin r → V)
    (w : Fin r → W) (hv : Orthonormal 𝕜 v) (hw : Orthonormal 𝕜 w), Antitone σ ∧
    T = hw.toIsometry ∘ₗ toLin' (diagonal ((↑) ∘ σ)) ∘ₗ adjoint hv.toIsometry :=
  sorry

Jireh Loreaux (Sep 28 2023 at 18:49):

To be honest, in my version there is something that annoys me: we should be saying exactly what σ is (i.e., square roots of eigenvalues of T.adjoint * T listed in nonincreasing order), but I don't think we can do this yet because we don't have enough material about positive operators (my fault!).

Jireh Loreaux (Sep 28 2023 at 18:50):

"can't" is maybe too strong a word, I just mean, "probably not in a nice way"

Heather Macbeth (Sep 28 2023 at 18:51):

Hmm -- don't you think this would follow from the spectral theorem we have in mathlib?

Jireh Loreaux (Sep 28 2023 at 18:52):

which one?

Heather Macbeth (Sep 28 2023 at 18:53):

docs#LinearMap.IsSymmetric.hasEigenvector_eigenvectorBasis or one of the other ones in that file?

Jireh Loreaux (Sep 28 2023 at 18:56):

Aha, that probably wouldn't be too bad. I guess I want a constructor which takes a positive operator and lists the eigenvalues as terms of ℝ≥0 in decreasing order. Then we would just use that in the above.

Heather Macbeth (Sep 28 2023 at 18:56):

I wrote (or rather, got Kyle to write) docs#Tuple.sort precisely for that last part (the ordering), but never glued it into the spectral theorem.

Jireh Loreaux (Sep 28 2023 at 18:58):

By the way, I think we probably want decreasing order, instead of increasing order as suggested by the docstring.

Jireh Loreaux (Sep 28 2023 at 18:58):

I have to go teach now.

Jireh Loreaux (Sep 28 2023 at 19:00):

Oh but before I forget: yes, I prefer your Orthonormal.toIsometry version.

Antoine Chambert-Loir (Sep 29 2023 at 09:01):

Also: there is a SVD theorem for linear maps between two distinct vector spaces.

Eric Wieser (Sep 29 2023 at 09:12):

@Antoine Chambert-Loir, I'm confused by that comment, as both Jireh Loreaux and Heather Macbeth's versions above use T : V →ₗ[𝕜] W already.

Eric Wieser (Sep 29 2023 at 09:12):

Perhaps this is just in response to the incorrect Module.End claim made in the top post

Antoine Chambert-Loir (Sep 29 2023 at 11:23):

Yes, and I misunderstood another explicit statement. So it's my confusion, sorry…

MohanadAhmed (Sep 30 2023 at 13:43):

Hello everyone,
Thanks a lot for the lively discussion.

MohanadAhmed (Sep 30 2023 at 13:54):

Sorting of Eigenvalues

Heather Macbeth said:

I wrote (or rather, got Kyle to write) docs#Tuple.sort precisely for that last part (the ordering), but never glued it into the spectral theorem.

@Heather Macbeth
I started on the sorting of eigenvalues in PR #7427. You will find I have added next to every lemma a lemma starting with the letter x that uses the ordered eigenvalues. The PR now builds (minus the linter asking for a docstring), but I think there is a question here that needs to be answered:

Sorting requires a LinearOrder on the domain. Some of the linear map lemmas like docs#PiLp.basis_toMatrix_basisFun_mul and docs#Basis.basis_toMatrix_basisFun_mul are written with arbitrary unordered finite types. For example:

nonrec theorem basis_toMatrix_basisFun_mul (b : Basis ι 𝕜 (PiLp p fun _ : ι => 𝕜))
    (A : Matrix ι ι 𝕜) :
    b.toMatrix (PiLp.basisFun _ _ _) * A =
      Matrix.of fun i j => b.repr ((WithLp.equiv _ _).symm (Aᵀ j)) i := by

which I modified to:

nonrec theorem xbasis_toMatrix_basisFun_mul
    (b : Basis (Fin (Fintype.card ι)) 𝕜 (PiLp p fun _ : ι => 𝕜)) (A : Matrix ι ι 𝕜) :
    b.toMatrix (PiLp.basisFun _ _ _) * A =
      Matrix.of fun i j => b.repr ((WithLp.equiv _ _).symm (Aᵀ j)) i := by

What is the desired solution?

MohanadAhmed (Sep 30 2023 at 13:57):

Order of Singular Values and Eigenvalues (Antitone vs. Monotone)

I prefer the order to be decreasing. I implemented the note as is i.e. increasing order. My preference is purely because that is how I learned them.

(And slightly because it makes it easier to define a function from Fin R to \Reals for non-zero singular values).

Are there other reasons for the decreasing order: @Sebastien Gouezel , @Jireh Loreaux

MohanadAhmed (Sep 30 2023 at 14:15):

Compact vs Full SVD

Eric Wieser said:

Wikipedia suggests "compact SVD"

The partitioning between zero and nonzero eigenvalues and corresponding partition in the left/right singular vector matrices. Makes this distinction a minor issue. The compact version is easily derivable from the full version as follows:

theorem full_svd_theroem (A : Matrix (Fin M) (Fin N) 𝕂) :
    A.svdU * A.svdξ * A.svdVᴴ = A := by sorry -- about 5 lines here

lemma svd_compact (A : Matrix (Fin M) (Fin N) 𝕂) :
    A = A.svdU₁ * (map A.svdσ (algebraMap ℝ 𝕂)) * A.svdV₁ᴴ := by
  nth_rw 1 [← full_svd_theroem A]
  rw [ svdU, svdξ, svdV, fromColumns_mul_fromBlocks,
    conjTranspose_fromColumns_eq_fromRows_conjTranspose, fromColumns_mul_fromRows]
  simp

Sebastien Gouezel (Sep 30 2023 at 14:20):

I agree that the order should be decreasing, because the most interesting singular value is the largest one.

MohanadAhmed (Sep 30 2023 at 14:28):

One can argue the largest singular value will still be most interesting, even in the last position i.e. at (n-1) position instead of 0 -position :grinning_face_with_smiling_eyes:

By the way I recall the names svdU, svdU₁, svdσ, svdV, ... etc were not descriptive enough. Any particular suggestion on what names to use?

MohanadAhmed (Sep 30 2023 at 14:30):

Alex J. Best said:

Jireh Loreaux said:

So, what is required, as a first step, is that you prove SVD for linear maps (not using the matrix version you've already established).

what's wrong with using the matrix version to prove the ~~basis-free~~ linear map one? Are there particular intermediate results you think are also useful in a basis free setting?

@Jireh Loreaux
I also have the same question as Alex here.

Jireh Loreaux (Oct 01 2023 at 21:26):

The other reason to prefer decreasing: in the infinite dimensional setting, for compact operators, it is the only available ordering (at least if the index set is \N). This is my primary motivation to prefer decreasing, but Sebastien's is also sensible.

Jireh Loreaux (Oct 01 2023 at 21:43):

It's nearly always preferable to go from the general theory of linear maps and then coordinatize only when necessary. The matrix results should fall out almost for free after the general version, but I expect the other direction isn't quite as easy. And note that we definitely want the linear map version.

I expect the linear map version to be easier to prove and shorter (no guarantees of course, and if I'm wrong then this would be an argument against this suggestion). Finally, it simply fits the design of the library; we prove things first for general linear maps, then specialize to matrices.

As for intermediate results, I'm not sure I have any off the top of my head, but I might be able to think of some.

Anne Baanen (Dec 12 2023 at 09:22):

Can I ask (cc @Yaël Dillies) what the status is of this PR now? Should I mark it as work in progress, to be replaced with the linear map based SVD?

Jireh Loreaux (Feb 13 2024 at 19:00):

@MohanadAhmed in case you didn't see the message on the PR, let's discuss a plan for this PR in this thread, if you want to continue it.

Last updated: May 02 2025 at 03:31 UTC