Singular Value Decomposition

Made by Pranay Venkatesh, April 2025

using LinearAlgebra

The SVD and its connection to the eigendecompositions of \(A A^{\dagger}\) and \(A^{\dagger} A\) are easy to see with a bit of Julia code: the columns of U are eigenvectors of \(A A^{\dagger}\), the columns of V are eigenvectors of \(A^{\dagger} A\), and the non-zero eigenvalues of both are the squared singular values.

A = [1. 0. 0. 0. 2.; 0. 0. 3. 0. 0.; 0. 3. 6. 0. 0.; 0. 2. 0. 0. 0.]
F = svd(A)
SVD{Float64, Float64, Matrix{Float64}, Vector{Float64}}
U factor:
4×4 Matrix{Float64}:
 0.0       1.0   0.0        0.0
 0.375623  0.0  -0.554241   0.742781
 0.919925  0.0   0.125726  -0.371391
 0.112453  0.0   0.822806   0.557086
singular values:
4-element Vector{Float64}:
 7.285821103869117
 2.23606797749979
 2.217388293108676
 0.0
Vt factor:
4×5 Matrix{Float64}:
 -0.0       0.409656   0.91224   -0.0  0.0
  0.447214  0.0        0.0        0.0  0.894427
 -0.0       0.91224   -0.409656  -0.0  0.0
  0.0       0.0        0.0        1.0  0.0
Aleft = A * transpose(A)
Aright = transpose(A) * A
5×5 Matrix{Float64}:
 1.0   0.0   0.0  0.0  2.0
 0.0  13.0  18.0  0.0  0.0
 0.0  18.0  45.0  0.0  0.0
 0.0   0.0   0.0  0.0  0.0
 2.0   0.0   0.0  0.0  4.0
eigen(Aleft)
Eigen{Float64, Float64, Matrix{Float64}, Vector{Float64}}
values:
4-element Vector{Float64}:
 -3.2656641468096717e-305
  4.916810842415401
  5.0
 53.0831891575846
vectors:
4×4 Matrix{Float64}:
  0.0        0.0       1.0  0.0
  0.742781   0.554241  0.0  0.375623
 -0.371391  -0.125726  0.0  0.919925
  0.557086  -0.822806  0.0  0.112453
eigen(Aright)
Eigen{Float64, Float64, Matrix{Float64}, Vector{Float64}}
values:
5-element Vector{Float64}:
  0.0
  1.7763568394002505e-15
  4.916810842415423
  5.0
 53.08318915758459
vectors:
5×5 Matrix{Float64}:
  0.0  -0.894427   0.0       0.447214  0.0
  0.0   0.0        0.91224   0.0       0.409656
  0.0   0.0       -0.409656  0.0       0.91224
 -1.0   0.0        0.0       0.0       0.0
  0.0   0.447214   0.0       0.894427  0.0
(F.S).^2
4-element Vector{Float64}:
 53.0831891575846
  5.000000000000001
  4.9168108424154084
  0.0

Why SVDs?

  • There are several ways to factorise a matrix depending on what you want to do with it, but the SVD is particularly handy because the rank-k truncated SVD is the best rank-k approximation to the original matrix A. This is the Eckart-Young theorem.
  • If \(A = U \Sigma V^{\dagger} = \sigma_1 u_1 v_1^{\dagger} + \sigma_2 u_2 v_2^{\dagger} + \dots + \sigma_r u_r v_r^{\dagger}\) is a rank r matrix, then the rank k truncation is given by \(A_k = U_k \Sigma_k V_k^{\dagger} = \sigma_1 u_1 v_1^{\dagger} + \sigma_2 u_2 v_2^{\dagger} + \dots + \sigma_k u_k v_k^{\dagger}\), where of course k < r.
  • NOTE: when we make this truncation, the singular values are sorted in descending order, so we only drop the terms with the smallest singular values, i.e. the least important ones.
  • More formally, using the Frobenius norm, defined by \(\|A\|_F = \sqrt{\sum\limits_{i, j} |A_{i,j}|^2}\), the theorem says that if \(A_k\) is the rank-k truncated SVD of A, then any matrix B of rank k satisfies \(\|A - A_k\|_F \leq \|A - B\|_F\); that is, \(A_k\) is at least as close to \(A\) as any rank-k matrix B.
  • This is nice to see with a bit of Julia code:
# SVD truncation 

function tsvd(A::Matrix, k::Int)
    F = svd(A)
    U, S, V = F.U, F.S, F.V
    U_k = U[:, 1:k]                    # first k left singular vectors
    S_k = S[1:k]                       # k largest singular values (svd returns them in descending order)
    Vt_k = V[:, 1:k]'                  # first k right singular vectors, transposed
    A_k = U_k * Diagonal(S_k) * Vt_k   # rank-k approximation of A

    return U_k, S_k, Vt_k, A_k
end

u, s, v, a = tsvd(A, 2) # Rank 2 truncation

norm(A - a, 2)   # for a matrix argument, norm computes the entrywise 2-norm, i.e. the Frobenius norm
2.2173882931086766
U = randn(4, 2)
V = randn(2, 5)

B = U * V
print(rank(B))
norm(A - B, 2)
2
9.538712887748947

You can keep re-running the last block of code, and you will find that norm(A - B) never goes below norm(A - a), showing that a is as close to A as you can get with a rank-2 matrix. Note that B is constructed here as a random rank-2 matrix, but the code doesn't rule out the voodoo possibility of randomly generating linearly dependent rows or columns in the construction.

  • SVD is particularly handy for the construction and propagation of matrix product states (MPS), which is what led me to come back and revisit some of the beautiful matrix math that underlies these methods (a minimal sketch of the splitting idea follows after this list).

  • SVD is also a great way to compute the null space of a matrix. The columns of V corresponding to zero singular values span the null space of A. For example, for a Markov model \(\frac{dp}{dt} = A p\), you can find the final steady-state distribution by computing the null space of A using an SVD (see the sketch below).
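
Here is a minimal sketch of that null-space trick. The 3-state rate matrix W and the variable names below are made up purely for illustration (its columns sum to zero, as a rate matrix's must); the column of V with a numerically zero singular value, normalised to sum to one, is the steady state.

# Null space via SVD: W is a made-up 3-state rate matrix (columns sum to zero),
# so dp/dt = W p has a steady state p_ss with W * p_ss = 0.
W = [-0.5   0.2   0.1;
      0.3  -0.4   0.2;
      0.2   0.2  -0.3]
Fw = svd(W)
nullvecs = Fw.V[:, Fw.S .< 1e-12]              # columns of V with (numerically) zero singular values
p_ss = nullvecs[:, 1] ./ sum(nullvecs[:, 1])   # normalise into a probability distribution
W * p_ss                                       # ≈ [0, 0, 0], so p_ss is the steady state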

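And a minimal sketch of the MPS point above (the local dimension d, the bond dimension chi, and the state psi are all made up for illustration): splitting a two-site state into two tensors with a truncated SVD is exactly the rank-k truncation from the Eckart-Young discussion.

# MPS-style splitting: a random two-site state, reshaped as a d×d matrix, is cut
# into a left and a right tensor by an SVD truncated to bond dimension chi.
d, chi = 4, 2
psi = randn(d, d); psi /= norm(psi)                # random normalised two-site state
Fpsi = svd(psi)
A1 = Fpsi.U[:, 1:chi]                              # left tensor  (d × chi)
A2 = Diagonal(Fpsi.S[1:chi]) * Fpsi.Vt[1:chi, :]   # right tensor (chi × d), singular values absorbed
norm(psi - A1 * A2)                                # truncation error
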
QR Decomposition

  • The QR decomposition is another matrix factorisation that I learned from Gilbert Strang rather than my (highly inadequate) undergraduate linear algebra class.
  • Any square matrix A (in fact, any matrix) can be decomposed as Q times R, where Q is an orthogonal matrix and R is upper triangular. So why would we want to do this? Because it gives a useful way to compute the eigenvalues of a Hermitian matrix.
  • Start by noting: if you write \(A_0 = Q_0 R_0\) and then form a new matrix \(A_1 = R_0 Q_0\) (order swapped), then \(A_1\) and \(A_0\) have the same eigenvalues, since \(A_1 = R_0 A_0 R_0^{-1}\) (equivalently \(A_1 = Q_0^{\dagger} A_0 Q_0\)), i.e. they are related by a similarity transformation.
  • If you keep repeating this procedure of QR decomposition followed by swapping the factors, it turns out that the off-diagonal entries of \(A_n\) shrink very fast, so the diagonal entries converge to the eigenvalues pretty quickly (a small sketch of this iteration follows the qr output below).
  • QR is faster on GPUs than the SVD, and hence you can also use a QR decomposition when working with MPS (there's a Frank Pollmann paper on the arXiv that does this).
qr(A)
LinearAlgebra.QRCompactWY{Float64, Matrix{Float64}, Matrix{Float64}}
Q factor: 4×4 LinearAlgebra.QRCompactWYQ{Float64, Matrix{Float64}, Matrix{Float64}}
R factor:
4×5 Matrix{Float64}:
 1.0   0.0       0.0      0.0  2.0
 0.0  -3.60555  -4.9923   0.0  0.0
 0.0   0.0       4.48073  0.0  0.0
 0.0   0.0       0.0      0.0  0.0
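
Here is a minimal sketch of the iteration described above. The small symmetric matrix S and the helper function qr_eigvals are made up for illustration, and the fixed iteration count stands in for a proper convergence check.

# Unshifted QR iteration: factor, swap the factors, repeat.
# For a symmetric matrix, A_n approaches a diagonal matrix whose entries are the eigenvalues.
function qr_eigvals(A::AbstractMatrix; iters::Int = 100)
    An = Matrix(A)
    for _ in 1:iters
        F = qr(An)
        An = F.R * Matrix(F.Q)   # A_{n+1} = R_n Q_n is similar to A_n
    end
    return diag(An)
end

S = [4. 1. 0.; 1. 3. 1.; 0. 1. 2.]   # made-up symmetric test matrix
qr_eigvals(S)
eigvals(S)   # compare with the built-in eigensolver (possibly in a different order)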

Relevant Reading:

  1. Strang 18.06 and 18.065 (the more computational one: this is the course that introduced me to Julia)
  2. Numerical Recipes in C: a goldmine for computational things
  3. QR in MPS