Math Genius: Do Jordan chains single out special directions of an eigenspace?


Let $T$ be a linear operator on $\mathbb{C}^n$ where $n<\infty$, and let $U$ be the eigenspace associated with the eigenvalue $\lambda$. For simplicity, assume $\lambda$ is the only eigenvalue of $T$.

Let $(v, (T-\lambda I)v, \dots, (T-\lambda I)^{k-1}v)$ be a Jordan chain of length at least two. Let $w=(T-\lambda I)^{k-1}v$ and $W=\operatorname{span}(w)$. Then does every Jordan basis for $T$ contain a vector in $W$? This is the sense in which I mean that the Jordan chain “singles out” the subspace $W\subseteq U$, as mentioned in the question title.
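For concreteness, here is one small instance of this setup (my own illustration, not part of the original question). Take $n = 3$, $\lambda = 0$, and

$$T = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \qquad U = \ker T = \operatorname{span}(e_1, e_3).$$

Choosing $v = e_2$ gives the Jordan chain $(v, Tv) = (e_2, e_1)$ of length $k = 2$, so $w = e_1$ and $W = \operatorname{span}(e_1) \subsetneq U$. The question is whether every Jordan basis for this $T$ must contain a vector lying on the line $W$, even though the eigenspace $U$ is two-dimensional.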


Math Genius: Proof that transposed matrix can be used for ZCA


I am using Zero-phase Component Analysis (ZCA) for the processing of images. The images are provided as a matrix $X$ with $m=$ number of rows = number of images and $n=$ number of features (pixels) = number of columns. The ZCA formula is
$X_{ZCA} = U \cdot \operatorname{diag}\left(\frac{1}{\sqrt{\operatorname{diag}(S)+\epsilon}}\right) \cdot U^T \cdot X$

where $U$ is the eigenvector matrix and $S$ the eigenvalue matrix from the singular value decomposition of $X^TX$:

$U, S, V = \operatorname{svd}(X^TX)$

A minimal ZCA algorithm written in Python looks like this:

from keras.datasets import cifar10
import numpy as np
from scipy import linalg

(X_train, y_train), (X_test, y_test) = cifar10.load_data()
X = X_train[:1000]
flat_x = np.reshape(X, (X.shape[0], X.shape[1] * X.shape[2] * X.shape[3]))

# normalize x:
flat_x = flat_x / 255.
flat_x = flat_x - flat_x.mean(axis=0)
#calculating the covariance matrix sigma
sigma = np.dot(flat_x.T, flat_x) / flat_x.shape[0]

u, s, _ = linalg.svd(sigma)
s_inv = 1. / np.sqrt(s[np.newaxis] + 0.1) # the 0.1 is the epsilon value for zca
principal_components = (u * s_inv).dot(u.T)

whitex = np.dot(flat_x, principal_components)

The code works fine, but if $n>m$ for $X$, then the SVD computation takes a lot of time, which is completely normal. But I discovered that if I transpose $X$ in two lines of the code, I save a lot of time AND the result is very similar to the exact solution. More precisely, the changed lines are:

sigma = np.dot(flat_x, flat_x.T) / flat_x.shape[0]

whitex = np.dot(flat_x.T, principal_components)

corresponding mathematically to

$U_2, S_2, V_2 = \operatorname{svd}(XX^T)$

$X_{ZCA2} = U_2 \cdot \operatorname{diag}\left(\frac{1}{\sqrt{\operatorname{diag}(S_2)+\epsilon}}\right) \cdot U_2^T \cdot X^T$

$X_{ZCA} \approx X_{ZCA2}$
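A quick numerical check of this (a minimal sketch I added; the helper names zca and zca_t are mine, and a small random matrix stands in for the CIFAR-10 data) compares the exact result with the transpose of the variant, since the two have transposed shapes:

import numpy as np
from scipy import linalg

def zca(x, eps=0.1):
    # exact version: whitening matrix built from the eigenvectors of x^T x / m
    sigma = np.dot(x.T, x) / x.shape[0]
    u, s, _ = linalg.svd(sigma)
    pc = (u * (1. / np.sqrt(s + eps))).dot(u.T)
    return np.dot(x, pc)

def zca_t(x, eps=0.1):
    # transposed variant: whitening matrix built from the eigenvectors of x x^T / m
    sigma2 = np.dot(x, x.T) / x.shape[0]
    u2, s2, _ = linalg.svd(sigma2)
    pc2 = (u2 * (1. / np.sqrt(s2 + eps))).dot(u2.T)
    return np.dot(x.T, pc2)

rng = np.random.default_rng(0)
x = rng.standard_normal((20, 50))   # m = 20 "images", n = 50 "features", so n > m
x = x - x.mean(axis=0)

# the two results have transposed shapes; compare one with the transpose of the other
print(np.allclose(zca(x), zca_t(x).T))   # True, up to floating-point error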

An empirical, visual demonstration that $X_{ZCA} \approx X_{ZCA2}$ is the image I created below: the original image is on the left, the exact ZCA result is in the middle, and the approximate solution from $X_{ZCA2}$ is on the right:

[image: original image / exact ZCA result / approximation from $X_{ZCA2}$]

My question: What is the reason for $X_{ZCA}approx X_{ZCA2}$? Is there a mathematical proof for it?

I have found so far the hint (slide 30) that if $v$ is an eigenvector of $X^TX$, then $Xv$ must be an eigenvector of $XX^T$, but I can’t derive a proof from it. Thanks!
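That hint is essentially the whole story. Here is a sketch of why the two computations agree (my own reasoning, not from the linked slides, and following the code's convention of applying the whitening matrix on the right of the centered data, i.e. $X_{ZCA} = X \cdot UDU^T$ and $X_{ZCA2} = X^T \cdot U_2 D_2 U_2^T$, where $D$ and $D_2$ denote the $\operatorname{diag}(1/\sqrt{\cdot\,+\epsilon})$ factors). Write the SVD of the data matrix itself as $X = P\,\Sigma\,Q^T$, with $P \in \mathbb{R}^{m\times m}$ and $Q \in \mathbb{R}^{n\times n}$ orthogonal and $\Sigma \in \mathbb{R}^{m\times n}$ diagonal with entries $\sigma_i$. Then

$$X^TX = Q\,\Sigma^T\Sigma\,Q^T, \qquad XX^T = P\,\Sigma\Sigma^T\,P^T,$$

so one may take $U = Q$ and $U_2 = P$, and $S$, $S_2$ carry the same nonzero eigenvalues $\sigma_i^2$ (or $\sigma_i^2/m$ if the covariance matrix is used, as in the code). Substituting,

$$X\,UDU^T = P\Sigma Q^T\,Q D Q^T = P\,(\Sigma D)\,Q^T, \qquad X^T U_2 D_2 U_2^T = Q\Sigma^T P^T\,P D_2 P^T = Q\,(\Sigma^T D_2)\,P^T.$$

The rectangular matrices $\Sigma D$ and $D_2\Sigma$ have identical diagonal entries $\sigma_i/\sqrt{\sigma_i^2+\epsilon}$ (wherever the paddings of $D$ and $D_2$ differ, the corresponding $\sigma_i$ is zero), so $\Sigma^T D_2 = (\Sigma D)^T$ and hence $X_{ZCA2} = X_{ZCA}^T$ exactly, up to the sign/ordering freedom in the eigenvectors (which cancels inside $UDU^T$) and floating-point error. In other words, the two computations produce the same whitened data, just with rows and columns interchanged.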


Making Game: What is finite precision arithmetic and how does it affect SVD when computed by computers?


I was reading the paper “Detecting and Assessing the Problems Caused by Multicollinearity: A Use of the Singular-Value Decomposition” by David Belsley and Virginia Klema.

After performing SVD, while counting the number of non-zero singular values, it is stated in the paper that

problems arise because computers use finite arithmetic …

More specifically, eigenvalues that are supposed to be zero are stored as non-zero eigenvalues due to the finite arithmetic precision used by the computer and rounding error.

Could someone please elaborate on this arithmetic precision and rounding error?

Floating point arithmetic is an approximation to arithmetic with real numbers. It’s an approximation in the sense that not all digits of a number are stored; instead, they are truncated to a certain level of precision. This creates errors, because values like $\sqrt{2}$, which have an unending sequence of digits, can’t be stored (you don’t have enough memory to store an unending sequence of digits). This is what is meant by “finite precision”: only the most significant digits are stored.

Floating point values are represented to within some tolerance, called machine epsilon or $\epsilon$, which is the upper bound of the relative error due to rounding.
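As a small illustration (mine, using NumPy; for IEEE double precision $\epsilon \approx 2.2 \times 10^{-16}$):

import numpy as np

print(np.finfo(np.float64).eps)   # 2.220446049250313e-16, machine epsilon for doubles
print(0.1 + 0.2 == 0.3)           # False: both sides are rounded to nearby doubles
print(abs((0.1 + 0.2) - 0.3))     # ~5.6e-17, a discrepancy at the rounding level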

When you compose multiple operations which have finite precision, these rounding errors can accumulate, resulting in larger differences.

In the case of zero singular values, this means that due to rounding error, some singular values which are truly zero will be stored as a nonzero value.

An example: some matrix $A$ has singular values $[2,1,0.5,0]$. But your SVD algorithm may return singular values 2.0, 1.0, 0.5, 2.2e-16 or a similarly small number. That final value is numerically zero; it’s zero to within the numerical tolerance of the algorithm.
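A minimal sketch of that effect (mine, not from the paper), using NumPy to build a matrix whose exact singular values are $2, 1, 0.5, 0$:

import numpy as np

rng = np.random.default_rng(0)
# random orthogonal factors from QR decompositions
u, _ = np.linalg.qr(rng.standard_normal((4, 4)))
v, _ = np.linalg.qr(rng.standard_normal((4, 4)))

# in exact arithmetic this matrix has singular values 2, 1, 0.5, 0
a = u @ np.diag([2.0, 1.0, 0.5, 0.0]) @ v.T

print(np.linalg.svd(a, compute_uv=False))
# typically prints something like [2.  1.  0.5  1e-16]: the last value is
# numerically zero rather than exactly zero, because of rounding when forming a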

Floating-point arithmetic is standardized by IEEE 754.

TLDR;
In computers, numbers are stored in finite slots of memory. For instance, an integer in mathematics is a whole number such as …,-2,-1,0,1,2,3,… that extends in both directions, from negative infinity to positive infinity. In a computer this number can be represented by a type such as int8_t (in C++), which spans only from -128 to 127. The situation is even worse with real numbers such as $\pi$ or $\sqrt 2$. That’s what the author means.

The long answer can be as long as you have time for. For instance, “What Every Computer Scientist Should Know About Floating-Point Arithmetic” is required reading for anyone who does numerical work on a computer. I’ll touch on three subjects.

Computer Integers lack some properties of mathematical integral numbers

Not only are integer types bounded, they also lack some properties you expect from integral numbers. For instance, in math you expect that if $a>0$ and $b>0$ then $a+b>0$ too. Yet this may not hold in computer math. For instance, the following code outputs 110 and not 111 as you’d expect:

#include <iostream>

int main() {
  short int a = 17000, b = 17000, r;
  std::cout << (a > 0);  // prints 1: a is positive
  std::cout << (b > 0);  // prints 1: b is positive
  r = a + b;             // 34000 overflows a 16-bit short int; the stored result
                         // is implementation-defined and typically negative
  std::cout << (r > 0);  // prints 0 on typical platforms, hence the output "110"
}

Computer “real” numbers are countable

The real numbers in mathematics are not countable. That is the key difference between the real numbers and the integral and rational numbers. It was a major breakthrough for European mathematics when Stevin introduced the notion of real numbers such as $\sqrt 2$, which fill the gaps between rational numbers such as 1/3.

Although there are infinitely many of both, there are more real numbers than integral numbers. Weirder still, the number of positive and negative whole numbers is the same in math 🙂

These properties are not preserved in computer math. For instance, there is exactly the same, and finite!, number of double-precision “real” values and 64-bit long integer values in C++: $2^{64}$ bit patterns each. So the cardinality of what is supposed to be a continuum is equal to that of the integral (whole) numbers!
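As an illustration (mine, in Python/NumPy): because doubles form a finite, discrete set, there is a “next” representable value after 1.0, with nothing in between.

import numpy as np

print(np.nextafter(1.0, 2.0))         # 1.0000000000000002, the next double after 1.0
print(np.nextafter(1.0, 2.0) - 1.0)   # 2.220446049250313e-16, the gap between them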

Arbitrary precision math

Due to these limitations, some esoteric math problems are impossible to work on using standard machine arithmetic. So mathematicians have created so-called arbitrary-precision arithmetic libraries that can greatly expand the range and precision of the numbers stored in a computer. However, “arbitrary” is still a finite notion. When it comes to real numbers, these libraries approximate the mathematical concept better than standard machine arithmetic does, but they do not fully implement it.
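For example (an illustration I added, using Python’s standard decimal module as a stand-in for such a library):

from decimal import Decimal, getcontext

# compute sqrt(2) with 50 significant digits instead of double precision's ~16
getcontext().prec = 50
print(Decimal(2).sqrt())
# 1.4142135623730950488016887242096980785696718753769 -- far more digits,
# but still only finitely many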



Math Genius: Find all solutions of least squares problem


I have the following exercise (exercise 4.39 of Fundamentals of Matrix Computations by Watkins): [the exercise statement is given as an image]

I am not sure how to find all the solutions (item e). I think I must use items c) and d), but I don't see how to do it.
I found the minimum-norm solution:

$x_{\text{mn}} = A^{\dagger} b = \begin{pmatrix} 3/35 \\ 6/35 \end{pmatrix}$

The last column of $V$ is an orthonormal basis for $\mathcal{N}(A)$, and this column is $\begin{pmatrix} -2/\sqrt{5} \\ 1/\sqrt{5} \end{pmatrix}$.

Any help will be appreciated.

In this answer I explained that the set of all solutions of the least-squares problem can be written as $\{x_{\text{mn}} + z : z \in \mathcal{N}(A)\}$. Here, $x_{\text{mn}} = A^+ b$ is the minimum-norm solution and $\mathcal{N}(A)$ is the kernel of $A$.

You were able to calculate $x_{\text{mn}} = \frac{1}{35}(3, 6)$ and you obtained that the vector $\frac{1}{\sqrt{5}}(-2, 1)$ spans the kernel of $A$. Therefore, the desired solution set is

$$\{x_{\text{mn}} + z : z \in \mathcal{N}(A)\} = \{(3/35,\ 6/35) + t(-2, 1) : t \in \mathbb{R}\}.$$
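As a numerical sanity check (a sketch I added with hypothetical data, since the exercise's actual $A$ and $b$ are only shown in the image; the matrix below is chosen so that its kernel is also spanned by $(-2, 1)$):

import numpy as np

# hypothetical data: A has rank 1 and its kernel is spanned by (-2, 1)
A = np.array([[1., 2.],
              [2., 4.],
              [3., 6.]])
b = np.array([1., 2., 2.])

# minimum-norm least-squares solution via the pseudoinverse
x_mn = np.linalg.pinv(A) @ b

# the last right singular vector spans the kernel of A
_, _, vt = np.linalg.svd(A)
z = vt[-1]

# every x_mn + t*z gives the same residual (up to rounding), so all of them
# are least-squares solutions
for t in (0.0, 1.0, -2.5):
    print(np.linalg.norm(A @ (x_mn + t * z) - b))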


Math Genius: Unique Cholesky factorization of a matrix exists iff $A$ is symmetric and positive definite


Let $A=L^TL$ be the unique Cholesky factorization of a matrix. Show that such a factorization exists iff $A$ is symmetric and positive definite.

For the direction $Rightarrow$ I have done the following:

We assume that there is a unique Cholesky factorization of $A$, $A=L^TL$.

Then $A^T=(L^TL)^T=L^TL=A$ and so $A$ is symmetric.

Let $x \neq 0$; then $x^TAx = x^TL^TLx = (Lx)^T(Lx) = \|Lx\|^2 \geq 0$.

It is left to show that $x^TAx = 0 \iff x = 0$. Do we have that $L$ is invertible?
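One way to settle this step (my own suggestion, assuming, as is standard, that the Cholesky factor $L$ is triangular with strictly positive diagonal entries $\ell_{ii}$):

$$\det L = \prod_{i=1}^{n} \ell_{ii} > 0,$$

so $L$ is invertible; hence $Lx = 0$ forces $x = 0$, and therefore $x^TAx = \|Lx\|^2 > 0$ for every $x \neq 0$.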

Could you give me a hint for the other direction?
