CSCI 2227, Introduction to Scientific Computation Prof. Alvarez Moore-Penrose pseudo-inverse and least squares Rank of a matrix Number of independent "dimensions" in set of rows or columns Vectors x, y are (linearly) independent if neither can be written as a scaling of the other Vectors x1...xn (lin) indep if none is lin comb of others Examples: [1 2 3; 4 5 6] has rank 2 because the two rows are lin indep (not scalings of e.o.) [1 2 3; 4 5 6; 7 8 9] has rank 2 also; why? because rows 1,2 lin indep, but row 3 is lin comb of 1,2 The largest possible rank of a matrix is min(#rows,#cols) a matrix with that rank is said to have full rank like [1 2 3; 4 5 6] The notion of rank is important n x n matrix is invertible only if it has full rank Moore-Penrose pseudo-inverse Recall that we initially solved the least squares problem X*a = b, where X is n x m with n >= m, by multiplying on both sides by the transpose of X: X'*X*a = X'*b and then by the inverse of X'*X: a = (X'*X)^(-1) * X' * b Comparing the result with the original problem, it seems that (X'*X)^(-1) * X' is an inverse of X, since multiplying by it got rid of the X on the left side; in fact: (X'*X)^(-1) * X'*X = I(m) (X'*X)^(-1) * X' is (Moore-Penrose) pseudo-inverse of X (pinv in MATLAB) note that it has size m x n pinv(X) cannot be a right inverse of X if n > m: if you multiply X on right by pinv(X), you get n x n matrix that has rank at most m, hence cannot be the n x n identity Rank deficiency If X has rank less than m, then X'*X has rank less than m, so it is not invertible pinv definition (X'*X)^(-1) * X' doesn't make sense then In fact, system X*a = b has many solutions if X is rank-deficient a = pinv(X)*b is just one of them, namely the one with the smallest norm among all possible solutions Since any solution to X*a=b is a least squares approximation, pinv(X) is therefore the matrix Z that minimizes ||X*Z*b - b|| and ||Z|| simultaneously The solution returned by the backslash operator, a = X\b, is a "basic" solution, which means that it has as many 0 elements as possible among all solutions Relationship between pseudo-inverse and QR solutions If X is n x m, with n >= m and full rank (m), then the pinv and QR techniques give the same solution Example: X = randi(10,4,3) rank(X) b = randi(10,4,1) aQR = X\b apinv = pinv(X)*b If X has rank less than m, then things change: the system X*a = b has many solutions, and the two techniques give different ones Example: X = randi(10,2,3) X = [X; [1 2]*X] X = [X; [1 2 3]*X] rank(X) b = randi(10,4,1) aQR = X\b apinv = pinv(X)*b norm(X*aQR - b) norm(X*apinv - b) norm(aQR) norm(apinv) Try this and compare the two pinv and backslash solutions How do the SSR values compare? How do the norms of the solutions compare?