Elementary matrices and LU decompositions

We have seen that row operations on a matrix can be accomplished by left multiplication by elementary matrices, as the examples below illustrate:
[ 1 k 0 ]   [ a b c ]     [ a+kd  b+ke  c+kf ]
[ 0 1 0 ] * [ d e f ] =   [  d     e     f   ]
[ 0 0 1 ]   [ g h i ]     [  g     h     i   ]

[ 1 0 0 ]   [ a b c ]     [  a   b   c  ]
[ 0 1 0 ] * [ d e f ] =   [  d   e   f  ]
[ 0 0 k ]   [ g h i ]     [  kg  kh  ki ]  

[ 0 0 1 ]   [ a b c ]     [ g   h   i ]
[ 0 1 0 ] * [ d e f ] =   [ d   e   f ]
[ 1 0 0 ]   [ g h i ]     [ a   b   c ]
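These three identities are easy to check numerically. Here is a quick sketch using NumPy (the entries of A are chosen arbitrarily for illustration):

```python
import numpy as np

A = np.array([[1., 2., 3.],
              [4., 5., 6.],
              [7., 8., 9.]])
k = 2.0

# Elementary matrix that adds k*row 2 to row 1
E_add = np.array([[1., k, 0.],
                  [0., 1., 0.],
                  [0., 0., 1.]])

# Elementary matrix that multiplies row 3 by k
E_scale = np.array([[1., 0., 0.],
                    [0., 1., 0.],
                    [0., 0., k]])

# Elementary matrix that swaps rows 1 and 3
E_swap = np.array([[0., 0., 1.],
                   [0., 1., 0.],
                   [1., 0., 0.]])

print(E_add @ A)    # row 1 becomes row1 + 2*row2
print(E_scale @ A)  # row 3 becomes 2*row3
print(E_swap @ A)   # rows 1 and 3 exchanged
```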
To construct a matrix that performs a given row operation, we simply perform that row operation on the identity matrix. In the examples above we see that the 3 x 3 matrix that adds k*row 2 to row 1 is the result of performing this row operation on the identity:
[ 1 0 0 ]   row1 + k*row2       [ 1 k 0 ]
[ 0 1 0 ] ------------------->  [ 0 1 0 ]
[ 0 0 1 ]                       [ 0 0 1 ]
and similarly for any other row operation. This allows us to think of solving for the inverse of A in a somewhat different way:
    [  1 1 0 ]  add r1   [  1  1  0 ] r1-r2    [ 1  0  -1 ]
A = [  0 1 1 ]  ---->    [  0  1  1 ] -------> [ 0  1   1 ]
    [ -1 1 1 ]  to r3    [  0  2  1 ] r3-2r2   [ 0  0  -1 ]
and so on. If we next subtract r3 from r1 and add r3 to r2 and then multiply r3 by -1, we will get the identity matrix. In terms of elementary matrices we find that
[ 1 0  0 ] [1 0 0] [1 0 -1] [1 -1 0] [1  0 0] [1 0 0]    [1 0 0]
[ 0 1  0 ]*[0 1 1]*[0 1  0]*[0  1 0]*[0  1 0]*[0 1 0]*A =[0 1 0] 
[ 0 0 -1 ] [0 0 1] [0 0  1] [0  0 1] [0 -2 1] [1 0 1]    [0 0 1]

Reading left to right, these matrices perform r3=-1*r3, r2=r2+r3, r1=r1-r3, r1=r1-r2, r3=r3-2*r2, r3=r3+r1 (applied to A from right to left, i.e. rightmost first).
Note that in doing so we have constructed the inverse matrix: the product of all the elementary matrices on the left must equal A^(-1). You can check that this product is
         [ 0  1 -1]
A^(-1) = [ 1 -1  1]
         [-1  2 -1]
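As a sanity check, the six elementary matrices above can be multiplied out in NumPy (a quick sketch, with the matrices listed left to right exactly as they appear in the product):

```python
import numpy as np

A = np.array([[1., 1., 0.],
              [0., 1., 1.],
              [-1., 1., 1.]])

Es = [
    np.array([[1., 0., 0.], [0., 1., 0.], [0., 0., -1.]]),  # r3 = -r3
    np.array([[1., 0., 0.], [0., 1., 1.], [0., 0., 1.]]),   # r2 = r2 + r3
    np.array([[1., 0., -1.], [0., 1., 0.], [0., 0., 1.]]),  # r1 = r1 - r3
    np.array([[1., -1., 0.], [0., 1., 0.], [0., 0., 1.]]),  # r1 = r1 - r2
    np.array([[1., 0., 0.], [0., 1., 0.], [0., -2., 1.]]),  # r3 = r3 - 2*r2
    np.array([[1., 0., 0.], [0., 1., 0.], [1., 0., 1.]]),   # r3 = r3 + r1
]

P = np.eye(3)
for E in Es:
    P = P @ E   # accumulate the product left to right

print(P)        # equals A^(-1) = [[0,1,-1],[1,-1,1],[-1,2,-1]]
print(P @ A)    # the identity
```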
More importantly, elementary matrices give a way to factor a matrix into a product of simpler matrices. One important application of this is the LU decomposition of a matrix A. In the example we did in class, we start with A, subtract 2*row1 from row 2, subtract 2*row1 from row 3, and then add row 2 to row 3 to get an upper triangular matrix U:
    [ 1 2 3 ]      [1  2  3]      [1 2  3]   
A = [ 2 6 7 ] ---> [0  2  1] ---> [0 2  1] = U  
    [ 2 2 4 ]      [0 -2 -2]      [0 0 -1]  
In terms of elementary matrices, we have
[1 0 0 ][ 1 0 0][ 1 0 0][1 2 3]   [1 2  3]
[0 1 0 ][ 0 1 0][-2 1 0][2 6 7] = [0 2  1]
[0 1 1 ][-2 0 1][ 0 0 1][2 2 4]   [0 0 -1]
Since we always add or subtract a multiple of a row that lies above the row we are modifying, each of these elementary matrices is LOWER triangular. This is not quite what we want: we want a single lower triangular matrix on the other side of the equation. But this is easy to arrange --- just multiply both sides on the left by the inverses of the elementary matrices, in reverse order, to get:
[1 2 3] [1 0 0][1 0 0][1  0 0][1 2  3]
[2 6 7]=[2 1 0][0 1 0][0  1 0][0 2  1]
[2 2 4] [0 0 1][2 0 1][0 -1 1][0 0 -1]
We saw in the homework that the product of lower triangular matrices is lower triangular. Multiplying this out gives
[1 2 3]   [1  0 0][1 2  3]
[2 6 7] = [2  1 0][0 2  1]
[2 2 4]   [2 -1 1][0 0 -1]
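Both steps can be verified numerically. A quick NumPy sketch, using the three inverse elementary matrices and the L and U from above:

```python
import numpy as np

A = np.array([[1., 2., 3.],
              [2., 6., 7.],
              [2., 2., 4.]])
U = np.array([[1., 2., 3.],
              [0., 2., 1.],
              [0., 0., -1.]])

# Inverses of the three elementary row operations, in reverse order
E1 = np.array([[1., 0., 0.], [2., 1., 0.], [0., 0., 1.]])   # r2 = r2 + 2*r1
E2 = np.array([[1., 0., 0.], [0., 1., 0.], [2., 0., 1.]])   # r3 = r3 + 2*r1
E3 = np.array([[1., 0., 0.], [0., 1., 0.], [0., -1., 1.]])  # r3 = r3 - r2

L = E1 @ E2 @ E3   # product of lower triangular matrices is lower triangular

print(L)           # [[1,0,0],[2,1,0],[2,-1,1]]
print(L @ U)       # reproduces A
```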
So we have written A in the form LU as we desired. We don't really need to form these lower triangular matrices and multiply them out. We can construct L at the same time we construct U as follows: do row operations on A to transform it into U, and perform the inverse row operations, in the opposite order, on the identity to construct L.
    [ 1 2 3 ]      [1  2  3]      [1 2  3]   
A = [ 2 6 7 ] ---> [0  2  1] ---> [0 2  1] = U  
    [ 2 2 4 ]      [0 -2 -2]      [0 0 -1]  
            r2-2r1          r3+r2
            r3-2r1

I = [ 1 0 0 ]      [1  0 0]       [1  0 0]      [1  0 0]
    [ 0 1 0 ] ---> [0  1 0] --->  [0  1 0] ---> [2  1 0] = L
    [ 0 0 1 ]      [0 -1 1]       [2 -1 1]      [2 -1 1]
             r3-r2         r3+2r1          r2+2r1
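This recipe can be sketched as a short routine (a sketch only; the helper name lu_no_pivot is made up, and it assumes no row exchanges are needed, i.e. no zero pivot ever appears):

```python
import numpy as np

def lu_no_pivot(A):
    """Factor A = L U by Gaussian elimination without row exchanges.

    While eliminating below each pivot to build U, the multiplier used in
    each row operation is recorded in L -- exactly the entry the inverse
    row operation would place there.
    """
    n = A.shape[0]
    U = A.astype(float).copy()
    L = np.eye(n)
    for j in range(n - 1):          # pivot column
        for i in range(j + 1, n):   # rows below the pivot
            m = U[i, j] / U[j, j]   # multiplier: row_i = row_i - m*row_j
            U[i, j:] -= m * U[j, j:]
            L[i, j] = m             # record the multiplier in L
    return L, U

A = np.array([[1., 2., 3.],
              [2., 6., 7.],
              [2., 2., 4.]])
L, U = lu_no_pivot(A)
print(L)   # [[1,0,0],[2,1,0],[2,-1,1]]
print(U)   # [[1,2,3],[0,2,1],[0,0,-1]]
```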
This LU decomposition is very useful for solving linear systems Ax=b efficiently. Notice that it takes far fewer operations to transform A into U than it takes to transform A into reduced row echelon form. The larger A is, the more work you save. And by the time you have found U you have done most of the work needed to construct L. Solving the two triangular systems in LUx=b is much easier than solving Ax=b from scratch. (See exercise 59 in 2.4 of the text for an explanation.)
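Once A = LU is known, Ax = b splits into two cheap triangular solves: Ly = b by forward substitution (top row down), then Ux = y by back substitution (bottom row up). A minimal sketch assuming NumPy, with the L and U from the example above and a made-up right-hand side b:

```python
import numpy as np

L = np.array([[1., 0., 0.],
              [2., 1., 0.],
              [2., -1., 1.]])
U = np.array([[1., 2., 3.],
              [0., 2., 1.],
              [0., 0., -1.]])
b = np.array([6., 15., 8.])   # made-up right-hand side for illustration

# Forward substitution: solve L y = b from the top row down
y = np.zeros(3)
for i in range(3):
    y[i] = b[i] - L[i, :i] @ y[:i]      # diagonal of L is 1, so no division

# Back substitution: solve U x = y from the bottom row up
x = np.zeros(3)
for i in range(2, -1, -1):
    x[i] = (y[i] - U[i, i+1:] @ x[i+1:]) / U[i, i]

print(x)           # [1, 1, 1]
print(L @ U @ x)   # recovers b
```

Each substitution touches every matrix entry only once, which is why the solve phase is so cheap once L and U are in hand.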