Elementary Matrices, Inverses and the LU decomposition

In class we saw that every row operation can be viewed as left multiplication by an elementary matrix. This gives us a different way to think about finding inverses:

Example 1: Find the inverse of A if

    A = [ 1   2 ]

        [ 1   3 ]

We know that A is invertible if and only if it row reduces to the identity matrix. In this case:

   [ 1   2 ]  --------------->  [ 1   2 ] ----------------> [ 1   0 ]

   [ 1   3 ]  subtract row 1    [ 0   1 ] subtract 2*row 2  [ 0   1 ]

               from row 2                   from row 1

The elementary matrix that does the row operation of subtracting row 1 from row 2 and the elementary matrix that subtracts 2*row2 from row 1 are given below:

 [  1  0 ]    ( subtract row 1 from row 2        and    [ 1   -2 ]   (subtract 2*row2 from row 1

 [ -1  1 ]      in the identity matrix )                [ 0    1 ]     in the identity matrix)

Since performing these row operations transforms A into the identity matrix, it must also be true that

   [ 1  -2 ] * [ 1  0 ] * [ 1  2 ]  = [ 1  0 ]

   [ 0   1 ]   [-1  1 ]   [ 1  3 ]    [ 0  1 ]

and therefore the product of the two matrices on the left must be the inverse matrix for A:

    [ 1  -2 ] * [ 1  0 ] = [ 3   -2 ]  = the inverse of A

    [ 0   1 ]   [-1  1 ]   [-1    1 ]

The approach described above for finding the inverse of a matrix as the product of elementary matrices is often useful in proving theorems about matrices and linear systems. It is also important in developing the most efficient method for solving the system Ax = b. This method we describe below:

The LU decomposition

In general when we solve systems Ax = b for large matrices A it becomes important to find a method that is efficient -- that is, a method that involves as little computation as possible. If we start with the matrix A and augment with the vector b and then do row operations to bring A into reduced row echelon form, then we will certainly solve the system. However, it may be possible to solve the system without completely reducing the matrix A. We illustrate by solving the following system:

    [ 2  3  1 ] [x_1]     [ -4 ]

    [ 4  1  4 ] [x_2]  =  [  9 ]

    [ 3  4  6 ] [x_3]     [  0 ]

Do just enough row operations to bring the coefficient matrix into upper triangular form:

 [ 2  3  1 ]    ------>   [ 2   3    1 ]   --------->  [ 2   3    1 ]

 [ 4  1  4 ]              [ 0  -5    2 ]               [ 0   -5   2 ]

 [ 3  4  6 ]              [ 0 -1/2  9/2]               [ 0    0  4.3]

        subtract 2*row1 from row2      subtract 1/10 of row 2 from row 3

      subtract 3/2 of row 1 from row 2        so left multiply by the matrix

      so left multiply by the matrix               [ 1   0   0 ]

           [  1    0    0  ]                       [ 0   1   0 ]

           [ -2    1    0  ]                       [ 0 -1/10 1 ]

           [-3/2   0    1  ]

Note that we use only higher rows in the matrix to adjust lower rows so the elementary matrices we used are all lower triangular. Also they combine to give a single lower triangular matrix that converts A to upper triangular form:

    [ 1     0    0 ]   [ 2   3   1 ]       [ 2   3    1 ]

    [-2     1    0 ] * [ 4   1   4 ]  =    [ 0   -5   2 ]

    [-3/2 -1/10  1 ]   [ 3   4   6 ]       [ 0   0   4.3]

The inverse of the matrix on the left is particularly simple. We just change all the minus signs to plus signs. This is simply because the inverse of subtracting twice row 1 from row 2 is just adding twice row 1 to row 2 and so on. We get the LU decomposition of A as the product of a lower triangular matrix with an upper triangular one:

        [ 2   3  1 ]      [ 1     0     0 ]   [ 2    3     1 ]

        [ 4   1  4 ]  =   [ 2     1     0 ] * [ 0   -5     2 ]

        [ 3   4  6 ]      [3/2   1/10   1 ]   [ 0    0    4.3]

Note that this decomposition can be written down immediately once we see how to transform A into an upper triangular matrix. We just have to put the multipliers 2, 3/2 and 1/10 into the appropriate spots in the matrix. We don't have to go through these intermediate steps. Now how does this help us solve the original system? Well first we rewrite it by replacing A by its LU decomposition.

  [ 1     0     0 ]   [ 2    3     1 ][x_1]     [ -4 ]

  [ 2     1     0 ] * [ 0   -5     2 ][x_2]  =  [  9 ]

  [3/2   1/10   1 ]   [ 0    0    4.3][x_3]     [  0 ]

Now we make a substitution. We let

[ 2    3     1 ][x_1]      [y_1]

[ 0   -5     2 ][x_2]  =   [y_2]

[ 0    0    4.3][x_3]      [y_3]

Then we can solve for (y1,y2,y3) by a process similar to back-substitution:

        y_1                    = -4

       2y_1 + y_2              =  9    -------> y_2 = 9 - 2(-4) = 17

   (3/2)y_1 + (1/10)y_2  + y_3 =  0    -------> y_3 = -(3/2)(-4) -(1/10)(17) = 6 - 1.7 = 4.3

And a similar computation now tells us the solution (x1,x2,x3):

      2x_1 + 3x_2  +  x_3 = -4

            -5x_2  +  2x_3 = 17

                    4.3x_3 = 4.3    ------------>  x_3 = 1

                                                   x_2 = (17 - 2(1))/-5 = -3

                                                   x_1 = (-4 - 1 - 3(-3))/2 = 2