Hello,
I have a problem with time series analysis. I have a dataset with 5 features. The following is a subset of my input dataset:
date,price,year,day,totaltx
1/1/2016 0:00,434.46,2016,1,126762
1/2/2016 0:00,433.59,2016,2,147449
1/3/2016 0:00,430.36,2016,3,148661
1/4/2016 0:00,433.49,2016,4,185279
1/5/2016 0:00,432.25,2016,5,178723
1/6/2016 0:00,429.46,2016,6,184207
My endogenous data is the price column and my exogenous data is the totaltx column.
This is the code I am running and getting an error:
import statsmodels.api as sm
import pandas as pd
import numpy as np
from numpy.linalg import LinAlgError


def arima(filteredData, coinOutput, window, horizon, trainLength):
    start_index = 0
    end_index = 0
    inputNumber = filteredData.shape[0]
    predictions = np.array([], dtype=np.float32)
    prices = np.array([], dtype=np.float32)
    # sliding on time series data with 1 day step
    while end_index < inputNumber - 1:
        end_index = start_index + trainLength
        trainFeatures = filteredData[start_index:end_index]["totaltx"]
        trainOutput = coinOutput[start_index:end_index]["price"]
        # AR(window) model for price with totaltx as exogenous regressor
        arima = sm.tsa.statespace.SARIMAX(endog=trainOutput.values, exog=trainFeatures.values, order=(window, 0, 0))
        arima_fit = arima.fit(disp=0)
        testdata = filteredData[end_index:end_index + 1]["totaltx"]
        total_sample = end_index - start_index
        # one-step-ahead prediction just past the training window
        predicted = arima_fit.predict(start=total_sample, end=total_sample, exog=np.array(testdata.values).reshape(-1, 1))
        price = coinOutput[end_index:end_index + 1]["price"].values
        predictions = np.append(predictions, predicted)
        prices = np.append(prices, price)
        start_index = start_index + 1
    return predictions, prices


def processCoins(bitcoinPrice, window, horizon):
    # shift the target by `horizon` rows
    output = bitcoinPrice[horizon:][["date", "day", "year", "price"]]
    return output


trainLength = 100
for window in [3, 5]:
    for horizon in [1, 2, 5, 7, 10]:
        # raw string so that the backslash in the Windows path is taken literally
        bitcoinPrice = pd.read_csv(r"..\prices.csv", sep=",")
        coinOutput = processCoins(bitcoinPrice, window, horizon)
        predictions, prices = arima(bitcoinPrice, coinOutput, window, horizon, trainLength)
In this code, I am using a rolling window regression technique. I train the ARIMA model on start_index:end_index and predict the test data at end_index:end_index+1.
This is the error that is thrown from my code:
Traceback (most recent call last):
  File "C:/PycharmProjects/coinLogPrediction/src/arima.py", line 115, in <module>
    predictions, prices = arima(filteredBitcoinPrice, coinOutput, window, horizon, trainLength, outputFile)
  File "C:/PycharmProjects/coinLogPrediction/src/arima.py", line 64, in arima
    arima_fit = arima.fit(disp=0)
  File "C:\AppData\Local\Continuum\Anaconda3\lib\site-packages\statsmodels\tsa\statespace\mlemodel.py", line 469, in fit
    skip_hessian=True, **kwargs)
  File "C:\AppData\Local\Continuum\Anaconda3\lib\site-packages\statsmodels\base\model.py", line 466, in fit
    full_output=full_output)
  File "C:\AppData\Local\Continuum\Anaconda3\lib\site-packages\statsmodels\base\optimizer.py", line 191, in _fit
    hess=hessian)
  File "C:\AppData\Local\Continuum\Anaconda3\lib\site-packages\statsmodels\base\optimizer.py", line 410, in _fit_lbfgs
    **extra_kwargs)
  File "C:\AppData\Local\Continuum\Anaconda3\lib\site-packages\scipy\optimize\lbfgsb.py", line 193, in fmin_l_bfgs_b
    **opts)
  File "C:\AppData\Local\Continuum\Anaconda3\lib\site-packages\scipy\optimize\lbfgsb.py", line 328, in _minimize_lbfgsb
    f, g = func_and_grad(x)
  File "C:\AppData\Local\Continuum\Anaconda3\lib\site-packages\scipy\optimize\lbfgsb.py", line 273, in func_and_grad
    f = fun(x, *args)
  File "C:\AppData\Local\Continuum\Anaconda3\lib\site-packages\scipy\optimize\optimize.py", line 292, in function_wrapper
    return function(*(wrapper_args + args))
  File "C:\AppData\Local\Continuum\Anaconda3\lib\site-packages\statsmodels\base\model.py", line 440, in f
    return -self.loglike(params, *args) / nobs
  File "C:\AppData\Local\Continuum\Anaconda3\lib\site-packages\statsmodels\tsa\statespace\mlemodel.py", line 646, in loglike
    loglike = self.ssm.loglike(complex_step=complex_step, **kwargs)
  File "C:\AppData\Local\Continuum\Anaconda3\lib\site-packages\statsmodels\tsa\statespace\kalman_filter.py", line 825, in loglike
    kfilter = self._filter(**kwargs)
  File "C:\AppData\Local\Continuum\Anaconda3\lib\site-packages\statsmodels\tsa\statespace\kalman_filter.py", line 747, in _filter
    self._initialize_state(prefix=prefix, complex_step=complex_step)
  File "C:\AppData\Local\Continuum\Anaconda3\lib\site-packages\statsmodels\tsa\statespace\representation.py", line 723, in _initialize_state
    self._statespaces[prefix].initialize_stationary(complex_step)
  File "_representation.pyx", line 1351, in statsmodels.tsa.statespace._representation.dStatespace.initialize_stationary
  File "_tools.pyx", line 1151, in statsmodels.tsa.statespace._tools._dsolve_discrete_lyapunov
numpy.linalg.linalg.LinAlgError: LU decomposition error.
Unless I have made a mistake somewhere, I believe this is a bug in statsmodels. Can you please help me to solve it?
In this section we look at some other algorithms for solving the
equation (Ax=b) when (A) is invertible. On the one hand the (QR)
factorisation has great stability properties. On the other, it can be
beaten by other methods for speed when there is particular structure
to exploit (such as lots of zeros in the matrix). In this section, we
explore the family of methods that go right back to the technique
of Gaussian elimination, which you will have been familiar with since
secondary school.
4.1. An algorithm for LU decomposition
The computational way to view Gaussian elimination is through the LU
decomposition of an invertible matrix, (A=LU), where (L) is lower
triangular ((l_{ij}=0) for (j>i)) and (U) is upper triangular
((u_{ij}=0) for (j<i)). Here we use the symbol (U) instead of (R) to
emphasise that we are looking at square matrices. The process of
obtaining the (LU) decomposition is very similar to the Householder
algorithm, in that we repeatedly left multiply (A) by matrices to
transform below-diagonal entries in each column to zero, working from
the first to the last column. The difference is that whilst the
Householder algorithm left multiplies with unitary matrices, here,
we left multiply with lower triangular matrices.
The first step puts zeros below the first entry in the first column.
\[
A_1 = L_1A = \begin{pmatrix} u_1 & v_2^1 & v_3^1 & \ldots & v_n^1 \end{pmatrix}, \quad
u_1 = \begin{pmatrix} u_{11} \\ 0 \\ \vdots \\ 0 \end{pmatrix}.
\]
Then, the next step puts zeros below the second entry in the second
column.
\[
A_2 = L_2L_1A = \begin{pmatrix} u_1 & u_2 & v_3^2 & \ldots & v_n^2 \end{pmatrix}, \quad
u_2 = \begin{pmatrix} u_{12} \\ u_{22} \\ 0 \\ \vdots \\ 0 \end{pmatrix}.
\]
After repeated left multiplications we have
\[ A_n = L_n\ldots L_2L_1A = U. \]
This process of transforming (A) to (U) is called Gaussian elimination.
If we assume (we will show this later) that all these lower triangular
matrices are invertible, we can define
\[
L = (L_n\ldots L_2L_1)^{-1} = L_1^{-1}L_2^{-1}\ldots L_n^{-1}, \quad\mbox{so that}\quad
L^{-1} = L_n\ldots L_2L_1.
\]
Then we have (L^{-1}A = U), i.e. (A=LU).
So, what’s the advantage of writing (A=LU)? Well, we can define
(y=Ux). Then, we can solve (Ax=b) in two steps, first solving (Ly=b)
for (y), and then solving (Ux=y) for (x). The latter equation is an
upper triangular system that can be solved by the back
substitution algorithm we introduced for QR factorisation. The former
equation can be solved by forward substitution, derived in an analogous
way, written in pseudo-code as follows.
- (x_1 \gets b_1/L_{11})
- FOR (i=2) TO (m)
  - (x_i \gets (b_i - \sum_{k=1}^{i-1}L_{ik}x_k)/L_{ii})
- END FOR
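The forward substitution pseudo-code above can be sketched in Python as follows. This is a minimal illustration using NumPy, not the course's reference implementation; the function name `forward_substitution` and the small test system are assumptions made for the example.

```python
import numpy as np

def forward_substitution(L, b):
    """Solve L x = b for lower triangular L with nonzero diagonal."""
    m = L.shape[0]
    x = np.zeros(m, dtype=np.result_type(L, b, float))
    for i in range(m):
        # x_i = (b_i - sum_{k<i} L_ik x_k) / L_ii  (the sum is empty for i = 0)
        x[i] = (b[i] - L[i, :i] @ x[:i]) / L[i, i]
    return x

# small illustrative system whose solution is (1, 2, 3)
L = np.array([[2.0, 0.0, 0.0],
              [1.0, 3.0, 0.0],
              [4.0, 5.0, 6.0]])
b = np.array([2.0, 7.0, 32.0])
x = forward_substitution(L, b)
```

Each row costs one inner product of length at most (m), which is where the quadratic operation count comes from.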
Forward substitution has an operation count that is identical to back
substitution, by symmetry, i.e. (\mathcal{O}(m^2)). In contrast, we
shall see shortly that the Gaussian elimination process has an
operation count (\mathcal{O}(m^3)). Hence, it is much cheaper to solve
a linear system with a given (LU) factorisation than it is to form (L)
and (U) in the first place. We can take advantage of this in the
situation where we have to solve a whole sequence of linear systems
(Ax=b_i), (i=1,2,\ldots,K), with the same matrix (A) but different
right hand side vectors. In this case we can pay the cost of forming
(LU) once, and then use forward and back substitution to cheaply solve
each system. This is particularly useful when we need to repeatedly
solve systems as part of larger iterative algorithms, such as time
integration methods or Monte Carlo methods.
So, we need to find lower triangular matrices (L_k) that do not change
the first (k-1) rows, and transform the (k)-th column (x_k) of (A_k)
as follows.
\[
L_kx_k = L_k\begin{pmatrix}
x_{1k} \\ \vdots \\ x_{kk} \\ x_{k+1,k} \\ \vdots \\ x_{m,k}
\end{pmatrix}
= \begin{pmatrix}
x_{1k} \\ \vdots \\ x_{kk} \\ 0 \\ \vdots \\ 0
\end{pmatrix}.
\]
As before with the Householder method, we see that we need the top-left
(k\times k) submatrix of (L_k) to be the identity (so that it doesn’t change
the first (k) rows). We claim that the following matrix transforms
(x_k) to the required form.
\[
L_k = \begin{pmatrix}
1 & 0 & 0 & \ldots & 0 & \ldots & \ldots & \ldots & 0 \\
0 & 1 & 0 & \ldots & 0 & \ldots & \ldots & \vdots & 0 \\
0 & 0 & 1 & \ldots & 0 & \ldots & \ldots & \vdots & 0 \\
\vdots & \ddots & \ddots & \ddots & \vdots & \vdots & \vdots & \vdots & 0 \\
\vdots & \ddots & \ddots & \ddots & 1 & 0 & \ldots & \vdots & 0 \\
\vdots & \ddots & \ddots & \ddots & -l_{k+1,k} & 1 & \ldots & \vdots & 0 \\
\vdots & \ddots & \ddots & \ddots & -l_{k+2,k} & 0 & \ddots & \vdots & 0 \\
\vdots & \ddots & \ddots & \ddots & \vdots & 0 & \ldots & \ddots & 0 \\
\vdots & \ddots & \ddots & \ddots & -l_{m,k} & 0 & \ldots & \ldots & 1 \\
\end{pmatrix}, \quad
l_k = \begin{pmatrix}
0 \\
0 \\
0 \\
\vdots \\
0 \\
l_{k+1,k} = x_{k+1,k}/x_{kk} \\
l_{k+2,k} = x_{k+2,k}/x_{kk} \\
\vdots \\
l_{m,k} = x_{m,k}/x_{kk} \\
\end{pmatrix}.
\]
This has the identity block as required, and we can verify that (L_k)
puts zeros in the entries of (x_k) below the diagonal by first writing
(L_k = I - l_ke_k^*). Then,
\[ L_kx_k = (I - l_ke_k^*)x_k = x_k - l_k\underbrace{(e_k^*x_k)}_{=x_{kk}}, \]
which subtracts off the below diagonal entries as required. Indeed,
multiplication by (L_k) implements the row operations that are performed
to transform below diagonal elements of (A_k) to zero during Gaussian
elimination.
The determinant of a lower triangular matrix is equal to the product
of its diagonal entries, so (\det(L_k)=1), and consequently
(L_k) is invertible, enabling us to define (L^{-1}) as above.
To form (L) we need to multiply the inverses of all the (L_k) matrices
together, also as above. To do this, we first note that (e_k^*l_k=0)
(because (l_k) is zero in the only entry that (e_k) is nonzero). Then
we claim that (L_k^{-1}=I + l_ke_k^*), which we verify as follows.
\[
\begin{aligned}
(I + l_ke_k^*)L_k &= (I + l_ke_k^*)(I - l_ke_k^*)
= I + l_ke_k^* - l_ke_k^* - (l_ke_k^*)(l_ke_k^*) \\
&= I - \underbrace{l_k(e_k^*l_k)e_k^*}_{=0} = I,
\end{aligned}
\]
as required. Similarly if we multiply the inverse lower triangular
matrices from two consecutive iterations, we get
\[
\begin{aligned}
L_k^{-1}L_{k+1}^{-1} &= (I + l_ke_k^*)(I + l_{k+1}e_{k+1}^*)
= I + l_ke_k^* + l_{k+1}e_{k+1}^* + l_k\underbrace{(e_k^*l_{k+1})}_{=0}e_{k+1}^* \\
&= I + l_ke_k^* + l_{k+1}e_{k+1}^*,
\end{aligned}
\]
since (e_k^*l_{k+1}=0) too, as (l_{k+1}) is zero in the only place
where (e_k) is nonzero. If we iterate this argument, we get
\[ L = I + \sum_{i=1}^{m-1}l_ie_i^*. \]
Hence, the (k)th column of (L) is the same as the (k)th column of (L_k^{-1}),
i.e.,
\[
L = \begin{pmatrix}
1 & 0 & 0 & \ldots & 0 & \ldots & \ldots & \ldots & 0 \\
l_{21} & 1 & 0 & \ldots & 0 & \ldots & \ldots & \vdots & 0 \\
l_{31} & l_{32} & 1 & \ldots & 0 & \ldots & \ldots & \vdots & 0 \\
\vdots & \ddots & \ddots & \ddots & \vdots & \vdots & \vdots & \vdots & 0 \\
\vdots & \ddots & \ddots & \ddots & 1 & 0 & \ldots & \vdots & 0 \\
\vdots & \ddots & \ddots & \ddots & l_{k+1,k} & 1 & \ldots & \vdots & 0 \\
\vdots & \ddots & \ddots & \ddots & l_{k+2,k} & l_{k+2,k+1} & \ddots & \vdots & 0 \\
\vdots & \ddots & \ddots & \ddots & \vdots & l_{m-1,k+1} & \ldots & \ddots & 0 \\
\vdots & \ddots & \ddots & \ddots & l_{m,k} & l_{m,k+1} & \ldots & \ldots & 1 \\
\end{pmatrix}.
\]
In summary, we can compute entries of (L) during the Gaussian elimination
process of transforming (A) to (U). Note that the matrices (L_1,L_2,\ldots)
should not be explicitly formed during the elimination process; they are just
a mathematical concept to translate from the row operations into the final
(L) matrix.
Exercise 4.1
Having said that, let’s take a moment to compute some examples
using the (L_1,L_2,\ldots) matrices (to help with understanding).
The cla_utils.exercises6.get_Lk() function has been left
unimplemented. It should return one of these matrices given the
(l_k) entries. The test script test_exercises6.py in the test
directory will test this function.
Once it passes the tests, experiment with the inverse and
multiplication properties above, to verify that they work.
The Gaussian elimination algorithm is written in pseudo-code as
follows. We start by copying (A) into (U), and setting (L) to
an identity matrix, and then work “in-place” i.e. replacing values
of (U) and (L) until they are completed. In a computer implementation,
this memory should be preallocated and then written to instead of
making copies (which carries overheads).
- (U \gets A)
- (L \gets I)
- FOR (k=1) TO (m-1)
  - FOR (j=k+1) TO (m)
    - (l_{jk} \gets u_{jk}/u_{kk})
    - (u_{j,k:m} \gets u_{j,k:m} - l_{jk}u_{k,k:m})
  - END FOR
- END FOR
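As a rough illustration, the pseudo-code above translates almost line by line into NumPy. The sketch below is not the in-place version asked for in the exercises: it keeps separate (L) and (U) arrays for readability, and the name `lu_no_pivoting` and the test matrix are assumptions made for the example. It assumes that no zero pivots are encountered.

```python
import numpy as np

def lu_no_pivoting(A):
    """Return (L, U) with A = L U, assuming no zero pivots are encountered."""
    m = A.shape[0]
    U = np.array(A, dtype=float)   # U <- A (a copy, so A is left untouched)
    L = np.eye(m)                  # L <- I
    for k in range(m - 1):
        for j in range(k + 1, m):
            L[j, k] = U[j, k] / U[k, k]        # multiplier l_jk
            U[j, k:] -= L[j, k] * U[k, k:]     # row operation on row j
    return L, U

A = np.array([[2.0, 1.0, 1.0],
              [4.0, 3.0, 3.0],
              [8.0, 7.0, 9.0]])
L, U = lu_no_pivoting(A)
```

For this matrix the multipliers are 2, 4 and 3, so (L) has those values below its unit diagonal and (U) is upper triangular with diagonal entries 2, 1 and 2.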
To do an operation count for this algorithm, we note that the
dominating operation is the update of (U) inside the (j) loop. This
requires (m-k+1) multiplications and subtractions, and is iterated
(m-k) times in the (j) loop, and this whole thing is iterated from
(k=1) to (m-1). Hence the asymptotic operation count is
\[
\begin{aligned}
N_{\mbox{FLOPs}} &= \sum_{k=1}^{m-1}\sum_{j=k+1}^m 2(m-k+1) \\
&= \sum_{k=1}^{m-1}2(m-k+1)\underbrace{\sum_{j=k+1}^m 1}_{=m-k} \\
&\sim \sum_{k=1}^{m-1}\left(2m^2 - 4mk + 2k^2\right) \\
&\sim 2m^3 - 4\frac{m^3}{2} + \frac{2m^3}{3} = \frac{2m^3}{3}.
\end{aligned}
\]
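The asymptotic count can be checked numerically by evaluating the double sum directly. The short sketch below only counts operations and does not factorise anything; the function name `lu_flops` is an assumption made for the example.

```python
def lu_flops(m):
    """Count the multiplications and subtractions in the update of U."""
    count = 0
    for k in range(1, m):              # k = 1, ..., m-1
        for j in range(k + 1, m + 1):  # j = k+1, ..., m
            count += 2 * (m - k + 1)   # the slice k:m has m-k+1 entries
    return count

for m in (10, 100, 400):
    # the ratio to 2 m^3 / 3 should approach 1 as m grows
    print(m, lu_flops(m), lu_flops(m) / (2 * m**3 / 3))
```

The printed ratio tends to 1 as (m) increases, which is consistent with the leading-order term (2m^3/3).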
Exercise 4.2
Since the diagonal entries of (L) are all ones, the total amount of
combined memory required to store (L) and (U) is the same as the
amount of memory required to store (A). Further, each iteration of
the LU factorisation algorithm computes one column of (L) and one
row of (U), and the corresponding column and row of (A) are not
needed for the rest of the algorithm. This creates the opportunity
for a memory-efficient ‘in-place’ algorithm in which the matrix (A)
is modified until it contains the values for (L) and (U).
((\ddagger)) The cla_utils.exercises6.LU_inplace() function
has been left unimplemented. It should implement this in-place
low-storage procedure, applying the changes to the provided matrix
(A). The test script test_exercises6.py in the test
directory will test this function.
Exercise 4.3
((\ddagger)) The LU factorisation requires 3 loops (this is why it
has a cubic FLOP count). In the algorithm above, there are two
explicit loops and one implicit one (in the slice notation). It is
possible to rewrite this in a single loop, using an outer
product. Identify this outer product, and update
cla_utils.exercises6.LU_inplace() to make use of this
reformulation (using numpy.outer()). Do you notice any
improvement in speed?
Exercise 4.4
((\ddagger)) The function cla_utils.exercises6.solve_L() has
been left unimplemented. It should use forward substitution to
solve lower triangular systems. The interfaces are set so that
multiple right hand sides can be provided and solved at the same
time. The function should only use one loop over the rows of (L),
to efficiently solve the multiple problems. The test script
test_exercises6.py in the test directory will test this
function.
Exercise 4.5
((\ddagger)) Propose an algorithm to use the LU factorisation to
compute the inverse of a matrix. The function
cla_utils.exercises6.inverse_LU() has been left
unimplemented. Complete it using your algorithm, using functions
developed in the previous exercises where possible. The test script
test_exercises6.py in the test directory will test this
function.
4.2. Pivoting
Gaussian elimination will fail if a zero appears on the diagonal,
i.e. we get (x_{kk}=0) (since then we can’t divide by it). Similarly,
Gaussian elimination will amplify rounding errors if (x_{kk}) is very
small, because a small error becomes large after dividing by (x_{kk}).
The solution is to reorder the rows in (A_k) so that (x_{kk}) has
maximum magnitude. This would seem to mess up the (LU) factorisation
procedure. However, it is not as bad as it looks, as we will now
see.
The main tool is the permutation matrix.
Definition 4.6
(Permutation matrix)
An (m\times m) permutation matrix has precisely one entry equal to
1 in every row and column, and zero elsewhere.
A compact way to store a permutation matrix (P) is as a size (m) vector
(p), where (p_i) is equal to the number of the column containing the 1
entry in row (i) of (P). Multiplying a vector (x) by a permutation
matrix (P) simply rearranges the entries in (x), with ((Px)_i =
x_{p_i}).
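To make the index-vector storage concrete, here is a small NumPy sketch comparing multiplication by a dense permutation matrix with plain indexing. The particular permutation is an arbitrary choice for illustration, and indices are 0-based as usual in Python.

```python
import numpy as np

p = np.array([2, 0, 1])          # row i of P has its 1 in column p[i] (0-based)
P = np.eye(3)[p]                 # dense matrix, built only for comparison
x = np.array([10.0, 20.0, 30.0])

Px_dense = P @ x                 # O(m^2) work with the dense matrix
Px_index = x[p]                  # (Px)_i = x_{p_i}, O(m) work
```

Both expressions give the same rearranged vector, but the indexed form never stores or multiplies by the zeros in (P).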
During Gaussian elimination, say that we are at stage (k), and
((A_k)_{kk}) is not the largest magnitude entry among the entries on or below the diagonal in column (k). We reorder the rows to fix this, and this is what we call
pivoting. Mathematically this reordering is equivalent to
multiplication by a permutation matrix (P_k). Then we continue the
Gaussian elimination procedure by left multiplying by (L_k), placing
zeros below the diagonal in column (k) of (P_kA_k).
In fact, (P_k) is a very specific type of permutation matrix, that only
swaps two rows. Therefore, (P_k^{-1}=P_k), even though this is not
true for general permutation matrices.
We can pivot at every stage of the procedure, producing a permutation
matrix (P_k), (k=1,\ldots,m-1) (if no pivoting is necessary at a given
stage, then we just take the identity matrix as the pivoting matrix
for that stage). Then, we end up with the result of Gaussian elimination
with pivoting,
\[ L_{m-1}P_{m-1}\ldots L_2P_2L_1P_1A = U. \]
This looks like it has totally messed up the LU factorisation, because
(LP) is not lower triangular for general lower triangular matrix (L)
and permutation matrix (P). However, we can save the situation, by
trying to swap all the permutation matrices to the right of all of the
(L) matrices. This does change the (L) matrices, because matrix-matrix
multiplication is not commutative. However, we shall see that it does
preserve the lower triangular matrix structure.
To see how this is done, we focus on how things look after two stages
of Gaussian elimination. We have
\[
A_2 = L_2P_2L_1P_1A = L_2\underbrace{P_2L_1P_2}_{=L_1^{(2)}}P_2P_1A
= L_2L_1^{(2)}P_2P_1A,
\]
having used (P_2^{-1}=P_2). Left multiplication with (P_2) exchanges
row 2 with some other row (j) with (j>2). Hence, right multiplication
with (P_2) does the same thing but with columns instead of rows.
Therefore, (L_1P_2) is the same as (L_1) but with column 2 exchanged
with column (j). Column 2 is just (e_2) and column (j) is just (e_j),
so now column 2 has the 1 in row (j) and column (j) has the 1 in
row 2. Then, (P_2L_1P_2) exchanges row 2 of (L_1P_2) with row (j) of
(L_1P_2). This just exchanges (l_{21}) with (l_{j1}), and swaps the
1s in columns 2 and (j) back to the diagonal. In summary, (P_2L_1P_2)
is the same as (L_1) but with (l_{21}) exchanged with (l_{j1}).
Moving on to the next stage, we have
\[
A_3 = L_3P_3L_2L_1^{(2)}P_2P_1A = L_3\underbrace{P_3L_2P_3}_{=L_2^{(3)}}
\underbrace{P_3L_1^{(2)}P_3}_{=L_1^{(3)}}P_3P_2P_1A.
\]
By similar arguments we see that (L_2^{(3)}) is the same as (L_2) but
with (l_{32}) exchanged with (l_{j2}) for some (different) (j), and
(L_1^{(3)}) is the same as (L_1^{(2)}) with (l_{31}) exchanged with
(l_{j1}). After iterating this argument, we can obtain
\[
\underbrace{L_{m-1}^{(m-1)}\ldots L_2^{(m-1)}L_1^{(m-1)}}_{L^{-1}}
\underbrace{P_{m-1}\ldots P_2P_1}_{P}A = U,
\]
where we just need to keep track of the permutations in the (L)
matrices as we go through the Gaussian elimination stages. These (L)
matrices have the same structure as the basic LU factorisation, and hence
we obtain
\[ L^{-1}PA = U \implies PA = LU. \]
This is equivalent to permuting the rows of (A) using (P) and then
finding the LU factorisation using the basic algorithm (except we
can’t implement it like that because we only decide how to build (P)
during the Gaussian elimination process).
The LU factorisation with pivoting can be expressed in the following
pseudo-code.
- (U \gets A)
- (L \gets I)
- (P \gets I)
- FOR (k=1) TO (m-1)
  - Choose (i\geq k) to maximise (|u_{ik}|)
  - (u_{k,k:m} \leftrightarrow u_{i,k:m}) (row swaps)
  - (l_{k,1:k-1} \leftrightarrow l_{i,1:k-1}) (row swaps)
  - (p_{k,1:m} \leftrightarrow p_{i,1:m})
  - FOR (j=k+1) TO (m)
    - (l_{jk} \gets u_{jk}/u_{kk})
    - (u_{j,k:m} \gets u_{j,k:m} - l_{jk}u_{k,k:m})
  - END FOR
- END FOR
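The pivoted pseudo-code above can be sketched in NumPy as follows. This is an illustrative, non-in-place version that stores (P) as a dense matrix for clarity; the function name `lup` and the test matrix are assumptions made for the example, and it is not a substitute for the in-place, index-vector implementation asked for in the exercises.

```python
import numpy as np

def lup(A):
    """Return (P, L, U) with P A = L U, using partial pivoting."""
    m = A.shape[0]
    U = np.array(A, dtype=float)
    L = np.eye(m)
    P = np.eye(m)
    for k in range(m - 1):
        # choose i >= k maximising |u_ik|
        i = k + np.argmax(np.abs(U[k:, k]))
        if i != k:
            U[[k, i], k:] = U[[i, k], k:]   # swap rows of U (columns k onwards)
            L[[k, i], :k] = L[[i, k], :k]   # swap the multipliers computed so far
            P[[k, i], :] = P[[i, k], :]     # record the swap in P
        for j in range(k + 1, m):
            L[j, k] = U[j, k] / U[k, k]
            U[j, k:] -= L[j, k] * U[k, k:]
    return P, L, U

# the (1,1) entry is zero, so elimination without pivoting would fail
A = np.array([[0.0, 1.0, 2.0],
              [1.0, 1.0, 1.0],
              [2.0, 1.0, 1.0]])
P, L, U = lup(A)
```

Because the pivot always has the largest magnitude in its column, every multiplier stored in (L) has magnitude at most 1.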
To solve a system (Ax=b) given a pivoted LU factorisation (PA=LU),
we left multiply the equation by (P) and use the factorisation to get
(LUx=Pb). The procedure is then as before, but (b) must be permuted to
(Pb) before doing the forward and back substitutions.
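The three steps (permute, forward substitute, back substitute) can be sketched as below. The factors are hard-coded for a hypothetical (2\times 2) example in which pivoting swaps the two rows, and (P) is stored as an index vector `p`; none of these values come from the text.

```python
import numpy as np

# Hypothetical example: P A = L U for A = [[1, 2], [2, 2]],
# where partial pivoting swaps the two rows.
A = np.array([[1.0, 2.0], [2.0, 2.0]])
p = np.array([1, 0])                      # P stored as an index vector
L = np.array([[1.0, 0.0], [0.5, 1.0]])
U = np.array([[2.0, 2.0], [0.0, 1.0]])
b = np.array([5.0, 6.0])

m = len(b)
Pb = b[p]                                 # permute the right-hand side first

y = np.zeros(m)                           # forward substitution: L y = P b
for i in range(m):
    y[i] = (Pb[i] - L[i, :i] @ y[:i]) / L[i, i]

x = np.zeros(m)                           # back substitution: U x = y
for i in reversed(range(m)):
    x[i] = (y[i] - U[i, i + 1:] @ x[i + 1:]) / U[i, i]
```

Here the result is (x=(1,2)), which can be confirmed by multiplying it by the original, unpermuted (A).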
We call this strategy partial pivoting. In contrast, complete
pivoting additionally employs permutations (Q_k) on the right that
swap columns of (A_k) as well as the rows swapped by the permutations
(P_k). By similar arguments, one can obtain the LU factorisation with
complete pivoting, (PAQ=LU).
Exercise 4.7
((\ddagger)) The function cla_utils.exercises7.perm() has
been left unimplemented. It should take an (m\times m) permutation
matrix (P), stored as an (integer-valued) array of indices
(p\in\mathbb{N}^m) so that ((Px)_i = x_{p_i}), (i=1,2,\ldots, m),
and replace it with the matrix (P_{i,j}P) (also stored as an array
of indices) where (P_{i,j}) is the permutation matrix that
exchanges the entries (i) and (j). The test script
test_exercises7.py in the test directory will test this
function.
Exercise 4.8
((\ddagger)) The function cla_utils.exercises7.LUP_inplace()
has been left unimplemented. It should extend the in-place
algorithm for LU factorisation (with the outer-product formulation,
if you managed it) to the LUP factorisation. As well as computing L
and U “in place” in the array where the input A is stored, it will
compute a permutation matrix, which can and should be constructed
using cla_utils.exercises7.perm(). The test script
test_exercises7.py in the test directory will test this
function.
Exercise 4.9
((\ddagger)) The function cla_utils.exercises7.solve_LUP()
has been left unimplemented. It should use the LUP code that you
have written to solve the equation (Ax=b) for (x) given inputs (A)
and (b). The test script test_exercises7.py in the test
directory will test this function.
Exercise 4.10
((\ddagger)) Show how to compute the determinant of (A) from the
LUP factorisation in (\mathcal{O}(m)) time (having already
constructed the LUP factorisation which costs
(\mathcal{O}(m^3))). Complete the function
cla_utils.exercises7.det_LUP() to implement this
computation. The test script test_exercises7.py in the test
directory will test this function.
4.3. Stability of LU factorisation
To characterise the stability of LU factorisation, we quote the following
result.
Theorem 4.11
Let (\tilde{L}) and (\tilde{U}) be the result of the Gaussian
elimination algorithm implemented in a floating point number system
satisfying axioms I and II. If no zero pivots are encountered, then
\[ \tilde{L}\tilde{U} = A + \delta A \]
where
\[ \frac{\|\delta A\|}{\|L\|\|U\|} = \mathcal{O}(\varepsilon), \]
for some perturbation (\delta A).
The algorithm is backward stable if (\|L\|\|U\|=\mathcal{O}(\|A\|)),
but there will be problems if (\|L\|\|U\|\gg \|A\|). For a proof of this
result, see the textbook by Golub and van Loan.
A similar result exists for pivoted LU. The main extra issue is that
small changes could potentially lead to a different pivoting matrix
(\tilde{P}) which is then (O(1)) different from (P). This is characterised
in the following result (which we also do not prove).
Theorem 4.12
Let (\tilde{P}), (\tilde{L}) and (\tilde{U}) be the result of the
partial pivoted Gaussian elimination algorithm implemented in a
floating point number system satisfying axioms I and II. If no zero
pivots are encountered, then
\[ \tilde{L}\tilde{U} = \tilde{P}A + \delta A \]
where
\[ \frac{\|\delta A\|}{\|A\|} = \mathcal{O}(\rho\varepsilon), \]
for some perturbation (\delta A), and where (\rho) is the growth
factor,
\[ \rho = \frac{\max_{ij}|u_{ij}|}{\max_{ij}|a_{ij}|}. \]
Thus, partial pivoting (and complete pivoting turns out not to help
much extra) can keep the entries in (L) under control, but there can
still be pathological cases where entries in (U) can get large,
leading to large (\rho) and unstable computations.
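A standard pathological example, which is not taken from the text above, is the matrix with 1 on the diagonal, -1 below the diagonal and 1 in the last column: with partial pivoting the last column of (U) doubles at every stage, so that (\rho=2^{m-1}). The sketch below computes the growth factor for this matrix; the helper name `growth_factor` is an assumption made for the example.

```python
import numpy as np

def growth_factor(A):
    """rho = max|u_ij| / max|a_ij| for elimination with partial pivoting."""
    m = A.shape[0]
    U = np.array(A, dtype=float)
    for k in range(m - 1):
        i = k + np.argmax(np.abs(U[k:, k]))       # partial pivoting
        U[[k, i], k:] = U[[i, k], k:]             # no-op when i == k
        for j in range(k + 1, m):
            U[j, k:] -= (U[j, k] / U[k, k]) * U[k, k:]
    return np.abs(U).max() / np.abs(A).max()

m = 10
# 1 on the diagonal, -1 below it, 1 in the last column
A = np.eye(m) - np.tril(np.ones((m, m)), -1)
A[:, -1] = 1.0
rho = growth_factor(A)
```

All the multipliers have magnitude 1 here, so (L) stays well behaved; the growth happens entirely in (U), as described above.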
4.4. Taking advantage of matrix structure
The cost of the standard Gaussian elimination algorithm to form (L)
and (U) is (\mathcal{O}(m^3)), which grows rather quickly as (m)
increases. If there is structure in the matrix, then we can often
exploit this to reduce the cost. Understanding when and how to exploit
structure is a central theme in computational linear algebra.
Here we will discuss some examples of structure to be exploited.
When (A) is a lower or upper triangular matrix then we can use
forwards or back substitution, with (\mathcal{O}(m^2)) operation count
as previously discussed.
When (A) is a diagonal matrix, i.e. (A_{ij}=0) for (i\ne j), it only
has (m) nonzero entries, that can be stored as a vector,
((A_{11},A_{22},\ldots,A_{mm})). In this case, (Ax=b) can be solved in
(m) operations, just by setting (x_i=b_i/A_{ii}), for
(i=1,2,\ldots,m).
Similarly, if (A \in \mathcal{C}^{dm\times dm}) is block diagonal,
i.e.
\[
A = \begin{pmatrix}
B_{1} & 0 & \ldots & 0 \\
0 & B_{2} & \ldots & 0 \\
\vdots & \vdots & \ddots & 0 \\
0 & 0 & \ldots & B_{m}
\end{pmatrix},
\]
where (B_{i}\in\mathcal{C}^{d\times d}) for (i=1,2,\ldots,m), then the inverse
of (A) is
\[
A^{-1} = \begin{pmatrix}
B_{1}^{-1} & 0 & \ldots & 0 \\
0 & B_{2}^{-1} & \ldots & 0 \\
\vdots & \vdots & \ddots & 0 \\
0 & 0 & \ldots & B_{m}^{-1}
\end{pmatrix}.
\]
A generalisation of a diagonal matrix is a banded matrix, where
(A_{ij}=0) for (i>j+p) and for (i<j-q). We call (p) the lower
bandwidth of (A); (q) is the upper bandwidth. When the matrix is
banded, there are already zeros below the diagonal of (A), so we know
that the corresponding entries in the (L_k) matrices will be zero.
Further, because there are zeros above the diagonal of (A), these do
not need to be updated when applying the row operations to those
zeros.
Exercise 4.13
Construct the (100\times 100) matrix (A) as follows: take (A=3I),
then set (A_{1,i}=1), for (i=1,\ldots,100). Then set (A_{i,1}=i) for
(i=1,\ldots,100). Using your own LU factorisation, compute the LU
factorisation of (A). What
do you observe about the number of non-zero entries in (L) and (U)?
Explain this using what you have just learned about banded
matrices. Can the situation be improved by pivoting? (Just think about
it; you don’t need to implement it.)
The Gaussian elimination algorithm (without pivoting) for a banded
matrix is given as pseudo-code below.
- \(U \gets A\)
- \(L \gets I\)
- FOR \(k=1\) TO \(m-1\)
  - FOR \(j=k+1\) TO \(\min(k+p,m)\)
    - \(l_{jk} \gets u_{jk}/u_{kk}\)
    - \(n \gets \min(k+q, m)\)
    - \(u_{j,k:n} \gets u_{j,k:n}- l_{jk}u_{k,k:n}\)
  - END FOR
- END FOR
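The banded elimination pseudo-code can be sketched in NumPy as follows. This is an illustrative translation under stated assumptions: indices are 0-based, the matrix is real, no pivoting is performed, and the name `banded_lu` is hypothetical.

```python
import numpy as np

def banded_lu(A, p, q):
    """LU factorisation without pivoting for a matrix with lower
    bandwidth p and upper bandwidth q (0-based translation)."""
    m = A.shape[0]
    U = A.astype(float)          # astype returns a copy, A is untouched
    L = np.eye(m)
    for k in range(m - 1):
        n = min(k + q + 1, m)    # one past the last column inside the band
        for j in range(k + 1, min(k + p + 1, m)):
            L[j, k] = U[j, k] / U[k, k]
            U[j, k:n] -= L[j, k] * U[k, k:n]
    return L, U

# Tridiagonal test matrix (p = q = 1), diagonally dominant so that
# elimination without pivoting is safe.
m = 6
A = 4 * np.eye(m) - np.eye(m, k=1) - np.eye(m, k=-1)
L, U = banded_lu(A, p=1, q=1)
```

Only the entries inside the band are ever read or written, which is where the reduced operation count comes from.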
The operation count for this banded matrix algorithm is
\(\mathcal{O}(mpq)\), which is linear in \(m\) instead of cubic!
Further, the resulting matrix \(L\) has lower bandwidth \(p\)
and \(U\) has upper bandwidth \(q\). This means that we can also
exploit this structure in the forward and back substitution
algorithms. For example, the forward substitution algorithm
is given as pseudo-code below.
- \(x_1 \gets b_1/L_{11}\)
- FOR \(k=2\) TO \(m\)
  - \(j \gets \max(1, k-p)\)
  - \(x_k \gets \frac{b_k -L_{k,j:k-1}x_{j:k-1}}{L_{kk}}\)
- END FOR
This has an operation count \(\mathcal{O}(mp)\). The story is
very similar for the back substitution.
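As a rough illustration of banded forward substitution, here is a 0-based NumPy sketch. The function name `banded_forward_substitution` and the test matrix are assumptions made for this example only.

```python
import numpy as np

def banded_forward_substitution(L, b, p):
    """Solve L x = b for lower triangular L with lower bandwidth p."""
    m = L.shape[0]
    x = np.zeros(m, dtype=np.result_type(L, b, np.float64))
    for k in range(m):
        j = max(0, k - p)        # first column of row k inside the band
        # For k = 0 the slices are empty and the dot product is 0.
        x[k] = (b[k] - L[k, j:k] @ x[j:k]) / L[k, k]
    return x

rng = np.random.default_rng(0)
m, p = 6, 2
# Keep only the diagonal and the p sub-diagonals of a random matrix.
M = rng.uniform(-1, 1, (m, m)) + 4 * np.eye(m)
L = np.tril(np.triu(M, -p))
b = rng.standard_normal(m)
x = banded_forward_substitution(L, b, p)
```

Each row touches at most \(p\) previously computed entries of \(x\), matching the \(\mathcal{O}(mp)\) count.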
Another example that we have already encountered is unitary matrices
\(Q\). Since \(Q^{-1}=Q^*\), solving the system \(Qx=b\) is just the cost of
applying \(Q^*\), with operation count \(\mathcal{O}(m^2)\).
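A small NumPy check of this observation is sketched below. The unitary matrix is built from a QR factorisation of random data purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
# Build a unitary Q from the QR factorisation of a random complex matrix.
M = rng.standard_normal((5, 5)) + 1j * rng.standard_normal((5, 5))
Q, _ = np.linalg.qr(M)
b = rng.standard_normal(5) + 1j * rng.standard_normal(5)

x = Q.conj().T @ b               # x = Q^* b: one matrix-vector product
residual = np.linalg.norm(Q @ x - b)
```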
An important matrix that we shall encounter later is an upper
Hessenberg matrix, that has a lower bandwidth of 1, but no particular
zero structure above the diagonal. In this case, the \(L\) matrix is
still banded (with lower bandwidth 1) but the \(U\) matrix is not. This
means that there are still savings due to the zeros in \(L\), but work
has to be done on the entire column of \(U\) above the diagonal, and so
solving an upper Hessenberg system has operation count
\(\mathcal{O}(m^2)\).
4.5. Cholesky factorisation
An example of extra structure which we shall discuss in a bit more
detail is the case of Hermitian positive definite matrices. Recall
that a Hermitian matrix satisfies \(A^*=A\), whilst positive definite
means that
\[x^*Ax > 0, \quad \forall \|x\|>0.\]
When \(A\) is Hermitian positive definite, it is possible to find an
upper triangular matrix \(R\) such that \(A=R^*R\), which is called the
Cholesky factorisation. To show that it is possible to compute
the Cholesky factorisation, we start by assuming that \(A\) has
a 1 in the top-left hand corner, so that
\[\begin{split}A = \begin{pmatrix}
1 & w^* \\
w & K \\
\end{pmatrix}\end{split}\]
where \(w\) is an \((m-1)\)-vector containing the rest of the first column
of \(A\), and \(K\) is an \((m-1)\times(m-1)\) Hermitian positive
definite matrix. (Exercise: show that \(K\) is Hermitian positive
definite.)
After one stage of Gaussian elimination, we have
\[\begin{split}\underbrace{\begin{pmatrix}
1 & 0 \\
-w & I \\
\end{pmatrix}}_{L_1^{-1}}
\underbrace{
\begin{pmatrix}
1 & w^* \\
w & K \\
\end{pmatrix}}_{A}
=
\begin{pmatrix}
1 & w^* \\
0 & K - ww^* \\
\end{pmatrix}.\end{split}\]
Further,
\[\begin{split}\begin{pmatrix}
1 & w^* \\
0 & K - ww^* \\
\end{pmatrix}=
\underbrace{
\begin{pmatrix}
1 & 0 \\
0 & K - ww^* \\
\end{pmatrix}}_{A_1}
\underbrace{
\begin{pmatrix}
1 & w^* \\
0 & I \\
\end{pmatrix}}_{L_1^*},\end{split}\]
so that \(A = L_1A_1L_1^*\). If \(a_{11} \neq 1\), we at least
know that \(a_{11}= e_1^*Ae_1>0\), and the factorisation becomes
\[\begin{split}A =
\underbrace{\begin{pmatrix} \alpha & 0 \\
w/\alpha & I \\
\end{pmatrix}}_{R_1^*}
\underbrace{
\begin{pmatrix}
1 & 0 \\
0 & K - \frac{ww^*}{a_{11}} \\
\end{pmatrix}}_{A_1}
\underbrace{
\begin{pmatrix}
\alpha & w^*/\alpha \\
0 & I \\
\end{pmatrix}}_{R_1},\end{split}\]
where \(\alpha=\sqrt{a_{11}}\). We can check that \(A_1\) is positive
definite, since
\[x^*A_1x = x^*R_1^{-*}AR_1^{-1}x = (R_1^{-1}x)^*A(R_1^{-1}x) = y^*Ay > 0, \mbox{ where }
y = R_1^{-1}x.\]
Hence, \(K-{ww^*}/{a_{11}}\) is positive definite, since
\[\begin{split}r^*\left(K-\frac{ww^*}{a_{11}}\right)r = \begin{pmatrix} 0 \\ r \\ \end{pmatrix}^*
A_1 \begin{pmatrix} 0 \\ r \\ \end{pmatrix} > 0,\end{split}\]
and hence we can now perform the same procedure all over again to \(K -
{ww^*}/a_{11}\). By induction we can always continue until we have the
required Cholesky factorisation, which is unique (since there were no
choices to be made at any step).
We can then present the Cholesky factorisation as pseudo-code.
- \(R\gets A\)
- FOR \(k=1\) TO \(m\)
  - FOR \(j=k+1\) TO \(m\)
    - \(R_{j,j:m} \gets R_{j,j:m} - R_{k,j:m}\bar{R}_{kj}/R_{kk}\)
  - END FOR
  - \(R_{k,k:m} \gets R_{k,k:m}/\sqrt{R_{kk}}\)
- END FOR
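The Cholesky pseudo-code can be translated to NumPy along the following lines. This is a sketch rather than a reference implementation: it assumes the input is Hermitian positive definite (no checks are made), uses 0-based indices, and the name `cholesky_upper` is hypothetical.

```python
import numpy as np

def cholesky_upper(A):
    """Return upper triangular R with A = R^* R, assuming A is
    Hermitian positive definite. Only the upper triangle is read."""
    m = A.shape[0]
    R = np.triu(A).astype(complex)
    for k in range(m):
        for j in range(k + 1, m):
            # Update row j using the not-yet-normalised row k.
            R[j, j:] -= R[k, j:] * np.conj(R[k, j]) / R[k, k]
        R[k, k:] /= np.sqrt(R[k, k])
    return R

rng = np.random.default_rng(2)
M = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
A = M.conj().T @ M + 4 * np.eye(4)   # Hermitian positive definite
R = cholesky_upper(A)
```

Because the factorisation is unique, the result can be compared with `np.linalg.cholesky(A)`, which returns the lower triangular factor \(R^*\).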
The operation count of the Cholesky factorisation is dominated
by the operation inside the \(j\) loop, which has one division,
\(m-j+1\) multiplications, and \(m-j+1\) subtractions, giving
\(\sim 2(m-j)\) FLOPs. The total operation count is then
\[N_{\mbox{FLOPs}} = \sum_{k=1}^m\sum_{j=k+1}^m 2(m-j)
\sim \frac{1}{3}m^3.\]
In numerical analysis and linear algebra, lower–upper (LU) decomposition or factorization factors a matrix as the product of a lower triangular matrix and an upper triangular matrix (see matrix decomposition). The product sometimes includes a permutation matrix as well. LU decomposition can be viewed as the matrix form of Gaussian elimination. Computers usually solve square systems of linear equations using LU decomposition, and it is also a key step when inverting a matrix or computing the determinant of a matrix. The LU decomposition was introduced by the Polish mathematician Tadeusz Banachiewicz in 1938.[1] It’s also referred to as LR decomposition (factors into left and right triangular matrices).
Definitions
Let A be a square matrix. An LU factorization refers to the factorization of A, with proper row and/or column orderings or permutations, into two factors – a lower triangular matrix L and an upper triangular matrix U:
\[A = LU.\]
In the lower triangular matrix all elements above the diagonal are zero; in the upper triangular matrix, all the elements below the diagonal are zero. For example, for a 3 × 3 matrix A, its LU decomposition looks like this:
\[\begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix} = \begin{pmatrix} \ell_{11} & 0 & 0 \\ \ell_{21} & \ell_{22} & 0 \\ \ell_{31} & \ell_{32} & \ell_{33} \end{pmatrix} \begin{pmatrix} u_{11} & u_{12} & u_{13} \\ 0 & u_{22} & u_{23} \\ 0 & 0 & u_{33} \end{pmatrix}.\]
Without a proper ordering or permutations in the matrix, the factorization may fail to materialize. For example, it is easy to verify (by expanding the matrix multiplication) that \(a_{11} = \ell_{11}u_{11}\). If \(a_{11} = 0\), then at least one of \(\ell_{11}\) and \(u_{11}\) has to be zero, which implies that either L or U is singular. This is impossible if A is nonsingular (invertible). This is a procedural problem. It can be removed by simply reordering the rows of A so that the first element of the permuted matrix is nonzero. The same problem in subsequent factorization steps can be removed the same way; see the basic procedure below.
LU factorization with partial pivoting
It turns out that a proper permutation in rows (or columns) is sufficient for LU factorization. LU factorization with partial pivoting (LUP) refers often to LU factorization with row permutations only:
\[PA = LU,\]
where L and U are again lower and upper triangular matrices, and P is a permutation matrix, which, when left-multiplied to A, reorders the rows of A. It turns out that all square matrices can be factorized in this form,[2] and the factorization is numerically stable in practice.[3] This makes LUP decomposition a useful technique in practice.
LU factorization with full pivoting
An LU factorization with full pivoting involves both row and column permutations:
\[PAQ = LU,\]
where L, U and P are defined as before, and Q is a permutation matrix that reorders the columns of A.[4]
Lower-diagonal-upper (LDU) decomposition
A Lower-diagonal-upper (LDU) decomposition is a decomposition of the form
\[A = LDU,\]
where D is a diagonal matrix, and L and U are unitriangular matrices, meaning that all the entries on the diagonals of L and U are one.
Rectangular matrices
Above we required that A be a square matrix, but these decompositions can all be generalized to rectangular matrices as well.[5] In that case, L and D are square matrices both of which have the same number of rows as A, and U has exactly the same dimensions as A. Upper triangular should be interpreted as having only zero entries below the main diagonal, which starts at the upper left corner. Similarly, the more precise term for U is that it is the "row echelon form" of the matrix A.
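Example
We factor the following 2-by-2 matrix:
\[\begin{pmatrix} 4 & 3 \\ 6 & 3 \end{pmatrix} = \begin{pmatrix} \ell_{11} & 0 \\ \ell_{21} & \ell_{22} \end{pmatrix} \begin{pmatrix} u_{11} & u_{12} \\ 0 & u_{22} \end{pmatrix}.\]
One way to find the LU decomposition of this simple matrix would be to simply solve the linear equations by inspection. Expanding the matrix multiplication gives
\[\begin{aligned} \ell_{11}u_{11} + 0\cdot 0 &= 4, \\ \ell_{11}u_{12} + 0\cdot u_{22} &= 3, \\ \ell_{21}u_{11} + \ell_{22}\cdot 0 &= 6, \\ \ell_{21}u_{12} + \ell_{22}u_{22} &= 3. \end{aligned}\]
This system of equations is underdetermined. In this case any two non-zero elements of L and U matrices are parameters of the solution and can be set arbitrarily to any non-zero value. Therefore, to find the unique LU decomposition, it is necessary to put some restriction on L and U matrices. For example, we can conveniently require the lower triangular matrix L to be a unit triangular matrix (i.e. set all the entries of its main diagonal to ones). Then the system of equations has the following solution:
\[\ell_{11} = \ell_{22} = 1, \quad \ell_{21} = 1.5, \quad u_{11} = 4, \quad u_{12} = 3, \quad u_{22} = -1.5.\]
Substituting these values into the LU decomposition above yields
\[\begin{pmatrix} 4 & 3 \\ 6 & 3 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 1.5 & 1 \end{pmatrix} \begin{pmatrix} 4 & 3 \\ 0 & -1.5 \end{pmatrix}.\]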
Existence and uniqueness
Square matrices
Any square matrix \(A\) admits LUP and PLU factorizations.[2] If \(A\) is invertible, then it admits an LU (or LDU) factorization if and only if all its leading principal minors[6] are nonzero[7] (for example \(\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}\) does not admit an LU or LDU factorization). If \(A\) is a singular matrix of rank \(k\), then it admits an LU factorization if the first \(k\) leading principal minors are nonzero, although the converse is not true.[8]
If a square, invertible matrix has an LDU (factorization with all diagonal entries of L and U equal to 1), then the factorization is unique.[7] In that case, the LU factorization is also unique if we require that the diagonal of \(L\) (or \(U\)) consists of ones.
In general, any square matrix could have one of the following:
1. a unique LU factorization (as mentioned above);
2. infinitely many LU factorizations, if two or more of the first (n−1) columns are linearly dependent or any of the first (n−1) columns is 0;
3. no LU factorization, if the first (n−1) columns are non-zero and linearly independent and at least one leading principal minor is zero.
In Case 3, one can approximate an LU factorization by changing a diagonal entry \(a_{jj}\) to \(a_{jj}\pm\varepsilon\) to avoid a zero leading principal minor.[9]
Symmetric positive-definite matrices
If A is a symmetric (or Hermitian, if A is complex) positive-definite matrix, we can arrange matters so that U is the conjugate transpose of L. That is, we can write A as
\[A = LL^*.\]
This decomposition is called the Cholesky decomposition. The Cholesky decomposition always exists and is unique, provided the matrix is positive definite. Furthermore, computing the Cholesky decomposition is more efficient and numerically more stable than computing some other LU decompositions.
General matrices
For a (not necessarily invertible) matrix over any field, the exact necessary and sufficient conditions under which it has an LU factorization are known. The conditions are expressed in terms of the ranks of certain submatrices. The Gaussian elimination algorithm for obtaining LU decomposition has also been extended to this most general case.[10]
Algorithms
Closed formula
When an LDU factorization exists and is unique, there is a closed (explicit) formula for the elements of L, D, and U in terms of ratios of determinants of certain submatrices of the original matrix A.[11] In particular, \(D_1 = A_{1,1}\), and for \(i = 2, \ldots, n\), \(D_i\) is the ratio of the determinant of the \(i\)-th leading principal submatrix to that of the \((i-1)\)-th leading principal submatrix. Computation of the determinants is computationally expensive, so this explicit formula is not used in practice.
Using Gaussian elimination
The following algorithm is essentially a modified form of Gaussian elimination. Computing an LU decomposition using this algorithm requires \(\tfrac{2}{3}n^3\) floating-point operations, ignoring lower-order terms. Partial pivoting adds only a quadratic term; this is not the case for full pivoting.[12]
Procedure
Given an N × N matrix \(A = (a_{i,j})_{1\le i,j\le N}\), define \(A^{(0)}\) as the matrix \(A\) in which the necessary rows have been swapped to meet the desired conditions (such as partial pivoting) for the 1st column. The parenthetical superscript (e.g., \((0)\)) of the matrix \(A\) is the version of the matrix. The matrix \(A^{(n)}\) is the \(A\) matrix in which the elements below the main diagonal have already been eliminated to 0 through Gaussian elimination for the first \(n\) columns, and the necessary rows have been swapped to meet the desired conditions for the \((n+1)\)-th column.
We perform the operation \(\mathrm{row}_i = \mathrm{row}_i - \ell_{i,n}\cdot\mathrm{row}_n\) for each row \(i\) with elements (labelled as \(a_{i,n}^{(n-1)}\) where \(i = n+1, \dotsc, N\)) below the main diagonal in the n-th column of \(A^{(n-1)}\). For this operation,
\[\ell_{i,n} := \frac{a_{i,n}^{(n-1)}}{a_{n,n}^{(n-1)}}.\]
We perform these row operations to eliminate the elements \(a_{i,n}^{(n-1)}\) to zero. Once we have subtracted these rows, we may swap rows to provide the desired conditions for the \((n+1)\)-th column. We may swap rows here to perform partial pivoting, or because the element \(a_{n+1,n+1}\) on the main diagonal is zero (and therefore cannot be used to implement Gaussian elimination).
We define the final permutation matrix \(P\) as the identity matrix which has all the same rows swapped in the same order as the \(A\) matrix.
Once we have performed the row operations for the first \(N-1\) columns, we have obtained an upper triangular matrix \(A^{(N-1)}\) which is denoted by \(U\). Note, we can denote \(A^{(N-1)}\) as \(U\) because the N-th column of \(A^{(N-1)}\) has no conditions for which rows need to be swapped. We can also calculate the lower triangular matrix denoted as \(L\), such that \(PA = LU\), by directly inputting the values of \(\ell_{i,n}\) via the formula below.
\[L = \begin{pmatrix} 1 & 0 & \cdots & 0 \\ \ell_{2,1} & \ddots & \ddots & \vdots \\ \vdots & \ddots & 1 & 0 \\ \ell_{N,1} & \cdots & \ell_{N,N-1} & 1 \end{pmatrix}\]
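The procedure can be sketched in NumPy as below. This is an illustrative version under stated assumptions: real matrices, 0-based indices, and a hypothetical function name `lup_decompose`. When rows are swapped, the multipliers already stored in L are swapped with them so that \(PA = LU\) holds at the end. The test matrix is the one used in the MATLAB example of this article.

```python
import numpy as np

def lup_decompose(A):
    """Return P, L, U with P @ A = L @ U, using partial pivoting."""
    N = A.shape[0]
    U = A.astype(float)          # working copy that becomes U
    L = np.eye(N)
    P = np.eye(N)
    for n in range(N - 1):
        # Partial pivoting: bring the largest entry of column n to the diagonal.
        pivot = n + np.argmax(np.abs(U[n:, n]))
        if U[pivot, n] == 0.0:
            continue             # nothing to eliminate in this column
        if pivot != n:
            U[[n, pivot], :] = U[[pivot, n], :]
            P[[n, pivot], :] = P[[pivot, n], :]
            L[[n, pivot], :n] = L[[pivot, n], :n]   # multipliers found so far
        for i in range(n + 1, N):
            L[i, n] = U[i, n] / U[n, n]
            U[i, n:] -= L[i, n] * U[n, n:]
    return P, L, U

A = np.array([[4.0, 3.0, 3.0],
              [6.0, 3.0, 3.0],
              [3.0, 4.0, 3.0]])
P, L, U = lup_decompose(A)
```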
Example
If we are given the matrix
we will choose to implement partial pivoting and thus swap the first and second row so that our matrix and the first iteration of our
matrix respectively become
Once we have swapped the rows, we can eliminate the elements below the main diagonal on the first column by performing
such that,
Once these rows have been subtracted, we have derived from the matrix
Because we are implementing partial pivoting, we swap the second and third rows of our derived matrix and the current version of our matrix respectively to obtain
Now, we eliminate the elements below the main diagonal on the second column by performing such that
. Because no non-zero elements exist below the main diagonal in our current iteration of
after this row subtraction, this row subtraction derives our final
matrix (denoted as
) and final
matrix:
Now we can obtain our final matrix:
Now these matrices have a relation such that .
Relations when no rows are swapped
If we did not swap rows at all during this process, we can perform the row operations simultaneously for each column by setting
where
is the N × N identity matrix with its n-th column replaced by the transposed vector
In other words, the lower triangular matrix
Performing all the row operations for the first columns using the
formula is equivalent to finding the decomposition
Denote so that
.
Now let’s compute the sequence of . We know that
has the following formula.
If there are two lower triangular matrices with 1s in the main diagonal, and neither have a non-zero item below the main diagonal in the same column as the other, then we can include all non-zero items at their same location in the product of the two matrices. For example:
Finally, multiply together and generate the fused matrix denoted as
(as previously mentioned). Using the matrix
, we obtain
It is clear that in order for this algorithm to work, one needs to have \(a_{n,n}^{(n-1)}\neq 0\) at each step (see the definition of \(\ell_{i,n}\)). If this assumption fails at some point, one needs to interchange n-th row with another row below it before continuing. This is why an LU decomposition in general looks like \(P^{-1}A = LU\).
LU Crout decomposition
Note that the decomposition obtained through this procedure is a Doolittle decomposition: the main diagonal of L is composed solely of 1s. If one would proceed by removing elements above the main diagonal by adding multiples of the columns (instead of removing elements below the diagonal by adding multiples of the rows), we would obtain a Crout decomposition, where the main diagonal of U is of 1s.
Another (equivalent) way of producing a Crout decomposition of a given matrix A is to obtain a Doolittle decomposition of the transpose of A. Indeed, if \(A^{\mathsf T} = L_0U_0\) is the LU-decomposition obtained through the algorithm presented in this section, then by taking \(L = U_0^{\mathsf T}\) and \(U = L_0^{\mathsf T}\), we have that \(A = LU\) is a Crout decomposition.
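The transpose trick can be checked numerically with a short NumPy sketch. The helper names `doolittle` and `crout` are hypothetical, and no pivoting is performed, so every pivot is assumed to be nonzero.

```python
import numpy as np

def doolittle(A):
    """LU factorisation without pivoting; L has a unit diagonal."""
    N = A.shape[0]
    L, U = np.eye(N), A.astype(float)
    for n in range(N - 1):
        for i in range(n + 1, N):
            L[i, n] = U[i, n] / U[n, n]
            U[i, n:] -= L[i, n] * U[n, n:]
    return L, U

def crout(A):
    """Crout factorisation via a Doolittle factorisation of A^T."""
    L0, U0 = doolittle(A.T)      # A^T = L0 U0
    return U0.T, L0.T            # A = U0^T L0^T, unit diagonal in the second factor

A = np.array([[4.0, 3.0, 3.0],
              [6.0, 3.0, 3.0],
              [3.0, 4.0, 3.0]])
L, U = crout(A)
```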
Through recursion
Cormen et al.[13] describe a recursive algorithm for LUP decomposition.
Given a matrix A, let P1 be a permutation matrix such that
,
where , if there is a nonzero entry in the first column of A; or take P1 as the identity matrix otherwise. Now let
, if
; or
otherwise. We have
Now we can recursively find an LUP decomposition . Let
. Therefore
which is an LUP decomposition of A.
Randomized algorithm
It is possible to find a low rank approximation to an LU decomposition using a randomized algorithm. Given an input matrix \(A\) and a desired low rank \(k\), the randomized LU returns permutation matrices \(P, Q\) and lower/upper trapezoidal matrices \(L, U\) of size \(m\times k\) and \(k\times n\) respectively, such that with high probability \(\|PAQ - LU\|_2 \le C\sigma_{k+1}\), where \(C\) is a constant that depends on the parameters of the algorithm and \(\sigma_{k+1}\) is the \((k+1)\)-th singular value of the input matrix \(A\).[14]
Theoretical complexity
If two matrices of order n can be multiplied in time M(n), where \(M(n) \ge n^a\) for some a > 2, then an LU decomposition can be computed in time O(M(n)).[15] This means, for example, that an \(O(n^{2.376})\) algorithm exists based on the Coppersmith–Winograd algorithm.
Sparse-matrix decomposition
Special algorithms have been developed for factorizing large sparse matrices. These algorithms attempt to find sparse factors L and U. Ideally, the cost of computation is determined by the number of nonzero entries, rather than by the size of the matrix.
These algorithms use the freedom to exchange rows and columns to minimize fill-in (entries that change from an initial zero to a non-zero value during the execution of an algorithm).
General treatment of orderings that minimize fill-in can be addressed using graph theory.
Applications
Solving linear equations
Given a system of linear equations in matrix form
\[Ax = b,\]
we want to solve the equation for x, given A and b. Suppose we have already obtained the LUP decomposition of A such that \(PA = LU\), so \(LUx = Pb\).
In this case the solution is done in two logical steps:
- First, we solve the equation \(Ly = Pb\) for y.
- Second, we solve the equation \(Ux = y\) for x.
In both cases we are dealing with triangular matrices (L and U), which can be solved directly by forward and backward substitution without using the Gaussian elimination process (however we do need this process or equivalent to compute the LU decomposition itself).
The above procedure can be repeatedly applied to solve the equation multiple times for different b. In this case it is faster (and more convenient) to do an LU decomposition of the matrix A once and then solve the triangular matrices for the different b, rather than using Gaussian elimination each time. The matrices L and U could be thought to have "encoded" the Gaussian elimination process.
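The factor-once, solve-many pattern can be sketched in NumPy as follows. The names `lup_decompose` and `lup_solve` are illustrative, the permutation is stored as an index vector `perm` (so that `A[perm]` equals \(PA\)), and the matrix is assumed to be real and nonsingular.

```python
import numpy as np

def lup_decompose(A):
    """Return (perm, L, U) with A[perm] = L @ U, using partial pivoting."""
    N = A.shape[0]
    U, L, perm = A.astype(float), np.eye(N), np.arange(N)
    for n in range(N - 1):
        pivot = n + np.argmax(np.abs(U[n:, n]))
        if pivot != n:
            U[[n, pivot]] = U[[pivot, n]]
            L[[n, pivot], :n] = L[[pivot, n], :n]
            perm[[n, pivot]] = perm[[pivot, n]]
        L[n + 1:, n] = U[n + 1:, n] / U[n, n]
        U[n + 1:, n:] -= np.outer(L[n + 1:, n], U[n, n:])
    return perm, L, U

def lup_solve(perm, L, U, b):
    """Solve A x = b from a precomputed factorisation."""
    N = len(b)
    y = np.zeros(N)
    for i in range(N):                   # forward substitution: L y = P b
        y[i] = b[perm[i]] - L[i, :i] @ y[:i]
    x = np.zeros(N)
    for i in reversed(range(N)):         # back substitution: U x = y
        x[i] = (y[i] - U[i, i + 1:] @ x[i + 1:]) / U[i, i]
    return x

A = np.array([[4.0, 3.0, 3.0],
              [6.0, 3.0, 3.0],
              [3.0, 4.0, 3.0]])
perm, L, U = lup_decompose(A)            # factor once
b1 = np.array([1.0, 2.0, 3.0])
b2 = np.array([4.0, 5.0, 6.0])
x1 = lup_solve(perm, L, U, b1)           # reuse for each right-hand side
x2 = lup_solve(perm, L, U, b2)
```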
The cost of solving a system of linear equations is approximately \(\tfrac{2}{3}n^3\) floating-point operations if the matrix \(A\) has size \(n\). This makes it twice as fast as algorithms based on QR decomposition, which costs about \(\tfrac{4}{3}n^3\) floating-point operations when Householder reflections are used. For this reason, LU decomposition is usually preferred.[16]
Inverting a matrix
When solving systems of equations, b is usually treated as a vector with a length equal to the height of matrix A. In matrix inversion however, instead of vector b, we have matrix B, where B is an n-by-p matrix, so that we are trying to find a matrix X (also a n-by-p matrix):
\[AX = LUX = B.\]
We can use the same algorithm presented earlier to solve for each column of matrix X. Now suppose that B is the identity matrix of size n. It would follow that the result X must be the inverse of A.[17]
Computing the determinant
Given the LUP decomposition \(A = P^{-1}LU\) of a square matrix A, the determinant of A can be computed straightforwardly as
\[\det(A) = \det(P^{-1})\det(L)\det(U) = (-1)^S\left(\prod_{i=1}^n l_{ii}\right)\left(\prod_{i=1}^n u_{ii}\right).\]
The second equation follows from the fact that the determinant of a triangular matrix is simply the product of its diagonal entries, and that the determinant of a permutation matrix is equal to \((-1)^S\) where S is the number of row exchanges in the decomposition.
In the case of LU decomposition with full pivoting, \(\det(A)\) also equals the right-hand side of the above equation, if we let S be the total number of row and column exchanges.
The same method readily applies to LU decomposition by setting P equal to the identity matrix.
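The determinant formula can be illustrated with a compact NumPy sketch. The function name `lup_determinant` is hypothetical. Since L has a unit diagonal in this variant, only the diagonal of U and the number of row exchanges are needed.

```python
import numpy as np

def lup_determinant(A):
    """det(A) = (-1)^S * prod(diag(U)), with S the number of row swaps."""
    N = A.shape[0]
    U = A.astype(float)
    swaps = 0
    for n in range(N - 1):
        pivot = n + np.argmax(np.abs(U[n:, n]))
        if U[pivot, n] == 0.0:
            return 0.0                   # a zero pivot column: A is singular
        if pivot != n:
            U[[n, pivot]] = U[[pivot, n]]
            swaps += 1
        U[n + 1:, n:] -= np.outer(U[n + 1:, n] / U[n, n], U[n, n:])
    return (-1) ** swaps * np.prod(np.diag(U))

A = np.array([[4.0, 3.0, 3.0],
              [6.0, 3.0, 3.0],
              [3.0, 4.0, 3.0]])
det_A = lup_determinant(A)
```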
Code examples
C code example
#include <math.h>

/* INPUT: A - array of pointers to rows of a square matrix having dimension N
 *        Tol - small tolerance number to detect failure when the matrix is near degenerate
 * OUTPUT: Matrix A is changed, it contains a copy of both matrices L-E and U as A=(L-E)+U such that P*A=L*U.
 *        The permutation matrix is not stored as a matrix, but in an integer vector P of size N+1
 *        containing column indexes where the permutation matrix has "1". The last element P[N]=S+N,
 *        where S is the number of row exchanges needed for determinant computation, det(P)=(-1)^S
 */
int LUPDecompose(double **A, int N, double Tol, int *P) {

    int i, j, k, imax;
    double maxA, *ptr, absA;

    for (i = 0; i <= N; i++)
        P[i] = i; //Unit permutation matrix, P[N] initialized with N

    for (i = 0; i < N; i++) {
        maxA = 0.0;
        imax = i;

        for (k = i; k < N; k++)
            if ((absA = fabs(A[k][i])) > maxA) {
                maxA = absA;
                imax = k;
            }

        if (maxA < Tol) return 0; //failure, matrix is degenerate

        if (imax != i) {
            //pivoting P
            j = P[i];
            P[i] = P[imax];
            P[imax] = j;

            //pivoting rows of A
            ptr = A[i];
            A[i] = A[imax];
            A[imax] = ptr;

            //counting pivots starting from N (for determinant)
            P[N]++;
        }

        for (j = i + 1; j < N; j++) {
            A[j][i] /= A[i][i];

            for (k = i + 1; k < N; k++)
                A[j][k] -= A[j][i] * A[i][k];
        }
    }

    return 1;  //decomposition done
}

/* INPUT: A,P filled in LUPDecompose; b - rhs vector; N - dimension
 * OUTPUT: x - solution vector of A*x=b
 */
void LUPSolve(double **A, int *P, double *b, int N, double *x) {

    for (int i = 0; i < N; i++) {
        x[i] = b[P[i]];

        for (int k = 0; k < i; k++)
            x[i] -= A[i][k] * x[k];
    }

    for (int i = N - 1; i >= 0; i--) {
        for (int k = i + 1; k < N; k++)
            x[i] -= A[i][k] * x[k];

        x[i] /= A[i][i];
    }
}

/* INPUT: A,P filled in LUPDecompose; N - dimension
 * OUTPUT: IA is the inverse of the initial matrix
 */
void LUPInvert(double **A, int *P, int N, double **IA) {

    for (int j = 0; j < N; j++) {
        for (int i = 0; i < N; i++) {
            IA[i][j] = P[i] == j ? 1.0 : 0.0;

            for (int k = 0; k < i; k++)
                IA[i][j] -= A[i][k] * IA[k][j];
        }

        for (int i = N - 1; i >= 0; i--) {
            for (int k = i + 1; k < N; k++)
                IA[i][j] -= A[i][k] * IA[k][j];

            IA[i][j] /= A[i][i];
        }
    }
}

/* INPUT: A,P filled in LUPDecompose; N - dimension.
 * OUTPUT: Function returns the determinant of the initial matrix
 */
double LUPDeterminant(double **A, int *P, int N) {

    double det = A[0][0];

    for (int i = 1; i < N; i++)
        det *= A[i][i];

    return (P[N] - N) % 2 == 0 ? det : -det;
}
C# code example
public class SystemOfLinearEquations
{
    public double[] SolveUsingLU(double[,] matrix, double[] rightPart, int n)
    {
        // decomposition of matrix
        double[,] lu = new double[n, n];
        double sum = 0;
        for (int i = 0; i < n; i++)
        {
            for (int j = i; j < n; j++)
            {
                sum = 0;
                for (int k = 0; k < i; k++)
                    sum += lu[i, k] * lu[k, j];
                lu[i, j] = matrix[i, j] - sum;
            }
            for (int j = i + 1; j < n; j++)
            {
                sum = 0;
                for (int k = 0; k < i; k++)
                    sum += lu[j, k] * lu[k, i];
                lu[j, i] = (1 / lu[i, i]) * (matrix[j, i] - sum);
            }
        }

        // lu = L+U-I
        // find solution of Ly = b
        double[] y = new double[n];
        for (int i = 0; i < n; i++)
        {
            sum = 0;
            for (int k = 0; k < i; k++)
                sum += lu[i, k] * y[k];
            y[i] = rightPart[i] - sum;
        }

        // find solution of Ux = y
        double[] x = new double[n];
        for (int i = n - 1; i >= 0; i--)
        {
            sum = 0;
            for (int k = i + 1; k < n; k++)
                sum += lu[i, k] * x[k];
            x[i] = (1 / lu[i, i]) * (y[i] - sum);
        }
        return x;
    }
}
MATLAB code example
function LU = LUDecompDoolittle(A)
    n = length(A);
    LU = A;
    % decomposition of matrix, Doolittle's Method
    for i = 1:1:n
        for j = 1:(i - 1)
            LU(i,j) = (LU(i,j) - LU(i,1:(j - 1))*LU(1:(j - 1),j)) / LU(j,j);
        end
        j = i:n;
        LU(i,j) = LU(i,j) - LU(i,1:(i - 1))*LU(1:(i - 1),j);
    end
    %LU = L+U-I
end

function x = SolveLinearSystem(LU, B)
    n = length(LU);
    y = zeros(size(B));
    % find solution of Ly = B
    for i = 1:n
        y(i,:) = B(i,:) - LU(i,1:i)*y(1:i,:);
    end
    % find solution of Ux = y
    x = zeros(size(B));
    for i = n:(-1):1
        x(i,:) = (y(i,:) - LU(i,(i + 1):n)*x((i + 1):n,:))/LU(i, i);
    end
end

A = [ 4 3 3; 6 3 3; 3 4 3 ]
LU = LUDecompDoolittle(A)
B = [ 1 2 3; 4 5 6; 7 8 9; 10 11 12 ]'
x = SolveLinearSystem(LU, B)
A * x
See also
- Block LU decomposition
- Bruhat decomposition
- Cholesky decomposition
- Crout matrix decomposition
- Incomplete LU factorization
- LU Reduction
- Matrix decomposition
- QR decomposition
Notes
1. Schwarzenberg-Czerny, A. (1995). "On matrix factorization and efficient least squares solution". Astronomy and Astrophysics Supplement Series. 110: 405. Bibcode:1995A&AS..110..405S.
2. Okunev & Johnson (1997), Corollary 3.
3. Trefethen & Bau (1997), p. 166.
4. Trefethen & Bau (1997), p. 161.
5. Lay, David C. (2016). Linear algebra and its applications. Steven R. Lay, Judith McDonald (Fifth ed.). Harlow. p. 142. ISBN 1-292-09223-8. OCLC 920463015.
6. Rigotti (2001), Leading Principle Minor.
7. Horn & Johnson (1985), Corollary 3.5.5.
8. Horn & Johnson (1985), Theorem 3.5.2.
9. Nhiayi, Ly; Phan-Yamada, Tuyetdong (2021). "Examining Possible LU Decomposition". North American GeoGebra Journal. 9 (1).
10. Okunev & Johnson (1997).
11. Householder (1975).
12. Golub & Van Loan (1996), p. 112, 119.
13. Cormen, Thomas H.; Leiserson, Charles E.; Rivest, Ronald L.; Stein, Clifford (2001). Introduction to Algorithms. MIT Press and McGraw-Hill. ISBN 978-0-262-03293-3.
14. Shabat, Gil; Shmueli, Yaniv; Aizenbud, Yariv; Averbuch, Amir (2016). "Randomized LU Decomposition". Applied and Computational Harmonic Analysis. 44 (2): 246–272. arXiv:1310.7202. doi:10.1016/j.acha.2016.04.006. S2CID 1900701.
15. Bunch & Hopcroft (1974).
16. Trefethen & Bau (1997), p. 152.
17. Golub & Van Loan (1996), p. 121.
References
- Bunch, James R.; Hopcroft, John (1974), "Triangular factorization and inversion by fast matrix multiplication", Mathematics of Computation, 28 (125): 231–236, doi:10.2307/2005828, ISSN 0025-5718, JSTOR 2005828.
- Cormen, Thomas H.; Leiserson, Charles E.; Rivest, Ronald L.; Stein, Clifford (2001), Introduction to Algorithms, MIT Press and McGraw-Hill, ISBN 978-0-262-03293-3.
- Golub, Gene H.; Van Loan, Charles F. (1996), Matrix Computations (3rd ed.), Baltimore: Johns Hopkins, ISBN 978-0-8018-5414-9.
- Horn, Roger A.; Johnson, Charles R. (1985), Matrix Analysis, Cambridge University Press, ISBN 978-0-521-38632-6. See Section 3.5.
- Householder, Alston S. (1975), The Theory of Matrices in Numerical Analysis, New York: Dover Publications, MR 0378371.
- Okunev, Pavel; Johnson, Charles R. (1997), Necessary And Sufficient Conditions For Existence of the LU Factorization of an Arbitrary Matrix, arXiv:math.NA/0506382.
- Poole, David (2006), Linear Algebra: A Modern Introduction (2nd ed.), Canada: Thomson Brooks/Cole, ISBN 978-0-534-99845-5.
- Press, WH; Teukolsky, SA; Vetterling, WT; Flannery, BP (2007), "Section 2.3", Numerical Recipes: The Art of Scientific Computing (3rd ed.), New York: Cambridge University Press, ISBN 978-0-521-88068-8.
- Trefethen, Lloyd N.; Bau, David (1997), Numerical linear algebra, Philadelphia: Society for Industrial and Applied Mathematics, ISBN 978-0-89871-361-9.
- Rigotti, Luca (2001), ECON 2001 — Introduction to Mathematical Methods, Lecture 8
External links
References
- LU decomposition on MathWorld.
- LU decomposition on Math-Linux.
- LU decomposition at Holistic Numerical Methods Institute
- LU matrix factorization. MATLAB reference.
Computer code
- LAPACK is a collection of FORTRAN subroutines for solving dense linear algebra problems
- ALGLIB includes a partial port of the LAPACK to C++, C#, Delphi, etc.
- C++ code, Prof. J. Loomis, University of Dayton
- C code, Mathematics Source Library
- Rust code
- LU in X10
Online resources
- WebApp descriptively solving systems of linear equations with LU Decomposition
- Matrix Calculator with steps, including LU decomposition
- LU Decomposition Tool, uni-bonn.de
- LU Decomposition by Ed Pegg, Jr., The Wolfram Demonstrations Project, 2007.