December 28, 2025
At this point we’ve implemented three solvers for linear systems.
The GaussElim() function is general purpose, but is slow because the row-reduction algorithm is \(O\left(n^3\right)\).
The LU solvers we built are an improvement over Gaussian elimination, if used correctly. The LUdecomp() function decomposes a coefficient matrix into the product of a lower- and an upper-triangular matrix, and LUsolve() then solves the corresponding \(LU\vec{x} = \vec{b}\) system.
Our tridiagonal solver implements LU decomposition in the special case where the coefficient matrix is tridiagonal, resulting in significant space savings.
Example: We’ll solve the matrix equations \(A\vec{x} = \vec{b_i}\) for a tridiagonal matrix \(A\) and several constraint vectors \(\vec{b_i}\).
\[A = \begin{bmatrix} 2 & -1 & 0 & 0 & 0\\ -1 & 2 & -1 & 0 & 0\\ 0 & -1 & 1 & 3 & 0\\ 0 & 0 & -4 & 3 & 1\\ 0 & 0 & 0 & -2 & 4\end{bmatrix}~~~~\vec{b_1} = \begin{bmatrix} 1\\ -8\\ 5\\ 13\\ 4\end{bmatrix},~~\vec{b_2} = \begin{bmatrix} -3\\ 7\\ 17\\ 11\\ -4\end{bmatrix},~~\vec{b_3} = \begin{bmatrix} -11\\ 13\\ 25\\ -35\\ 12\end{bmatrix},~~\vec{b_4} = \begin{bmatrix} -23\\ -17\\ 11\\ 42\\ 13\end{bmatrix}\]
import numpy as np
import time

tic = time.time()
# bands of the tridiagonal matrix A: c (sub-diagonal), d (diagonal), e (super-diagonal)
c = np.array([-1.0, -1, -4, -2])
d = np.array([2.0, 2, 1, 3, 4])
e = np.array([-1.0, -1, 3, 1])
# constraint vectors from the example
b1 = np.array([1.0, -8, 5, 13, 4])
b2 = np.array([-3.0, 7, 17, 11, -4])
b3 = np.array([-11.0, 13, 25, -35, 12])
b4 = np.array([-23.0, -17, 11, 42, 13])
lam, d, e = DoolittleLUdecomp3(c, d, e)
x1 = Doolittle3solver(lam, d, e, b1)
x2 = Doolittle3solver(lam, d, e, b2)
x3 = Doolittle3solver(lam, d, e, b3)
x4 = Doolittle3solver(lam, d, e, b4)
toc = time.time()
print("That took ", toc - tic, " seconds.")
That took 0.008213043212890625 seconds.
Example: Solve the linear system \(\left\{\begin{array}{lcl} -x_2 + x_3 & = & 0\\ -x_1 + 2x_2 - x_3 & = & 0\\ 2x_1 - x_2 & = & 1\end{array}\right.\)
Because of rounding errors, the order of the rows in a coefficient matrix can greatly affect the performance of a numerical solver and the accuracy of its results.
The system
\[\left\{\begin{array}{lcl} -x_2 + x_3 & = & 0\\ -x_1 + 2x_2 - x_3 & = & 0\\ 2x_1 - x_2 & = & 1\end{array}\right.\]
has the augmented coefficient matrix
\[\left[\begin{array}{ccc|c} 0 & -1 & 1 & 0\\ -1 & 2 & -1 & 0\\ 2 & -1 & 0 & 1\\\end{array}\right]\]
This augmented coefficient matrix has the rows in the “wrong order”.
Our methods all fail because there is a \(0\) in the pivot position.
Looking at the system, we can see that a row swap would be advantageous, but how do we trigger a row swap algorithmically? And why is a row swap beneficial here?
Sometimes it is imperative that we reorder the rows in a system during the elimination phase – this is called row pivoting.
In general, we’ll want to do this when the pivot element is \(0\) or when it is very small compared to the other elements in the pivot row.
Consider the example augmented coefficient matrix below, with \(\varepsilon\) representing a very small number. Performing the first two steps of elimination without pivoting gives
\[\begin{align*}\left[\begin{array}{ccc|c} \varepsilon & -1 & 1 & 0\\ -1 & 2 & -1 & 0\\ 2 & -1 & 0 & 1\end{array}\right] &\stackrel{R_2 \leftarrow R_2 - (-1/\varepsilon)R_1}{\to}\left[\begin{array}{ccc|c} \varepsilon & -1 & 1 & 0\\ 0 & 2 - (1/\varepsilon) & -1 + (1/\varepsilon) & 0\\ 2 & -1 & 0 & 1\end{array}\right]\\ &\stackrel{R_3 \leftarrow R_3 - (2/\varepsilon)R_1}{\to}\left[\begin{array}{ccc|c} \varepsilon & -1 & 1 & 0\\ 0 & 2 - (1/\varepsilon) & -1 + (1/\varepsilon) & 0\\ 0 & -1 + (2/\varepsilon) & -2/\varepsilon & 1\end{array}\right] \end{align*}\]
Notice, though, that since \(\varepsilon\) is very small, \(1/\varepsilon\) is very large, and we are in danger of entering roundoff-error territory: next to \(1/\varepsilon\) and \(2/\varepsilon\), the terms \(2\) and \(-1\) may be rounded away entirely. The resulting augmented coefficient matrix may be stored as
\[\left[\begin{array}{ccc|c} \varepsilon & -1 & 1 & 0\\ 0 & - 1/\varepsilon & 1/\varepsilon & 0\\ 0 & 2/\varepsilon & -2/\varepsilon & 1\end{array}\right]\]
In this case the left-hand side of \(R_3\) is exactly \(-2\) times that of \(R_2\) while the right-hand sides are not, so \(R_2\) and \(R_3\) contradict each other and we would report that no solution exists, even though the original system has one.
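We can watch this happen numerically. The sketch below is illustrative only: naive_gauss is a hypothetical no-pivoting eliminator (not one of our library routines), and eps is chosen small enough to trigger the roundoff described above; a pivoted solver like NumPy’s handles the same system without trouble.
import numpy as np

def naive_gauss(A, b):
    # Gaussian elimination with NO pivoting -- for demonstration only
    A, b = A.astype(float).copy(), b.astype(float).copy()
    n = len(b)
    for k in range(n - 1):
        for i in range(k + 1, n):
            lam = A[i, k]/A[k, k]              # trouble when A[k, k] is tiny
            A[i, k:] = A[i, k:] - lam*A[k, k:]
            b[i] = b[i] - lam*b[k]
    x = np.zeros(n)
    for k in range(n - 1, -1, -1):
        x[k] = (b[k] - np.dot(A[k, k+1:], x[k+1:]))/A[k, k]
    return x

eps = 1.0e-20                                   # a "very small" pivot
A = np.array([[eps, -1.0, 1.0], [-1.0, 2.0, -1.0], [2.0, -1.0, 0.0]])
b = np.array([0.0, 0.0, 1.0])
print(naive_gauss(A, b))       # roundoff wrecks the answer (inf/nan entries, with divide-by-zero warnings)
print(np.linalg.solve(A, b))   # a pivoted solver returns approximately [1. 1. 1.]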
We’ve identified some scenarios where row-pivoting is beneficial – when our pivot is \(0\) or when it is “very small”.
What constitutes “very small” is subjective, so we’ll need a way to formalize what we mean so that we can write it into our algorithm. We need a proxy for “safe pivots” that a computer can detect.
An \(n\times n\) matrix \(A\) is said to be diagonally dominant if, in each row, the absolute value of the diagonal element is greater than the sum of the absolute values of the other elements in that row.
\[\left|A_{ii}\right| > \sum_{\substack{j = 1\\ j\neq i}}^{n}{\left|A_{ij}\right|}\]
For example, the matrix \(\left[\begin{array}{ccc} -2 & 4 & -1\\ 1 & -1 & 3\\ 4 & -2 & 1\end{array}\right]\) is not diagonally dominant, but with two row pivots we obtain \(\left[\begin{array}{ccc} 4 & -2 & 1\\ -2 & 4 & -1\\ 1 & -1 & 3 \end{array}\right]\), which is diagonally dominant.
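To make the definition concrete, here is a small helper (a sketch; is_diagonally_dominant is a hypothetical name, not one of our library routines) that tests the condition row by row on the two matrices above.
import numpy as np

def is_diagonally_dominant(A):
    # True if |A[i, i]| > sum of |off-diagonal entries| in every row i
    A = np.abs(np.asarray(A, dtype=float))
    diag = np.diag(A)
    off_diag_sums = A.sum(axis=1) - diag
    return bool(np.all(diag > off_diag_sums))

print(is_diagonally_dominant([[-2, 4, -1], [1, -1, 3], [4, -2, 1]]))   # False
print(is_diagonally_dominant([[4, -2, 1], [-2, 4, -1], [1, -1, 3]]))   # True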
If \(A\) is diagonally dominant, then a numerical solution of \(A\vec{x} = \vec{b}\) does not benefit from row pivoting. Because of this, we’ll implement a strategy that reorders the rows of \(A\) to get as close to diagonal dominance as we can.
As usual when we are about to introduce a new algorithm, we’ll use an example to help illuminate the steps involved. We’ll use the example below.
Example: Use Gaussian Elimination with Scaled Row pivoting to solve the system \(A\vec{x} = \vec{b}\) where \(A = \left[\begin{array}{ccc} 2 & -2 & 6\\ -2 & 4 & 3\\ -1 & 8 & 4\end{array}\right]\) and \(\vec{b} = \left[\begin{array}{c} 16\\ 0\\ -1\end{array}\right]\).
Consider an approach to solving the system \(A\vec{x} = \vec{b}\) using Gaussian Elimination with row-pivoting.
That is, during elimination we row-pivot \(A\) so that the pivot element is as large as possible relative to the other elements in its own row.
In order to do this, let’s define an array \(s\) as follows:
\[s_i = \max_{j}{\left|A_{ij}\right|}~~\text{for}~~i,j\in [n]\]
We call \(s_i\) the scale factor for \(R_i\), and it contains the absolute value of the largest element in the \(i^{th}\) row of \(A\).
We can write a simple routine to populate \(s\).
#populate scale-factor array
n = A.shape[0]
s = np.zeros(n)
for i in range(n):
s[i] = max(abs(A[i,:]))
Using s, our array of scale factors, we can compute the relative size of any matrix element to the largest element in its row.
That is, \(\displaystyle{r_{ij} = \frac{\left|A_{ij}\right|}{s_i}}\).
Suppose we are at the stage of the elimination phase where the \(k^{th}\) row has become the pivot row. The augmented coefficient matrix is as seen below.
We don’t immediately use \(A_{kk}\) as the next pivot element.
Instead, we search the rows below \(R_k\) for a “better” pivot in column \(k\).
\[\left[\begin{array}{ccccccc|c} A_{11} & A_{12} & A_{13} & \cdots & \cdots & \cdots & A_{1n} & b_1\\ 0 & A_{22} & A_{23} & \cdots & \cdots & \cdots & A_{2n} & b_2\\ 0 & 0 & A_{33} & \cdots & \cdots & \cdots & A_{3n} & b_3\\ \vdots & \vdots & \vdots & \ddots & \cdots & \vdots & \vdots & \vdots\\ \hline 0 & 0 & \cdots & 0 & A_{kk} & \cdots & A_{kn} & b_k\\ \vdots & \vdots & \vdots & \vdots & \vdots & \ddots & \vdots & \vdots\\ 0 & 0 & \cdots & 0 & A_{nk} & \cdots & A_{nn} & b_n\\ \end{array}\right]\]
The best choice is the entry with the largest relative size – that is, \(A_{pk}\) such that \(\displaystyle{r_{pk} = \max_{k\leq j\leq n}{(r_{jk})}}\).
If \(p \neq k\), then we execute a row exchange both in \(A\) and also exchange the corresponding elements in \(s\).
The code below uses a swapRows() function, which we’ll also write when we build our GaussPivot() function next.
This can be done as follows:
# row-pivot
for k in range(0, n-1):
    # relative size of each candidate pivot in column k
    p = np.argmax(np.abs(A[k:n, k])/s[k:n]) + k
    if abs(A[p, k]) < tol:
        print("Matrix is Singular")
        return None
    # swap rows p and k if a better pivot was found below row k
    if p != k:
        swapRows(b, k, p)
        swapRows(s, k, p)
        swapRows(A, k, p)
def swapRows(v, i, j):
    # works on both 1-D arrays (vectors) and 2-D arrays (matrices)
    if len(v.shape) == 1:
        v[i], v[j] = v[j], v[i]
    else:
        v[[i, j], :] = v[[j, i], :]
def scaleFactors(A):
    # s[i] holds the largest absolute value in row i of A
    n = A.shape[0]
    s = np.zeros(n)
    for i in range(n):
        s[i] = np.max(np.abs(A[i, :]))
    return s
def GaussPivot(A, b, tol=1.0e-12):
    n = len(b)
    s = scaleFactors(A)
    for k in range(0, n-1):
        p = np.argmax(np.abs(A[k:n, k])/s[k:n]) + k
        if abs(A[p, k]) < tol:
            print("Matrix is Singular")
            return None
        # Row pivot if necessary
        if p != k:
            swapRows(b, k, p)
            swapRows(s, k, p)
            swapRows(A, k, p)
        # Elimination
        for i in range(k+1, n):
            if A[i, k] != 0.0:
                lam = A[i, k]/A[k, k]
                A[i, (k+1):n] = A[i, (k+1):n] - lam*A[k, (k+1):n]
                b[i] = b[i] - lam*b[k]
    if abs(A[n-1, n-1]) < tol:
        print("Matrix is Singular")
        return None
    # Back substitution
    b[n-1] = b[n-1]/A[n-1, n-1]
    for k in range(n-2, -1, -1):
        b[k] = (b[k] - np.dot(A[k, (k+1):n], b[(k+1):n]))/A[k, k]
    return b
Example: Our original GaussElim() procedure couldn’t correctly solve the system below because of the order of the rows. Show that our new GaussPivot() method does solve the system correctly.
\[\left[\begin{array}{ccc|c} 0 & -1 & 1 & 0\\ -1 & 2 & -1 & 0\\ 2 & -1 & 0 & 1\end{array}\right]\]
Our improved method easily handled this case by performing the necessary row swap.
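As a quick check (a sketch assuming the GaussPivot() defined above is in scope; note that it overwrites \(A\) and \(\vec{b}\), so we build fresh arrays):
A = np.array([[ 0.0, -1.0,  1.0],
              [-1.0,  2.0, -1.0],
              [ 2.0, -1.0,  0.0]])
b = np.array([0.0, 0.0, 1.0])
print(GaussPivot(A, b))    # expect approximately [1. 1. 1.]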
Example: Use Gaussian Elimination with Scaled Row pivoting to solve the system \(A\vec{x} = \vec{b}\) where \(A = \left[\begin{array}{ccc} 2 & -2 & 6\\ -2 & 4 & 3\\ -1 & 8 & 4\end{array}\right]\) and \(\vec{b} = \left[\begin{array}{c} 16\\ 0\\ -1\end{array}\right]\).
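After working the elimination by hand, a quick machine check of the arithmetic might look like this (again a sketch; GaussPivot() overwrites its arguments, so we pass fresh arrays):
A = np.array([[ 2.0, -2.0, 6.0],
              [-2.0,  4.0, 3.0],
              [-1.0,  8.0, 4.0]])
b = np.array([16.0, 0.0, -1.0])
print(GaussPivot(A, b))    # expect approximately [ 1. -1.  2.]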
There are some drawbacks to pivoting. In particular, the swapRows() operations add bookkeeping and can be expensive (run time), and pivoting can destroy special structure, such as bandedness, that our faster solvers rely on. Fortunately, in many applications where banded or symmetric matrices appear, those matrices are nearly diagonally dominant, and so there is not often a benefit to pivoting anyway.
While these are not hard rules, banded, symmetric, and positive definite matrices seldom benefit from row pivoting.
Note that it is, however, possible to construct banded, symmetric, or positive definite matrices that do benefit from pivoting – they just rarely stem from real engineering problems.
We’ll leave linear algebra and visit techniques for curve fitting and function approximation from data.
Comments on Implementation
Again, our implementation involves two helper functions. The swapRows() function swaps the rows of a matrix or the elements of a vector, and the scaleFactors() function computes the array of scale factors for the matrix.
The GaussPivot() function takes in the coefficient matrix \(A\) and the constraint vector \(\vec{b}\), and includes a preset but overridable tolerance. The tolerance determines when the function should terminate because of rounding-error concerns rather than report an untrustworthy numerical solution.
This new GaussPivot() function is the most widely applicable of our solvers, but it is not the fastest or most space-efficient. It makes GaussElim() obsolete, but you should still consider whether your scenario is appropriate for a faster solver, like LU decomposition or the tridiagonal solver that we have.
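For instance, the default tolerance can be overridden at the call site; a minimal sketch, assuming A and b are already defined as NumPy arrays:
x = GaussPivot(A, b, tol=1.0e-9)   # flag the system as singular for pivots below 1.0e-9 instead of the 1.0e-12 default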