MAT 370: Row Pivoting

Dr. Gilbert

December 28, 2025

A Recap

At this point we’ve implemented three solvers for linear systems.

  • The GaussElim() function is general purpose, but is slow because the row-reduction algorithm is \(O\left(n^3\right)\).

  • The LU solvers we built are an improvement over Gaussian elimination, if used correctly. The LUdecomp() function decomposes a coefficient matrix into the product of a lower- and an upper-triangular matrix, and then LUsolve() solves the corresponding \(LU\vec{x} = \vec{b}\) system.

    • LU decomposition is an improvement because it only does row reduction once and all subsequent systems can be solved with the use of LUsolve().
    • The initial run of our LU decomposition method is still \(O\left(n^3\right)\), but all subsequent runs are \(O\left(n^2\right)\).
  • Our tridiagonal solver implements LU decomposition in a special case where the coefficient matrix is tridiagonal, resulting in a significant space savings.

A Performance Comparison

Example: We’ll solve the matrix equations \(A\vec{x} = \vec{b_i}\) for a tridiagonal matrix \(A\) and several constraint vectors \(\vec{b_i}\).

\[A = \begin{bmatrix} 2 & -1 & 0 & 0 & 0\\ -1 & 2 & -1 & 0 & 0\\ 0 & -1 & 1 & 3 & 0\\ 0 & 0 & -4 & 3 & 1\\ 0 & 0 & 0 & -2 & 4\end{bmatrix}~~~~\vec{b_1} = \begin{bmatrix} 1\\ -8\\ 5\\ 13\\ 4\end{bmatrix},~~\vec{b_2} = \begin{bmatrix} -3\\ 7\\ 17\\ 11\\ -4\end{bmatrix},~~\vec{b_3} = \begin{bmatrix} -11\\ 13\\ 25\\ -35\\ 12\end{bmatrix},~~\vec{b_4} = \begin{bmatrix} -23\\ -17\\ 11\\ 42\\ 13\end{bmatrix}\]
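The timing snippets below call a helper make_A() to rebuild the coefficient matrix before each elimination (the elimination routines modify \(A\) in place, which is why it is rebuilt before each call to GaussElim()) and assume the constraint vectors b1 through b4 are already defined. Neither appears in the snippets themselves, but a minimal sketch consistent with the matrices above is:

import time
import numpy as np

#rebuild the tridiagonal coefficient matrix A defined above
def make_A():
  return np.array([
    [2.0, -1, 0, 0, 0],
    [-1, 2, -1, 0, 0],
    [0, -1, 1, 3, 0],
    [0, 0, -4, 3, 1],
    [0, 0, 0, -2, 4]
  ])

b1 = np.array([1.0, -8, 5, 13, 4])
b2 = np.array([-3.0, 7, 17, 11, -4])
b3 = np.array([-11.0, 13, 25, -35, 12])
b4 = np.array([-23.0, -17, 11, 42, 13])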

Gaussian Elimination
tic = time.time()
A = make_A()
x1 = GaussElim(A, b1)
A = make_A()
x2 = GaussElim(A, b2)
A = make_A()
x3 = GaussElim(A, b3)
A = make_A()
x4 = GaussElim(A, b4)
toc = time.time()

print("That took ", toc - tic, " seconds.")
That took  0.008612871170043945  seconds.
LU Decomposition
tic = time.time()
A = make_A()
LU = LUdecomp(A)
x1 = LUsolve(LU, b1)
x2 = LUsolve(LU, b2)
x3 = LUsolve(LU, b3)
x4 = LUsolve(LU, b4)
toc = time.time()

print("That took ", toc - tic, " seconds.")
That took  0.007344722747802734  seconds.
Tridiagonal Solver
tic = time.time()
c = np.array([-1.0, -1, -4, -2])   #subdiagonal of A
d = np.array([2.0, 2, 1, 3, 4])    #main diagonal of A
e = np.array([-1.0, -1, 3, 1])     #superdiagonal of A
lam, d, e = DoolittleLUdecomp3(c, d, e)
x1 = Doolittle3solver(lam, d, e, b1)
x2 = Doolittle3solver(lam, d, e, b2)
x3 = Doolittle3solver(lam, d, e, b3)
x4 = Doolittle3solver(lam, d, e, b4)
toc = time.time()

print("That took ", toc - tic, " seconds.")
That took  0.008213043212890625  seconds.

A Significant Shortcoming

Example: Solve the linear system \(\left\{\begin{array}{lcl} -x_2 + x_3 & = & 0\\ -x_1 + 2x_2 - x_3 & = & 0\\ 2x_1 - x_2 & = & 1\end{array}\right.\)

Because of rounding errors, the order of the rows in a coefficient matrix can greatly impact performance and accuracy of results produced by a numerical solver.

This system has the augmented coefficient matrix

\[\left[\begin{array}{ccc|c} 0 & -1 & 1 & 0\\ -1 & 2 & -1 & 0\\ 2 & -1 & 0 & 1\\\end{array}\right]\]

This augmented coefficient matrix has the rows in the “wrong order”.

Our methods all fail because there is a \(0\) in the pivot position.

Looking at the system by hand, we can see that a row swap would fix the problem, but how do we trigger a row swap algorithmically? And why is a row swap beneficial here?

When to Pivot

Sometimes it is imperative that we reorder the rows in a system during the elimination phase – this is called row pivoting.

In general, we’ll want to do this when the pivot element is \(0\) or when it is very small compared to the other elements in the pivot row.

Consider the example augmented coefficient matrix below, with \(\varepsilon\) representing a very small number.

\[\begin{align*}\left[\begin{array}{ccc|c} \varepsilon & -1 & 1 & 0\\ -1 & 2 & -1 & 0\\ 2 & -1 & 0 & 1\end{array}\right] &\stackrel{R_2 \leftarrow R_2 - (-1/\varepsilon)R_1}{\to}\left[\begin{array}{ccc|c} \varepsilon & -1 & 1 & 0\\ 0 & 2 - (1/\varepsilon) & -1 + (1/\varepsilon) & 0\\ 2 & -1 & 0 & 1\end{array}\right]\\ &\stackrel{R_3 \leftarrow R_3 - (2/\varepsilon)R_1}{\to}\left[\begin{array}{ccc|c} \varepsilon & -1 & 1 & 0\\ 0 & 2 - (1/\varepsilon) & -1 + (1/\varepsilon) & 0\\ 0 & -1 + (2/\varepsilon) & -2/\varepsilon & 1\end{array}\right] \end{align*}\]

Notice though that since \(\varepsilon\) is very small, \(1/\varepsilon\) is very large.

We may be in danger of entering roundoff error territory. The resulting augmented coefficient matrix may be stored as

\[\left[\begin{array}{ccc|c} \varepsilon & -1 & 1 & 0\\ 0 & - 1/\varepsilon & 1/\varepsilon & 0\\ 0 & 2/\varepsilon & -2/\varepsilon & 1\end{array}\right]\]

In this case, \(R_2\) and \(R_3\) contradict each other, so our routine would report that no solution exists even though the original system does have a solution.
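We can verify this loss of information directly in double-precision arithmetic. A quick illustrative check in plain Python, with an artificially tiny \(\varepsilon\):

#with eps this small, the 2 and the -1 are absorbed completely by 1/eps
eps = 1.0e-20
print(2 - 1/eps == -1/eps)    #True
print(-1 + 1/eps == 1/eps)    #True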

Formalizing When to Pivot

We’ve identified some scenarios where row-pivoting is beneficial – when our pivot is \(0\) or when it is “very small”.

What constitutes “very small” is subjective, so we’ll need a way to formalize what we mean so that we can write it into our algorithm. We need a proxy for “safe pivots” that a computer can detect.

An \(n\times n\) matrix \(A\) is said to be diagonally dominant if the absolute value of each diagonal element is greater than the sum of the absolute values of the other elements in its row.

\[\left|A_{ii}\right| > \sum_{\substack{j = 1\\ j\neq i}}^{n}{\left|A_{ij}\right|}\]

For example, the matrix \(\left[\begin{array}{ccc} -2 & 4 & -1\\ 1 & -1 & 3\\ 4 & -2 & 1\end{array}\right]\) is not diagonally dominant, but with two row pivots we obtain \(\left[\begin{array}{ccc} 4 & -2 & 1\\ -2 & 4 & -1\\ 1 & -1 & 3 \end{array}\right]\), which is diagonally dominant.
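We can also test this condition in code. The sketch below uses a hypothetical helper name, is_diagonally_dominant(), and simply checks the definition above row by row for the two matrices in the example:

import numpy as np

def is_diagonally_dominant(A):
  A = np.abs(np.asarray(A, dtype = float))
  #row sums minus the diagonal entries give the off-diagonal sums
  off_diag = A.sum(axis = 1) - np.diag(A)
  return np.all(np.diag(A) > off_diag)

print(is_diagonally_dominant([[-2, 4, -1], [1, -1, 3], [4, -2, 1]]))   #False
print(is_diagonally_dominant([[4, -2, 1], [-2, 4, -1], [1, -1, 3]]))   #True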

If \(A\) is diagonally dominant, then a numerical solution for \(A\vec{x} = \vec{b}\) does not benefit from row pivoting. Because of this, we’ll implement a strategy that reorders the rows of \(A\) to bring it as close to diagonal dominance as possible.

An Example for Illustration

As usual when we are about to introduce a new algorithm, we’ll use an example to help illuminate the steps involved. We’ll use the example below.

Example: Use Gaussian Elimination with scaled row pivoting to solve the system \(A\vec{x} = \vec{b}\) where \(A = \left[\begin{array}{ccc} 2 & -2 & 6\\ -2 & 4 & 3\\ -1 & 8 & 4\end{array}\right]\) and \(\vec{b} = \left[\begin{array}{c} 16\\ 0\\ -1\end{array}\right]\).

Scale Factors

Consider an approach to solving the system \(A\vec{x} = \vec{b}\) using Gaussian Elimination with row-pivoting.

That is, we row-pivot \(A\) during elimination so that the pivot element is as large as possible with respect to the other elements in the pivot row.

In order to do this, let’s define an array \(s\) as follows:

\[s_i = \max_{j}{\left|A_{ij}\right|}~~\text{for}~~i,j\in [n]\]

We call \(s_i\) the scale factor for \(R_i\); it holds the largest absolute value among the elements of the \(i^{th}\) row of \(A\).

We can write a simple routine to populate \(s\).

#populate scale-factor array
n = A.shape[0]
s = np.zeros(n)
for i in range(n):
  s[i] = max(abs(A[i,:]))
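As a quick check, running this routine on the coefficient matrix from the scaled-pivoting example introduced earlier gives scale factors of 6, 4, and 8:

A = np.array([
  [2.0, -2, 6],
  [-2, 4, 3],
  [-1, 8, 4]
])

n = A.shape[0]
s = np.zeros(n)
for i in range(n):
  s[i] = max(abs(A[i, :]))

print(s)   #[6. 4. 8.]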

Triggering Row Pivoting

Using s, our array of scale factors, we can compute the size of any matrix element relative to the largest element in its row.

That is, \(\displaystyle{r_{ij} = \frac{\left|A_{ij}\right|}{s_i}}\).

Suppose we are at the stage of the elimination phase where the \(k^{th}\) row has become the pivot row. The augmented coefficient matrix is as seen below.

We don’t immediately use \(A_{kk}\) as the next pivot element.

Instead, we search the rows below \(R_k\) for a “better” pivot in column \(k\).

\[\left[\begin{array}{ccccccc|c} A_{11} & A_{12} & A_{13} & \cdots & \cdots & \cdots & A_{1n} & b_1\\ 0 & A_{22} & A_{23} & \cdots & \cdots & \cdots & A_{2n} & b_2\\ 0 & 0 & A_{33} & \cdots & \cdots & \cdots & A_{3n} & b_3\\ \vdots & \vdots & \vdots & \ddots & & & \vdots & \vdots\\ \hline 0 & 0 & \cdots & 0 & A_{kk} & \cdots & A_{kn} & b_k\\ \vdots & \vdots & \vdots & \vdots & \vdots & \ddots & \vdots & \vdots\\ 0 & 0 & \cdots & 0 & A_{nk} & \cdots & A_{nn} & b_n\\ \end{array}\right]\]

The best choice is the entry with the largest relative size – that is, \(A_{pk}\) such that \(\displaystyle{r_{pk} = \max_{j}{(r_{jk})}}\) for \(j\geq k\).

If \(p \neq k\), then we execute a row exchange both in \(A\) and also exchange the corresponding elements in \(s\).

The code below uses a swapRows() function, which we’ll also write when we build our GaussPivot() function next.

This can be done as follows:

#row-pivot
for k in range(0, n-1):
  p = np.argmax(np.abs(A[k:n, k])/s[k:n]) + k
  if abs(A[p, k]) < tol:
    print("Matrix is Singular")
    return None

  if p != k:
    swapRows(b, k, p)
    swapRows(s, k, p)
    swapRows(A, k, p)
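To see the search in action, take the example matrix and scale factors computed above. On the first pass (k = 0), the relative sizes of the first-column entries are \(2/6\), \(2/4\), and \(1/8\), so the largest relative size sits in the second row and \(R_1\) and \(R_2\) are swapped before eliminating:

#relative sizes for the first column: [0.333..., 0.5, 0.125], so p = 1
print(np.argmax(np.abs(A[0:3, 0])/s[0:3]))   #1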

Implementing Gaussian Elimination with Row Pivoting

import numpy as np

def swapRows(v, i, j):
  #swap elements i and j of a vector, or rows i and j of a matrix
  if len(v.shape) == 1:
    v[i], v[j] = v[j], v[i]
  else:
    v[[i, j], :] = v[[j, i], :]

def scaleFactors(A):
  n = A.shape[0]
  s = np.zeros(n)
  for i in range(n):
    s[i] = max(np.abs(A[i, :]))

  return s

def GaussPivot(A, b, tol = 1.0e-12):
  n = len(b)
  s = scaleFactors(A)

  for k in range(0, n-1):
    p = np.argmax(np.abs(A[k:n, k])/s[k:n]) + k
    if(abs(A[p, k]) < tol):
      print("Matrix is Singular")
      return None

    #Row Pivot if necessary
    if p != k:
      swapRows(b, k, p)
      swapRows(s, k, p)
      swapRows(A, k, p)

    #Elimination
    for i in range(k+1, n):
      if A[i, k] != 0.0:
        lam = A[i, k]/A[k, k]
        A[i, (k+1):n] = A[i, (k+1):n] - lam*A[k, (k+1):n]
        b[i] = b[i] - lam*b[k]

  if abs(A[n-1, n-1]) < tol:
    print("Matrix is Singular")
    return None

  #back substitution
  b[n-1] = b[n-1]/A[n-1, n-1]
  for k in range(n-2, -1, -1):
    b[k] = (b[k] - np.dot(A[k, (k+1):n], b[(k+1):n]))/A[k, k]

  return b
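One practical note: GaussPivot() works in place, overwriting \(A\) during elimination and returning the solution in \(b\). If the original arrays are still needed afterwards, pass copies:

x = GaussPivot(A.copy(), b.copy())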

Comments on Implementation

Again, our implementation involves two helper functions.

  • The swapRows() function allows for swapping the rows of a matrix or elements of a vector.
  • The scaleFactors() function computes the array of scale factors for the matrix.

The GaussPivot() function takes in the coefficient matrix \(A\), the constraint vector \(\vec{b}\), and includes a preset but overridable tolerance.

The tolerance determines when the function should terminate due to rounding-error concerns rather than report an untrustworthy numerical solution.

This new GaussPivot() function is the one which is most widely applicable, but it is not the fastest or most space-efficient.

It makes GaussElim() obsolete, but you should still consider whether your problem is better suited to a faster solver such as LU decomposition or our tridiagonal solver.

Example

Example: Our original GaussElim() procedure couldn’t correctly solve the system below because of the order of the rows. Show that our new GaussPivot() method does solve the system correctly.

\[\left[\begin{array}{ccc|c} 0 & -1 & 1 & 0\\ -1 & 2 & -1 & 0\\ 2 & -1 & 0 & 1\end{array}\right]\]

A = np.array([
  [0.0, -1, 1],
  [-1, 2, -1],
  [2, -1, 0]
])
b = np.array([0.0, 0, 1])

GaussPivot(A, b)
array([1., 1., 1.])

Our improved method easily handled this case by performing the necessary row swap.

An Additional Example

Example: Let’s return to the example introduced earlier and use Gaussian Elimination with scaled row pivoting to solve the system \(A\vec{x} = \vec{b}\) where \(A = \left[\begin{array}{ccc} 2 & -2 & 6\\ -2 & 4 & 3\\ -1 & 8 & 4\end{array}\right]\) and \(\vec{b} = \left[\begin{array}{c} 16\\ 0\\ -1\end{array}\right]\).
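A sketch of the setup, assuming the GaussPivot() function defined above; working the system by hand gives \(x_1 = 1\), \(x_2 = -1\), \(x_3 = 2\), and the numerical result should agree up to rounding:

A = np.array([
  [2.0, -2, 6],
  [-2, 4, 3],
  [-1, 8, 4]
])
b = np.array([16.0, 0, -1])

print(GaussPivot(A, b))   #expected: [ 1. -1.  2.]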

Pivoting Isn’t Always Best

There are some drawbacks to pivoting. In particular,

  • those swapRows() operations can be expensive at run time, and
  • swapping rows destroys matrix symmetry and any banded structure.

Fortunately, in many applications where banded or symmetric matrices appear, those matrices are nearly diagonally dominant and so there is not often a benefit to pivoting anyway.

While these are not hard rules, banded, symmetric, and positive definite matrices seldom benefit from row pivoting.

Note that it is, however, possible to construct banded, symmetric, or positive definite matrices that do benefit from pivoting – they just don’t stem from real engineering problems.

Summary

  • We saw how to use scaled row pivoting to ensure that we don’t end up in a situation where a pivot element is \(0\) or nearly \(0\).
  • If a pivot element is \(0\), our Gaussian Elimination procedure will fail.
  • Cases where a pivot element is nearly \(0\) invite roundoff errors that can lead to incorrect solutions or a routine suggesting that no solution exists.
  • We saw that we can use scaling to guide row pivots to make a coefficient matrix as close to diagonally dominant as possible, which will avoid the problems outlined here.
  • We implemented an algorithm to identify when a row pivot is worthwhile, to execute that row swap, and then to carry on with Gaussian Elimination.

Next Time



We’ll leave linear algebra and visit techniques for curve fitting and function approximation from data.