Subsection 1.3.2 Importing and using Python modules
Python is a general-purpose programming language. This means that Python is designed to do many things, not just mathematics -- you can build websites, design databases, construct and deploy machine learning models, and much more with Python.
To keep Python lightweight, not all of Python’s functionality is made available to you by default. You’ll need to import modules in order to complete many specialized tasks. For the most part, we’ll be making use of the {sympy}
(SYmbolic Mathematics in PYthon) module since it has functionality for nearly everything we encounter in an introduction to Linear Algebra. It is also worth knowing about the {numpy}
(NUMerical PYthon) module, which handles numerical operations on vectors and matrices very well. Throughout the text we’ll use these curly braces {}
to indicate that we are referencing a full module rather than just a Python function/method. When importing modules, we can either load the entire module (e.g. import sympy
) or load just a single function from that module (e.g. from numpy.linalg import inv
).
With {sympy}
imported, we can call and utilize functionality from that module. For example, {sympy}
includes a Matrix()
function which can be called using sympy.Matrix()
, which tells Python to go to the {sympy}
module and find the Matrix()
function. Because we need to reference the module name when calling functionality from it, Python users often alias their module names when importing in order to reduce typing. We can see this below.
Subsection 1.3.3 Working with Matrices in Python
When we encounter a matrix,
Theorem 1.2.6 tells us that there is exactly one reduced row echelon matrix that is row equivalent to it.
In fact, the uniqueness of this reduced row echelon matrix is what motivates us to define this particular form. When solving a system of linear equations using Gaussian elimination, there are other row equivalent matrices that reveal the structure of the solution space. The reduced row echelon matrix is simply a convenience: an agreement we make with one another to seek the same matrix.
An added benefit is that we can ask a computer program, like Python, to find reduced row echelon matrices for us. We will learn how to do this now that we have a little familiarity with Python.
First, notice that a matrix has a certain number of rows and columns. For instance, the matrix
\begin{equation*}
\left[
\begin{array}{rrrrr}
* \amp * \amp * \amp * \amp * \\
* \amp * \amp * \amp * \amp * \\
* \amp * \amp * \amp * \amp * \\
\end{array}
\right]
\end{equation*}
has three rows and five columns. We consequently refer to this as a \(3\times 5\) matrix.
We may ask Python to create the \(2\times4\) matrix
\begin{equation*}
\left[
\begin{array}{rrrr}
-1 \amp 0 \amp 2 \amp 7 \\
2 \amp 1 \amp -3 \amp -1 \\
\end{array}
\right]
\end{equation*}
by entering
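a cell along the following lines (a sketch, with {sympy} imported under the alias sp as above):

```python
import sympy as sp

# Each inner list is one row of the matrix.
A = sp.Matrix([[-1.0, 0, 2, 7],
               [2, 1, -3, -1]])
print(A)
```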
When evaluated, Python will confirm that we’ve defined the matrix by writing out its rows, each inside square brackets. Notice that the entire matrix is defined as a
list (indicated by the use of the square brackets
[]
), and the elements of the list are also lists! These are the rows of the matrix. If we think about this, what we are declaring is quite intuitive -- a matrix is a list of rows. Items passed to a Python function are called
arguments and the
sp.Matrix()
function takes just a single argument: the matrix in list form. Later, we’ll encounter functions taking multiple arguments. In these cases, we’ll separate the arguments by commas.
A very important thing to notice about our matrix definition above is that we wrote the top-left element of the matrix as -1.0
rather than -1
. This is to clarify to Python that we expect to be working within the realm of real numbers (approximated in computers using floats). While this isn’t a requirement for working with {sympy}
matrices, it is a requirement for {numpy}
. Without explicitly indicating that decimal values are permissible, Python would store the matrix as an integer array/matrix, which would damage our calculations, resulting in incorrect conclusions. This is something we’ll need to pay careful attention to! Because the difference between float and integer arrays is so important to remember when working with {numpy}
, I’ll continue to explicitly identify that our matrices contain floats even when working with {sympy}
-- this way you won’t forget if and when you switch over to using {numpy}
for something.
As a reminder, if we intend to do just numeric computations like adding, subtracting, multiplying, finding inverses or determinants, etc. with our matrices, then the {numpy}
module has functionality that suits these needs perfectly. If, however, we want to perform manipulations like row-reduction, then we’ll need to use {sympy}
instead. Since {sympy}
includes all of the functionality we’ll need (including for those arithmetic operations), we’ll stick to the {sympy}
module throughout the text, for simplicity’s sake. However, the Python cell below shows how you can define a matrix using {numpy}
.
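A sketch of such a cell, defining the same \(2\times4\) matrix as before:

```python
import numpy as np

# np.array() plays the role here that sp.Matrix() plays in sympy.
# The -1.0 entry ensures the array holds floats rather than integers.
B = np.array([[-1.0, 0, 2, 7],
              [2, 1, -3, -1]])
print(B)
print(B.dtype)  # float64, because of the -1.0 entry
```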
Regardless of whether we are using {numpy}
or {sympy}
, there is no way of specifying whether our matrix is an augmented or coefficient matrix so it will be up to us to use our objects and interpret our results appropriately.
One additional item worth pointing out is the difference between functions and methods in Python. Having completed previous math coursework, you are familiar with the notion of functions and their arguments. For example, if \(f(x) = x^2 - 5x\text{,}\) then the expression \(f(5)\) asks us to use \(5\) as an input (argument) for the function \(f\) and then to evaluate it. In Python, some types of objects have special functions, called methods, which can be applied directly to them. In this case, the format for calling the method on the object is object.method()
. For example, we defined the matrix A
above as a {sympy}
matrix. These matrices have a .rref()
method which can be applied to them in order to transform the matrix into its reduced row-echelon form. We apply that method to our matrix below to illustrate this point.
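A sketch of that cell, using the \(2\times4\) matrix defined earlier; note that sympy's .rref() actually returns a pair, the reduced matrix together with a tuple of pivot column indices:

```python
import sympy as sp

A = sp.Matrix([[-1.0, 0, 2, 7],
               [2, 1, -3, -1]])

# .rref() returns (reduced matrix, tuple of pivot column indices).
A_rref, pivot_columns = A.rref()
print(A_rref)
print(pivot_columns)  # (0, 1)
```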
Python syntax.
Some common mistakes are
to forget the square brackets around the list of entries,
to omit an entry from the list or to add an extra one,
to forget to separate the rows, columns, and entries by commas,
to omit the parentheses around the arguments after np.array()
or sympy.Matrix()
,
spelling or capitalization errors like using a capital “A
” in np.array
or a lowercase “m
” in sympy.Matrix()
,
forgetting to include the function namespace, for example using Matrix()
instead of sp.Matrix()
, and
attempting to apply a function like a method and vice-versa.
If you see an error message, carefully proofread your input and try again.
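For example, omitting the namespace typically produces a NameError (a small illustration, assuming only the alias sp has been imported):

```python
import sympy as sp

try:
    M = Matrix([[1, 2], [3, 4]])   # mistake: should be sp.Matrix(...)
except NameError as err:
    print(err)                     # name 'Matrix' is not defined
```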
Activity 1.3.2. Using Python to find row reduced echelon matrices.
Enter the following matrix into Python using {sympy}
and sp.Matrix()
.
\begin{equation*}
\left[
\begin{array}{rrrr}
-1 \amp -2 \amp 2 \amp -1 \\
2 \amp 4 \amp -1 \amp 5 \\
1 \amp 2 \amp 0 \amp 3
\end{array}
\right]
\end{equation*}
Don’t forget to import the library/libraries you want to use, and give the matrix the name \(A\) by entering
import sympy
A = sympy.Matrix([ ... ])
We may then find its reduced row echelon form by entering
A_rref = A.rref()
A_rref
A common mistake is to forget the parentheses after .rref
.
Use Python to find the reduced row echelon form of the matrix from
Item a of this activity.
Use Python to help you describe the solution space of the system of linear equations
\begin{equation*}
\begin{alignedat}{5}
-x_1 \amp \amp \amp \amp \amp {}+{} \amp 2x_4 \amp
{}={} \amp 4 \\
\amp \amp 3x_2 \amp {}+{} \amp x_3 \amp {}+{} \amp 2x_4
\amp {}={} \amp 3 \\
4x_1 \amp {}-{} \amp 3x_2 \amp \amp \amp {}+{} \amp
x_4 \amp {}={} \amp 14 \\
\amp \amp 2x_2 \amp {}+{} \amp 2x_3 \amp {}+{} \amp x_4
\amp {}={} \amp 1 \\
\end{alignedat}
\end{equation*}
Consider the two matrices:
\begin{equation*}
\begin{array}{rcl}
A \amp = \amp \left[
\begin{array}{rrrr}
1 \amp -2 \amp 1 \amp -3 \\
-2 \amp 4 \amp 1 \amp 1 \\
-4 \amp 8 \amp -1 \amp 7 \\
\end{array}\right] \\
B \amp = \amp \left[
\begin{array}{rrrrrr}
1 \amp -2 \amp 1 \amp -3 \amp 0 \amp 3 \\
-2 \amp 4 \amp 1 \amp 1 \amp 1 \amp -1 \\
-4 \amp 8 \amp -1 \amp 7 \amp 3 \amp 2 \\
\end{array}\right] \\
\end{array}
\end{equation*}
We say that \(B\) is an augmentation of \(A\) because it is obtained from \(A\) by adding some more columns.
Using Python, define the matrices and compare their reduced row echelon forms. What do you notice about the relationship between the two reduced row echelon forms?
Using the system of equations in
Item c, write the augmented matrix corresponding to the system of equations. What did you find for the reduced row echelon form of the augmented matrix?
Now write the coefficient matrix of this system of equations. What does
Item d of this activity tell you about its reduced row echelon form?
Python practices.
Here are some practices that you may find helpful when working with matrices in Python.
Break the matrix entries across lines, one for each row, for better readability by pressing Enter between rows.
#A = np.array([[1.0, 2, -1, 0],
#              [-3, 0, 4, 3]])
A = sp.Matrix([[1.0, 2, -1, 0],
               [-3, 0, 4, 3]])
Print your original matrix to check that you have entered it correctly. You may want to also print a dividing line to separate matrices.
#A = np.array([[1.0, 2],
#              [2, 2]])
A = sp.Matrix([[1.0, 2],
               [2, 2]])
print(A)
print("---------")
A_reduced = A.rref()
print(A_reduced)  # prints the reduced matrix together with its pivot columns
The last part of the previous activity,
Item d, demonstrates something that will be helpful for us in the future. In that activity, we started with a matrix
\(A\text{,}\) which we augmented by adding some columns to obtain a matrix
\(B\text{.}\) We then noticed that the reduced row echelon form of
\(B\) is itself an augmentation of the reduced row echelon form of
\(A\text{.}\)
To illustrate, we can consider the reduced row echelon form of the augmented matrix:
\begin{equation*}
\left[
\begin{array}{ccc|c}
-2 \amp 3 \amp 0 \amp 2 \\
-1 \amp 4 \amp 1 \amp 3 \\
3 \amp 0 \amp 2 \amp 2 \\
1 \amp 5 \amp 3 \amp 7 \\
\end{array}
\right]
\sim
\left[
\begin{array}{ccc|c}
1 \amp 0 \amp 0 \amp -4 \\
0 \amp 1 \amp 0 \amp -2 \\
0 \amp 0 \amp 1 \amp 7 \\
0 \amp 0 \amp 0 \amp 0 \\
\end{array}
\right]
\end{equation*}
We can then determine the reduced row echelon form of the coefficient matrix by looking inside the augmented matrix.
\begin{equation*}
\left[
\begin{array}{ccc}
-2 \amp 3 \amp 0 \\
-1 \amp 4 \amp 1 \\
3 \amp 0 \amp 2 \\
1 \amp 5 \amp 3 \\
\end{array}
\right]
\sim
\left[
\begin{array}{ccc}
1 \amp 0 \amp 0 \\
0 \amp 1 \amp 0 \\
0 \amp 0 \amp 1 \\
0 \amp 0 \amp 0 \\
\end{array}
\right]
\end{equation*}
If we trace through the steps in the Gaussian elimination algorithm carefully, we see that this is a general principle, which we now state.
Proposition 1.3.1. Augmentation Principle.
If matrix \(B\) is an augmentation of matrix \(A\text{,}\) then the reduced row echelon form of \(B\) is an augmentation of the reduced row echelon form of \(A\text{.}\)
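The principle can be checked in Python; the sketch below uses the example above, slicing off the final column to recover the coefficient matrix:

```python
import sympy as sp

augmented = sp.Matrix([[-2, 3, 0, 2],
                       [-1, 4, 1, 3],
                       [3, 0, 2, 2],
                       [1, 5, 3, 7]])
coefficient = augmented[:, :3]  # the first three columns

R_augmented, _ = augmented.rref()
R_coefficient, _ = coefficient.rref()

print(R_augmented)
print(R_coefficient)
# The first three columns of R_augmented agree with R_coefficient,
# just as the Augmentation Principle predicts.
```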
Subsection 1.3.4 Computational effort
At the beginning of this section, we indicated that linear algebra has become more prominent as computers have grown more powerful. Computers, however, still have limits. Let’s consider how much effort is expended when we ask to find the reduced row echelon form of a matrix. We will measure, very roughly, the effort by the number of times the algorithm requires us to multiply or add two numbers.
We will assume that our matrix has the same number of rows as columns, which we call \(n\text{.}\) We are mainly interested in the case when \(n\) is very large, which is when we need to worry about how much effort is required.
Let’s first consider the effort required for each of our row operations.
Scaling a row multiplies each of the \(n\) entries in a row by some number, which requires \(n\) operations.
Interchanging two rows requires no multiplications or additions so we won’t worry about the effort required by an interchange.
A replacement requires us to multiply each entry in a row by some number, which takes \(n\) operations, and then add the resulting entries to another row, which requires another \(n\) operations. The total number of operations is \(2n\text{.}\)
Our goal is to transform a matrix to its reduced row echelon form, which looks something like this:
\begin{equation*}
\left[
\begin{array}{cccc}
1 \amp 0 \amp \ldots \amp 0 \\
0 \amp 1 \amp \ldots \amp 0 \\
\vdots \amp \vdots \amp \ddots \amp \vdots \\
0 \amp 0 \amp \ldots \amp 1 \\
\end{array}
\right]\text{.}
\end{equation*}
We roughly perform one replacement operation for every 0 entry in the reduced row echelon matrix. When \(n\) is very large, most of the \(n^2\) entries in the reduced row echelon form are 0 so we need roughly \(n^2\) replacements. Since each replacement operation requires \(2n\) operations, the number of operations resulting from the needed replacements is roughly \(n^2(2n) =
2n^3\text{.}\)
Each row is scaled roughly one time so there are roughly \(n\) scaling operations, each of which requires \(n\) operations. The number of operations due to scaling is roughly \(n^2\text{.}\)
Therefore, the total number of operations is roughly
\begin{equation*}
2n^3 + n^2\text{.}
\end{equation*}
When \(n\) is very large, the \(n^2\) term is much smaller than the \(n^3\) term. We therefore state that the effort required to find the reduced row echelon form is roughly \(2n^3\text{.}\)
This is a very rough measure of the effort required to find the reduced row echelon form; a more careful accounting shows that the number of arithmetic operations is roughly \(\frac23
n^3\text{.}\) As we have seen, some matrices require more effort than others, but the upshot of this observation is that the effort is proportional to \(n^3\text{.}\) We can think of this in the following way: If the size of the matrix grows by a factor of 10, then the effort required grows by a factor of \(10^3 = 1000\text{.}\)
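The rough operation count above can be packaged as a tiny helper (the function name is illustrative, not standard):

```python
def rref_operation_count(n):
    # Roughly one replacement per zero entry: about n^2 replacements,
    # each costing 2n operations, plus n scalings costing n each.
    replacements = n**2 * (2 * n)   # 2n^3
    scalings = n * n                # n^2
    return replacements + scalings

print(rref_operation_count(10))     # 2100
print(rref_operation_count(100))    # about 1000 times larger
```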
While today’s computers are powerful, they cannot handle every problem we might ask of them. Eventually, we would like to be able to consider matrices that have \(n=10^{12}\) (a trillion) rows and columns. In very broad terms, the effort required to find the reduced row echelon matrix will require roughly \((10^{12})^3 = 10^{36}\) operations.
To put this into context, imagine we need to solve a linear system with a trillion equations and a trillion variables and that we have a computer that can perform a trillion, \(10^{12}\text{,}\) operations every second. Finding the reduced row echelon form would take about \(10^{16}\) years. At this time, the universe is estimated to be approximately \(10^{10}\) years old. If we started the calculation when the universe was born, we’d be about one-millionth of the way through.
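The arithmetic behind that estimate can be sketched directly:

```python
# Back-of-the-envelope version of the estimate above.
n = 10**12                           # a trillion rows and columns
operations = n**3                    # roughly 10^36 operations
ops_per_second = 10**12              # a trillion operations per second
seconds = operations / ops_per_second
years = seconds / (60 * 60 * 24 * 365)
print(f"about {years:.0e} years")    # on the order of 10^16 years
```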
This may seem like an absurd situation, but we’ll see in
Subsection 4.5.3 how we use the results of such a computation every day. Clearly, we will need some better tools to deal with
really big problems like this one.