
Discrete Mathematics for Computing (Draft)

Section 4.1 Complexity and Big-O Notation

In this section we apply our knowledge of looping and counting to develop the notion of big-O notation. We’ll look at routines which accept an array as input and explore how the number of operations run during execution grows as the size of the input array grows. In particular, we’ll see routines which are \(O\left(1\right)\text{,}\) \(O\left(n\right)\text{,}\) \(O\left(n^2\right)\text{,}\) and \(O\left(n^3\right)\text{.}\)
We’ll consider only run-time complexity in this section, though students should also know that space-complexity is a competing concern when writing code to solve problems. Students in computing disciplines will encounter both run-time- and space-complexity throughout their studies.
About. In this section, it is necessary to provide definitions of example routines. The routines are not written in any particular language -- in fact, most won’t work in any language exactly as you see them written here. Instead, the routines are provided via pseudocode -- something about halfway between human-readable prose and computer-executable code. Since we are working with arrays, it is useful to have a convention for accessing individual array elements. Throughout this section, we use myArray[0] to access the element in the left-most slot of the myArray object. We use myArray[i] to access the element in the \(i^{th}\) slot of myArray, remembering that the left-most item occupies the \(0^{th}\) slot of myArray.
Indexing from 0 may seem awkward to those of you who are just beginning your computing careers, but this is the case for many of the most popular computing languages. For this reason, we adopt 0-based indexing here.
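For instance, in Python (whose indexing conventions match the pseudocode in this section), accessing array elements looks like the following. The array contents here are just an illustration:

```python
# 0-based indexing: the left-most element occupies slot 0.
myArray = [10, 20, 30, 40]

first = myArray[0]  # left-most element -> 10
third = myArray[2]  # the element in slot 2 (the third slot) -> 30

print(first, third)  # prints: 10 30
```

Note that with \(n\) elements, the valid slots run from 0 through \(n - 1\text{.}\)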
Complete the following checkpoint exercises to gain some familiarity with analyzing operations within a function. For reasons that will become clear later in the section, you’ll be asked to ignore any "operations" required to initialize variables, initialize a loop, or return a result back to the computing environment. Additionally, you can consider lines combining arithmetic operations and updating a stored value to be a single operation. That is, x = x + i can be considered a single operation in the checkpoint items below. The reason we can do this will be addressed when big-O notation is introduced later in the section.

Checkpoint 4.1.1. Operations to Return First Element.

Answer the following.

Checkpoint 4.1.2. Operations to Sum First Five Elements.

Answer the following.

Checkpoint 4.1.3. Operations to Sum First Elements.

Answer the following.

Checkpoint 4.1.4. Operations to Sum All Elements.

Answer the following.
Note that in the first three interactive problems above, the size of the input array didn’t matter. That is, the number of operations required to arrive at the value to be returned did not depend on the length of the input array. If you didn’t notice that on your own, try regenerating new versions of the embedded examples and working through them again. In the final example, however, the number of operations required before arriving at the value to be returned did depend on the length of the input array. Input arrays with more elements require more calculations, so we should expect that, as the sum_array_values() routine is applied to larger and larger arrays, it takes longer to run. How much longer should we expect the routine to take if the length of the input array were, say, to double? Should we expect the routine to take approximately twice as long? More time than that? Less? How can we tell? This is the job of big-O notation.
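The sum_array_values() routine itself appears only in the interactive checkpoint, but a minimal sketch consistent with the discussion above might look like the following, where the loop body runs once per array element:

```python
def sum_array_values(array):
    total = 0                  # one initialization "operation"
    for value in array:
        total = total + value  # one operation per element
    return total

# Longer input arrays require more loop iterations:
print(sum_array_values([1, 2, 3]))           # prints: 6
print(sum_array_values([1, 2, 3, 4, 5, 6]))  # prints: 21
```

An array twice as long forces the loop body to run twice as many times, which is exactly the behavior big-O notation will let us describe.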
Consider a routine which takes an array as input and then produces some output. The run-time complexity of the algorithm describes the order of the number of operations to be done in transforming the input array into the output, as a function of the length of the input array. Traditionally, the run-time complexity has been written using big-O notation, which looks like \(O\left(f\left(n\right)\right)\text{,}\) where \(f\left(n\right)\) is a function of the length of the input array. We’ll see some examples below.

Example 4.1.5. Run-Time Complexity 1.

Consider the routine below, which takes in an array as input and returns the value of the first element of the array. Describe the run-time of the routine as a function of the size of the input array.
def return_first(array):
  return array[0]
Solution.
Note that the routine immediately returns the first element of the array, regardless of the size of the input array. There is a single operation being done here, no matter what non-empty array is supplied as the input. This means that the routine in question has constant run-time. We say that the routine is \(O\left(1\right)\text{.}\)

Example 4.1.6. Run-Time Complexity 2.

Consider the routine below, which takes in an array as input and returns the sum of the first two array elements. Describe the run-time of the routine as a function of the size of the input array.
def sum_first_two(array):
  a = array[0]
  b = array[1]
  total = a + b

  return total
Solution.
Note that the routine uses three operations to transform the input array into the value which is returned. First, the item in array[0] is stored in the variable \(a\text{.}\) Next, the item in array[1] is stored in the variable \(b\text{.}\) Finally, the sum \(a + b\) is stored in the total variable, which is the value to be returned. Three operations are completed here, so we might say that this routine has run-time complexity \(O\left(3\right)\text{.}\) The attentive reader may ask why this routine is not \(O\left(4\right)\) because of the return statement. In short, the distinction doesn’t matter: regardless of whether we count returning a value as an operation, this routine has constant run-time. That is, this routine is also \(O\left(1\right)\text{.}\)
We saw in the previous example that there is no distinction made between \(O\left(1\right)\) and \(O\left(3\right)\text{,}\) or even \(O\left(10^6\right)\text{,}\) for that matter. All of these denote constant run-time, and so their big-O description of complexity is \(O\left(1\right)\text{.}\) We say that such routines have constant run-time, since the time required to run the routine does not change as the size of the input array changes.
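One way to convince yourself of this is to instrument a constant-time routine with an operation counter (the ops counter here is a device for illustration, not part of our pseudocode conventions) and watch the count stay fixed as the input grows:

```python
def sum_first_two_counted(array):
    # Count the operations as we perform them.
    ops = 0
    a = array[0]; ops += 1
    b = array[1]; ops += 1
    total = a + b; ops += 1
    return total, ops

# The same three operations run whether the array has 10 elements
# or 10,000 elements.
_, ops_small = sum_first_two_counted(list(range(10)))
_, ops_large = sum_first_two_counted(list(range(10_000)))
print(ops_small, ops_large)  # prints: 3 3
```

The count is 3 in both cases: constant, independent of the input length, hence \(O\left(1\right)\text{.}\)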
Let’s see how the run-time complexity grows for some more interesting routines in the examples below.

Example 4.1.7. Run-Time Complexity 3.

Consider the routine below, which takes in an array as input and returns the product of all the array elements. Describe the run-time of the routine as a function of the size of the input array.
def array_product(array):
  myProd = 1
  for element in array:
    myProd = myProd*element

  return myProd
Solution.
The presence of the loop here has the potential to result in run-times that are related to the size of the input array. There is a single operation being done within the body of the for loop. That is, the body of the loop has \(O\left(1\right)\) run-time. The loop will run once for each element in the input array; let’s label the number of elements in the input array by \(n\text{.}\) We’ll ignore the "operations" of setting up the loop and returning the result, but we will account for the operation required to initialize the myProd variable.
Given the analysis above, the total run-time for this routine is \(n\cdot O\left(1\right) + O\left(1\right)\text{.}\) When analyzing the run-time complexity of a routine, we are looking to identify the bottlenecks in our code. This means that we care only about the part of the code whose run-time grows the fastest. For this reason, we identify the highest-order term in our analysis and throw away all of the others (since for very large input arrays, the lower-order terms will have diminished impact on the observed run-time). This leaves us with
\begin{align*} n\cdot O\left(1\right) + O\left(1\right) \amp = O\left(n\right) + O\left(1\right)\\ \amp = O\left(n\right) \end{align*}
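We can make the tally \(n\cdot O\left(1\right) + O\left(1\right)\) concrete by instrumenting the routine to count exactly the operations the analysis above counts (again, the ops counter is only a device for illustration):

```python
def array_product_counted(array):
    ops = 0
    myProd = 1; ops += 1       # initialization: the O(1) term
    for element in array:
        myProd = myProd * element
        ops += 1               # one operation per loop pass: the n term
    return myProd, ops

_, ops_n = array_product_counted(list(range(1, 11)))   # n = 10
_, ops_2n = array_product_counted(list(range(1, 21)))  # n = 20
print(ops_n, ops_2n)  # prints: 11 21
```

Doubling the input length takes the count from \(n + 1 = 11\) to \(2n + 1 = 21\text{:}\) the operation count roughly doubles, as we expect for an \(O\left(n\right)\) routine.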

Example 4.1.8. Run-Time Complexity 4.

Consider the routine below, which takes in an array as input and returns a value computed using the values from the input array. Describe the run-time of the routine as a function of the size of the input array.
def array_product_difference(array):
  myValue = 1
  for element in array:
    myValue = myValue - element
    myValue = myValue * element
        
  return myValue
Solution.
Again, the presence of the loop here has the potential to result in run times that are related to the size of the input array. There are two operations being done within the body of the for loop. That is, the body of the loop has \(O\left(1\right) + O\left(1\right)\) run-time. The loop will run once for each element in the input array. Again, we’ll label the number of elements in the input array by \(n\text{.}\) As usual, we’ll ignore accounting for the "operations" of setting up the loop and returning the result, but we account for the operation to initialize the myValue variable. The run-time is then
\begin{align*} n\left(O\left(1\right) + O\left(1\right)\right) + O\left(1\right) \amp = n\cdot O\left(2\right) + O\left(1\right)\\ \amp = O\left(2n\right) + O\left(1\right)\\ \amp = O\left(2n\right)\\ \amp = O\left(n\right) \end{align*}
In the last line from the string of equalities above, we dropped the constant \(2\) from \(O\left(2n\right)\text{.}\) This wasn’t simply a typo, and the rationale for doing so is similar to the rationale for omitting any constant run-time elements from the overall description of the run-time of the entire routine.
NOTE: Need to decide whether setting up a loop or returning values should be counted as an "operation" in determining the run-time complexities -- I’ll reach out to the CS department about the choice they currently make in Analysis of Algorithms.

Example 4.1.9. Run-Time Complexity 5.

Consider the routine below, which takes in an array as input and returns a value of True if there exist elements a, b, and c in the array such that a*b = c, and returns False otherwise. Describe the run-time of the routine as a function of the size of the input array.
def detect_product(array):
  for element_a in array:
    for element_b in array:
      for element_c in array:
        if element_a*element_b == element_c:
          return True

  return False
Solution.
In the routine defined above, note that once the program reaches a return statement, the corresponding value is returned and the routine terminates. When analyzing run-time complexity, we always consider a "worst-case" scenario for run-time. For the routine above, a worst case is that no such trio of array elements is found. As a result, the routine must run completely over the entire array before returning a value of false. In such a scenario, we have
\begin{align*} n\left(n\left(n\left(O\left(1\right)\right)\right)\right) \amp = n^3\cdot O\left(1\right)\\ \amp = O\left(n^3\right) \end{align*}
This routine, as coded here, has a run-time of \(O\left(n^3\right)\text{.}\) This means that if we double the length of the input array, say from \(k\) to \(2k\text{,}\) the routine is expected to take about \(2^3 = 8\times\) longer to run.
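We can check this doubling claim by counting innermost-loop passes in the worst case. The instrumented variant and the particular arrays below are illustrative choices: both inputs consist of primes, so no product a*b ever equals an element c and the routine runs to completion.

```python
def detect_product_counted(array):
    ops = 0
    for element_a in array:
        for element_b in array:
            for element_c in array:
                ops += 1  # one comparison per innermost pass
                if element_a * element_b == element_c:
                    return True, ops
    return False, ops

# Worst-case inputs: no trio satisfies a*b == c.
_, ops_n = detect_product_counted([2, 3, 5])              # n = 3
_, ops_2n = detect_product_counted([2, 3, 5, 7, 11, 13])  # n = 6
print(ops_n, ops_2n)  # prints: 27 216
```

Doubling the length from \(n = 3\) to \(n = 6\) takes the count from \(3^3 = 27\) to \(6^3 = 216\text{,}\) which is \(8\times\) the work, just as the \(O\left(n^3\right)\) analysis predicts.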
To recap the examples above, we use big-O notation to describe the run-time of a routine as a function of the length of its input array. Big-O notation allows us to answer the question "How would our run-time change if the size of our input array were doubled?" If the length of an input array were doubled, we would expect the following changes in run-times.
  • For an \(O\left(1\right)\) routine, the expected run time does not change.
  • For an \(O\left(n\right)\) routine, the expected run time is approximately doubled.
  • For an \(O\left(n^2\right)\) routine, the expected run time is approximately four times longer.
  • For an \(O\left(n^3\right)\) routine, the expected run time is approximately eight times longer.
There are routines whose run-time complexities fall between constant and linear. For example, some routines have \(O\left(\log\left(n\right)\right)\) run-time. We’ll see some later in our course. There are also routines which don’t have polynomial run-time. They may be exponential -- for example, \(O\left(2^n\right)\) -- or factorial, \(O\left(n!\right)\text{.}\) In general, we desire routines which are more efficient -- that is, routines with lower-order run-time complexities -- because they are faster. However, there is sometimes a tradeoff between run-time complexity and space-complexity. What good is a fast algorithm if it requires more memory than you have access to?
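As a preview of a sub-linear routine, binary search is the classic \(O\left(\log\left(n\right)\right)\) example: on a sorted array, each pass of the loop halves the remaining search window, so the loop body runs only about \(\log_2\left(n\right)\) times rather than \(n\) times. A sketch:

```python
def binary_search(array, target):
    # Assumes array is sorted. Each pass halves the window [lo, hi],
    # so the loop body runs O(log n) times.
    lo, hi = 0, len(array) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if array[mid] == target:
            return mid          # found: return the slot number
        elif array[mid] < target:
            lo = mid + 1        # discard the left half
        else:
            hi = mid - 1        # discard the right half
    return -1                   # target is not in the array

print(binary_search([1, 3, 5, 7, 9, 11], 7))  # prints: 3
```

Doubling the length of the input array here adds only about one more pass through the loop, which is why \(O\left(\log\left(n\right)\right)\) routines scale so well.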
Complete the examples below to verify your grasp of algorithmic complexity and big-O notation. Please note that the algorithms are written in pseudocode, and they don’t necessarily solve their corresponding challenge in the most efficient manner possible. Your answers should describe the algorithmic complexity of the routines as written in the sample problems. Interested readers may try writing working versions of these routines in a language of their choice, and are invited to think about optimizing the algorithms.

Checkpoint 4.1.10. Big-O Analysis 1.

Answer the following.

Checkpoint 4.1.11. Big-O Analysis 2.

Answer the following.

Checkpoint 4.1.12. Big-O Analysis 3.

Answer the following.

Checkpoint 4.1.13. Big-O Analysis 4.

Answer the following.

Checkpoint 4.1.14. Big-O Analysis 5.

Answer the following. This one is challenging.