Topic 5: Discrete Probability Distributions

function ok_checkbox(response, n) {
  if (!response || response.length === 0) 
    return html`<span style="color:purple">You haven't answered yet.</span>`;
  if (response.toString() === n) 
    return html`<span style="color:green">Correct ✓</span>`;
  return html`<span style="color:red">Not Yet! ✗</span>`;
}

About

This activity provides an introduction to discrete probability, including basic probability through counting outcomes, and calculating probabilities associated with outcomes of binomial experiments using the binomial distribution.

Discrete Probability Distributions

Throughout this activity you’ll be introduced to the notion of probability and will explore applications of probability and discrete random variables. After developing some intuition using foundational probability ideas, we’ll focus on binomial experiments and using the binomial distribution to find probabilities of prescribed outcomes.

Limitations

There are entire courses devoted to probability – we will only cover probability to the extent that it is necessary for use in this course. If you are interested in a more detailed treatment of probability, seek out one of the many great courses available.

Objectives

Activity Objectives: After completing this workbook you should be able to:

Define, discuss, and interpret the probability of an event as its likelihood.
Apply fundamental counting principles and the notion of independence to compute the probability associated with the occurrence of a sequence of events.
Use the definition of binomial experiments to identify scenarios to which the binomial distribution can be applied.
Apply the binomial distribution in appropriate scenarios to find probabilities associated with specified outcomes.
Given a binomial experiment, compute the expected number of successful outcomes as well as the standard deviation of the number of successes.

Basic Probability

Definition of Probability (frequentist): For a given random process, the probability of an event \(A\) is the proportion of times we would observe an outcome satisfying \(A\) if the random process were repeated an infinite number of times.

Example: Given a fair coin, the probability of a flip turning up heads is \(0.5\) (or 50%). Similarly, given a fair six-sided die, the probability of a roll resulting in a number greater than four is \(1/3\) (or about 33.3%) because there are two outcomes satisfying the criterion (rolling a 5 or rolling a 6) out of the six total possible outcomes.
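The frequentist definition can be illustrated with a short simulation. The sketch below is an illustration only and is not part of the original activity; the seed, the number of simulated rolls, and the variable names are arbitrary choices.

```r
# Exact probability by counting outcomes on a fair six-sided die
die <- 1:6
exact <- sum(die > 4) / length(die)   # two favourable outcomes out of six

# Long-run proportion from many simulated rolls
# (the seed and the number of rolls are arbitrary choices)
set.seed(1)
rolls <- sample(die, size = 100000, replace = TRUE)
simulated <- mean(rolls > 4)

exact
simulated
```

With this many rolls the simulated proportion should land close to the exact value of 1/3, and it tends to get closer as the number of rolls grows.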

Try It! Now it is your turn. Try the next few problems. Be sure to note any questions you have as you work through them.

Check Your Understanding: Basic Probability I

Given one fair, six-sided die, what is the probability of rolling a three?

mutable ok_response = (response, n) => { return html`Loading...` };
viewof q1 = Inputs.radio(
  new Map([
    ["0 / 6", 1],
    ["1 / 6", 2],
    ["2 / 6", 3],
    ["3 / 6", 4],
    ["4 / 6", 5],
    ["5 / 6", 6],
    ["6 / 6", 7]
  ]),
  {value: JSON.parse(localStorage.getItem("q1_selected") ?? "null")}
);

{
  localStorage.setItem("q1_selected", JSON.stringify(q1));
  localStorage.setItem("q1_correct", "2");
  localStorage.setItem("q1_result", q1 === null ? "unattempted" : (q1 == 2 ? "correct" : "incorrect"));
}

ok_response(q1, "2");

Check Your Understanding: Basic Probability II

Given one fair, six-sided die, what is the probability of rolling a two, four, or six?

viewof q2 = Inputs.radio(
  new Map([
    ["0 / 6", 1],
    ["1 / 6", 2],
    ["2 / 6", 3],
    ["3 / 6", 4],
    ["4 / 6", 5],
    ["5 / 6", 6],
    ["6 / 6", 7]
  ]),
  {value: JSON.parse(localStorage.getItem("q2_selected") ?? "null")}
);

{
  localStorage.setItem("q2_selected", JSON.stringify(q2));
  localStorage.setItem("q2_correct", "4");
  localStorage.setItem("q2_result", q2 === null ? "unattempted" : (q2 == 4 ? "correct" : "incorrect"));
}

ok_response(q2, "4");

Check Your Understanding: Basic Probability III

Given two fair, six-sided dice, which is larger?

viewof q3 = Inputs.radio(
  new Map([
    ["The probability of rolling a total of two.", 1],
    ["The probability of rolling a total of five.", 2]
  ]),
  {value: JSON.parse(localStorage.getItem("q3_selected") ?? "null")}
);

{
  localStorage.setItem("q3_selected", JSON.stringify(q3));
  localStorage.setItem("q3_correct", "2");
  localStorage.setItem("q3_result", q3 === null ? "unattempted" : (q3 == 2 ? "correct" : "incorrect"));
}

ok_response(q3, "2");

Good work on that last set of questions. In those problems you could find the probability by counting the number of ways the desired outcome could occur and then dividing that number by the total number of possible outcomes. In the last question, there were simply more ways to roll a total of five (four ways to do it) than to roll a total of two (just one way).
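One way to check the counting for two dice is to list every outcome explicitly. This is a minimal sketch; the data frame and variable names are illustrative and do not come from the activity.

```r
# All 36 equally likely outcomes for two fair six-sided dice
outcomes <- expand.grid(first = 1:6, second = 1:6)
totals <- outcomes$first + outcomes$second

ways_two  <- sum(totals == 2)   # only (1, 1)
ways_five <- sum(totals == 5)   # (1, 4), (2, 3), (3, 2), (4, 1)

# Probability = favourable outcomes / total outcomes
p_two  <- ways_two / nrow(outcomes)
p_five <- ways_five / nrow(outcomes)

c(p_two = p_two, p_five = p_five)
```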

What if we try doing something a bit more complicated? Say we wanted to know the probability of rolling at least a two on a single roll of a die and then flipping a “tails” on a single flip of a coin?

Probability and Independent Events

If \(A\) and \(B\) are independent events (that is, the probability that \(B\) occurs does not depend on whether or not \(A\) occurred, and vice-versa), then the probability of \(A\) and \(B\) occurring is the product of the probability of \(A\) occurring and the probability of \(B\) occurring. Mathematically, we write: \(\mathbb{P}\left[A~\text{and}~B\right] = \mathbb{P}\left[A\right]\cdot\mathbb{P}\left[B\right]\).
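The product rule can be checked by enumeration on a small example. The sketch below uses a different pair of events than the questions that follow (an even roll and a flip of heads), chosen purely for illustration; all names are hypothetical.

```r
# Every (die, coin) pair is equally likely for a fair die and a fair coin
outcomes <- expand.grid(die = 1:6, coin = c("H", "T"), stringsAsFactors = FALSE)

is_even  <- outcomes$die %% 2 == 0   # event A: the roll is even
is_heads <- outcomes$coin == "H"     # event B: the flip is heads

p_a    <- mean(is_even)              # P[A]
p_b    <- mean(is_heads)             # P[B]
p_both <- mean(is_even & is_heads)   # P[A and B], counted directly

# For independent events the direct count matches the product
all.equal(p_both, p_a * p_b)
```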

Check Your Understanding: Probability and Independent Events I

Given a single roll of a fair, six-sided die, what is the probability of rolling at least a two?

viewof q4 = Inputs.radio(
  new Map([
    ["0 / 6", 1],
    ["1 / 6", 2],
    ["2 / 6", 3],
    ["3 / 6", 4],
    ["4 / 6", 5],
    ["5 / 6", 6],
    ["6 / 6", 7]
  ]),
  {value: JSON.parse(localStorage.getItem("q4_selected") ?? "null")}
);

{
  localStorage.setItem("q4_selected", JSON.stringify(q4));
  localStorage.setItem("q4_correct", "6");
  localStorage.setItem("q4_result", q4 === null ? "unattempted" : (q4 == 6 ? "correct" : "incorrect"));
}

ok_response(q4, "6");

Check Your Understanding: Probability and Independent Events II

Given a single flip of a fair coin, what is the probability of the coin landing with tails facing upwards?

viewof q5 = Inputs.radio(
  new Map([
    ["0 / 2", 1],
    ["1 / 2", 2],
    ["2 / 2", 3]
  ]),
  {value: JSON.parse(localStorage.getItem("q5_selected") ?? "null")}
);

{
  localStorage.setItem("q5_selected", JSON.stringify(q5));
  localStorage.setItem("q5_correct", "2");
  localStorage.setItem("q5_result", q5 === null ? "unattempted" : (q5 == 2 ? "correct" : "incorrect"));
}

ok_response(q5, "2");

Check Your Understanding: Probability and Independent Events III

Use the code block below to compute the probability that in a single roll of a fair die and a flip of a coin we observe a roll of at least two and a flip of tails.


(5/6)*(1/2)

Good work so far. Let’s say you forgot to study for your chemistry quiz today. It is a four-question multiple-choice quiz with answer options \(a)\) through \(e)\) on each question. You decide that your best option is to guess randomly on each of the questions. Answer the following, using the empty code block below to carry out any necessary computations.

Hint 2 (Guessing, Part II)

The questions are independent events here. Multiply the probability associated with the outcome on each individual question together.

(___)*(___)*(___)*(___)

Hint 3 (Guessing, Part II)

The questions are independent events here. Multiply the probability associated with the outcome on each individual question together. The probability of getting any individual question correct is 0.20.

(0.20)*(0.20)*(0.20)*(0.20)

Hint 2 (Guessing, Part IV)

We don’t care about the result of question 4. We could get it right or get it wrong, and it makes no difference to whether or not our event of interest occurs.

(___)*(___)*(___)*(___)

Hint 3 (Guessing, Part IV)

We don’t care about the result of question 4. We could get it right or get it wrong, and it makes no difference to whether or not our event of interest occurs. On the fourth question, our event of interest is “getting the question right or wrong”, so the probability of that outcome is 100% (or 1): no other outcome is possible.

(___)*(___)*(___)*(1)

Hint 4 (Guessing, Part IV)

If the first question that we get wrong is question 3, what must have been the outcome for each of the first two questions? We must have gotten both of the first two questions right. The probability of getting any individual question right was 0.20.

(0.20)*(0.20)*(___)*(1)

Check Your Understanding: Guessing on a Quiz I

For a single question, what is the probability that you get that question correct?

viewof q6 = Inputs.radio(
  new Map([
    ["0", 1],
    ["0.10", 2],
    ["0.20", 3],
    ["0.25", 4],
    ["0.50", 5],
    ["0.80", 6],
    ["1", 7]
  ]),
  {value: JSON.parse(localStorage.getItem("q6_selected") ?? "null")}
);

{
  localStorage.setItem("q6_selected", JSON.stringify(q6));
  localStorage.setItem("q6_correct", "3");
  localStorage.setItem("q6_result", q6 === null ? "unattempted" : (q6 == 3 ? "correct" : "incorrect"));
}

ok_response(q6, "3");

Check Your Understanding: Guessing on a Quiz II

What is the probability that you get every one of the questions correct?

viewof q7 = Inputs.radio(
  new Map([
    ["0", 1],
    ["0.0016", 2],
    ["0.01", 3],
    ["0.20", 4],
    ["0.4096", 5],
    ["0.5555", 6],
    ["0.998", 7],
    ["1", 8]
  ]),
  {value: JSON.parse(localStorage.getItem("q7_selected") ?? "null")}
);

{
  localStorage.setItem("q7_selected", JSON.stringify(q7));
  localStorage.setItem("q7_correct", "2");
  localStorage.setItem("q7_result", q7 === null ? "unattempted" : (q7 == 2 ? "correct" : "incorrect"));
}

ok_response(q7, "2");

Check Your Understanding: Guessing on a Quiz III

What is the probability that you get every one of the questions wrong?

viewof q8 = Inputs.radio(
  new Map([
    ["0", 1],
    ["0.0016", 2],
    ["0.01", 3],
    ["0.20", 4],
    ["0.4096", 5],
    ["0.5555", 6],
    ["0.998", 7],
    ["1", 8]
  ]),
  {value: JSON.parse(localStorage.getItem("q8_selected") ?? "null")}
);

{
  localStorage.setItem("q8_selected", JSON.stringify(q8));
  localStorage.setItem("q8_correct", "5");
  localStorage.setItem("q8_result", q8 === null ? "unattempted" : (q8 == 5 ? "correct" : "incorrect"));
}

ok_response(q8, "5");

Check Your Understanding: Guessing on a Quiz IV

What is the probability that the first one you get wrong is question three?

viewof q9 = Inputs.radio(
  new Map([
    ["0", 1],
    ["0.015", 2],
    ["0.032", 3],
    ["0.0064", 4],
    ["0.20", 5],
    ["1.2", 6]
  ]),
  {value: JSON.parse(localStorage.getItem("q9_selected") ?? "null")}
);

{
  localStorage.setItem("q9_selected", JSON.stringify(q9));
  localStorage.setItem("q9_correct", "3");
  localStorage.setItem("q9_result", q9 === null ? "unattempted" : (q9 == 3 ? "correct" : "incorrect"));
}

ok_response(q9, "3");

Check Your Understanding: Guessing on a Quiz V

What is the probability that you get exactly two questions right?

viewof q10 = Inputs.radio(
  new Map([
    ["0", 1],
    ["0.04", 2],
    ["0.08", 3],
    ["0.256", 4],
    ["0.20", 5],
    ["0.40", 6],
    ["0.64", 7],
    ["None of These", 8]
  ]),
  {value: JSON.parse(localStorage.getItem("q10_selected") ?? "null")}
);

{
  localStorage.setItem("q10_selected", JSON.stringify(q10));
  localStorage.setItem("q10_correct", "8");
  localStorage.setItem("q10_result", q10 === null ? "unattempted" : (q10 == 8 ? "correct" : "incorrect"));
}

ok_response(q10, "8");

So in the last question, none of the choices were correct – but why? There are lots of ways that we could get two of the questions right. We could get the first two right, the first and last right, the middle two right, and more! We need to account for all of these possibilities.
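To see all of these possibilities at once, we can list every right/wrong pattern for the four questions and add up the probabilities of the patterns with exactly two correct answers. This is a sketch under the quiz assumptions stated earlier (four independent questions, probability 0.20 of guessing each one correctly); the variable names are illustrative.

```r
p_right <- 0.20

# Each row is one pattern of right (1) / wrong (0) across the four questions
patterns <- expand.grid(first = 0:1, second = 0:1, third = 0:1, fourth = 0:1)
n_right <- rowSums(patterns)

# Independent questions: multiply the per-question probabilities
pattern_prob <- p_right^n_right * (1 - p_right)^(4 - n_right)

ways_two_right <- sum(n_right == 2)              # number of patterns with two right
p_two_right <- sum(pattern_prob[n_right == 2])   # total probability of those patterns

ways_two_right
p_two_right
```

There are six such patterns, each with probability \(0.20^2 \cdot 0.80^2 = 0.0256\), so the total is \(6 \cdot 0.0256 = 0.1536\), which is not among the answer choices.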

Binomial Experiments and the Binomial Distribution

Binomial Experiments: A binomial experiment satisfies each of the following three criteria:

There are \(n\) repeated trials.
Each trial has two possible outcomes (usually called success and failure for convenience).
The trials are independent of one another, and the probability of success on each trial is \(p\) (which remains constant from trial to trial).

Binomial Distribution: Let \(X\) be the number of successes resulting from a binomial experiment with \(n\) trials. We can compute the following probabilities:

The probability of exactly \(k\) successes is given by \(\displaystyle{\mathbb{P}\left[X = k\right] = \binom{n}{k}\cdot p^k\left(1 - p\right)^{n-k} \approx \tt{dbinom(k, n, p)}}\)
The probability of at most \(k\) successes is given by \(\displaystyle{\mathbb{P}\left[X \leq k\right] = \sum_{i=0}^{k}{\binom{n}{i}\cdot p^i\left(1 - p\right)^{n-i}} \approx \tt{pbinom(k, n, p)}}\)

In the equations above, \(\binom{n}{k} = \frac{n!}{k!\left(n-k\right)!}\) counts the number of ways to arrange the \(k\) successes amongst the \(n\) trials. That being said, the R functions dbinom() and pbinom() allow us to bypass the messy formulas – but you’ll still need to know what these functions do in order to use them correctly!
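To connect the formulas to the R functions, the sketch below evaluates both for one small case: exactly two heads, and at most two heads, in seven flips of a fair coin. The values of n, k, and p and the variable names are chosen only for illustration.

```r
n <- 7     # number of trials (coin flips)
k <- 2     # number of successes (heads)
p <- 0.5   # probability of success on a single trial

# P[X = k]: the formula by hand versus dbinom()
exact_by_formula <- choose(n, k) * p^k * (1 - p)^(n - k)
exact_by_dbinom  <- dbinom(k, n, p)

# P[X <= k]: sum the formula over i = 0, ..., k versus pbinom()
i <- 0:k
at_most_by_formula <- sum(choose(n, i) * p^i * (1 - p)^(n - i))
at_most_by_pbinom  <- pbinom(k, n, p)

all.equal(exact_by_formula, exact_by_dbinom)
all.equal(at_most_by_formula, at_most_by_pbinom)
```

Here choose(n, k) plays the role of \(\binom{n}{k}\), so the hand computation for exactly two heads is \(21 \cdot 0.5^7 = 21/128\).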

Tip: Binomial Distribution

We need to use the binomial distribution to find probabilities associated with numbers of successful (or failing) outcomes in which we do not know for certain the trials on which the successes (or failures) occur.

The code block below is set up to find the probability of exactly two flips of a coin landing heads-up out of seven total flips. Edit the code block so that it finds the probability that you got exactly two of the four questions on your chemistry quiz from earlier correct. As a reminder, there were five answer options for each question and you were guessing randomly.


dbinom(2, 4, 0.2)

Good work. Now you’ll get to try a few more problems! As you work through the next set of questions, you may want to check out this example and solution. Note that in that document, I mention that drawing a simple picture for each problem will help you decide which function(s) you might use and whether you might need to make multiple computations. This is a really important habit that will help you develop a strategy for solving each problem.

Practice: For each of the following, consider a scenario in which a random sample of 18 students is asked (in private) whether they’ve failed to hand in at least one assignment this semester. We assume that about 34% of students fail to hand in at least one assignment.

Given a single, randomly chosen student, what is the probability that the student will have failed to hand in at least one assignment this semester?


0.34

Find the probability that exactly 7 of the 18 students have failed to hand in at least one assignment.

Hint 5

The first argument is the number of successes. If a “success” is a student having failed to hand in at least one assignment, how many successes are you interested in?

dbinom(___, ___, ___)

Hint 6

We wanted exactly seven students to have failed to hand in at least one assignment. The second argument is the total number of trials — how many trials are being “run” here?

dbinom(7, ___, ___)

Hint 7

There are 18 students total in our random sample, so there are 18 trials. The final argument is the probability of a “successful” outcome — what is the probability of a single student failing to hand in at least one assignment?

dbinom(7, 18, ___)


dbinom(7, 18, 0.34)

Find the probability that at most 9 of the 18 students have failed to hand in at least one assignment.

Hint 6

For pbinom(), the first argument is the maximum number of successes you are willing to consider. If a “success” is a student having failed to hand in at least one assignment, what is the maximum number of successes you are interested in?

pbinom(___, 18, 0.34)


pbinom(9, 18, 0.34)

Find the probability that at least 11 of the 18 students have failed to hand in at least one assignment.

Hint 7

This call calculates the probability of 0 through 11 successes. That’s not what we want. Could we start with the probability of all possible outcomes and then remove the ones we don’t want?

pbinom(11, 18, 0.34)

Hint 12

The call 1 - pbinom(11, 18, 0.34) removes the probability of at most 11 students failing to hand in at least one assignment, leaving only the probability of at least 12, but we wanted at least 11. Fix the first argument.

1 - pbinom(___, 18, 0.34)


1 - pbinom(10, 18, 0.34)

Find the probability that between a minimum of 6 and a maximum of 12 out of the 18 students have failed to hand in at least one assignment.

Hint 4

The result of the call below is larger than what we want because it includes all outcomes we care about plus some extra ones. Can we subtract something to remove the unwanted outcomes?

pbinom(12, 18, 0.34)

Hint 7

Since we need to remove a collection, let’s use pbinom() again. Think carefully about which events we need to remove. What number goes in the blank?

pbinom(12, 18, 0.34) - pbinom(___, 18, 0.34)

Hint 9

That leaves us with the probability of 7, 8, …, up to 12 students failing to hand in an assignment. That’s not quite right — we wanted to include 6 as well.

pbinom(12, 18, 0.34) - pbinom(6, 18, 0.34)


pbinom(12, 18, 0.34) - pbinom(5, 18, 0.34)

Don’t Memorize Approaches

In several of the previous scenarios, we needed to think about the correct “first argument” being passed to pbinom(). Don’t try to memorize when to subtract one, when to add one, when to leave the number the same as it appeared in the problem, etc. The language is what matters, and there are lots of ways to express which outcomes we are most interested in. If you insist on memorizing, you’ll become frustrated quickly.

Instead of memorizing, take the time to draw a picture to help you. Examples of what these pictures might look like can be seen in the example and solution document, which I pointed you to earlier.
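A picture amounts to writing down exactly which outcomes you want. One way to mirror that in code, offered here as an optional check rather than the activity's prescribed method, is to add up dbinom() over the outcomes themselves and compare the total with the pbinom() expression. The variable names are illustrative.

```r
n <- 18    # students in the sample
p <- 0.34  # probability a single student failed to hand in an assignment

# "At least 11" means the outcomes 11, 12, ..., 18
at_least_11 <- sum(dbinom(11:18, n, p))

# "Between 6 and 12" (inclusive) means the outcomes 6, 7, ..., 12
between_6_and_12 <- sum(dbinom(6:12, n, p))

# Both agree with the pbinom() expressions used in the practice problems
all.equal(at_least_11, 1 - pbinom(10, n, p))
all.equal(between_6_and_12, pbinom(12, n, p) - pbinom(5, n, p))
```

Writing the range of outcomes explicitly (11:18 or 6:12) makes it easier to see why the first argument to pbinom() is 10 in one case and 5 in the other.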

The expected number of successes in a binomial experiment is sometimes denoted by \(\mathbb{E}\left[X\right]\) and can be computed as \(\mathbb{E}\left[X\right] = n\cdot p\), where \(n\) denotes the number of trials run and \(p\) denotes the probability of success on a single trial. Sometimes it is convenient to think of the expected number of successes as “the mean”. Use the code block below to compute the expected number of students who have failed to hand in at least one assignment:


18*0.34

The standard deviation in the number of successes for a binomial experiment can also be computed. The quantity \(\displaystyle{s_X = \sqrt{n\cdot p\left(1 - p\right)}}\), where \(n\) denotes the number of trials run and \(p\) denotes the probability of success on a single trial, is the standard deviation in number of successes. Use the code block below to compute the standard deviation in number of students who have failed to hand in at least one assignment from random samples of 18 students:


sqrt(18*0.34*(1 - 0.34))
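As a check on both shortcut formulas, the mean and standard deviation can also be computed directly from the full binomial distribution. This is a sketch that is not part of the original exercises; the variable names are illustrative.

```r
n <- 18
p <- 0.34

k <- 0:n                 # every possible number of successes
pmf <- dbinom(k, n, p)   # probability of each possible value

# Mean: probability-weighted average of the possible values
mean_from_pmf <- sum(k * pmf)

# Standard deviation: square root of the probability-weighted
# average squared distance from the mean
sd_from_pmf <- sqrt(sum((k - mean_from_pmf)^2 * pmf))

all.equal(mean_from_pmf, n * p)
all.equal(sd_from_pmf, sqrt(n * p * (1 - p)))
```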

Be sure to write down what questions you had as you worked through these problems and to have a teacher, colleague, or tutor help clarify things for you.

Submit

If you are part of a course with an instructor who is grading your work on these activities, please copy and submit both of the hashes below using the method your instructor has requested.

Question Hash

The hash below encodes your responses to the multiple choice questions in this activity.

function buildQuestionResults() {
  return {
    notebook: "Topic 5: Discrete Probability Distributions",
    type: "questions",
    timestamp: new Date().toISOString(),
    questions: {
      q1_basic_prob_1: {
        selected: q1,
        correct_answer: "2",
        result: q1 === null ? "unattempted" : (q1 == 2 ? "correct" : "incorrect")
      },
      q2_basic_prob_2: {
        selected: q2,
        correct_answer: "4",
        result: q2 === null ? "unattempted" : (q2 == 4 ? "correct" : "incorrect")
      },
      q3_basic_prob_3: {
        selected: q3,
        correct_answer: "2",
        result: q3 === null ? "unattempted" : (q3 == 2 ? "correct" : "incorrect")
      },
      q4_independent_events_1: {
        selected: q4,
        correct_answer: "6",
        result: q4 === null ? "unattempted" : (q4 == 6 ? "correct" : "incorrect")
      },
      q5_independent_events_2: {
        selected: q5,
        correct_answer: "2",
        result: q5 === null ? "unattempted" : (q5 == 2 ? "correct" : "incorrect")
      },
      q6_quiz_1: {
        selected: q6,
        correct_answer: "3",
        result: q6 === null ? "unattempted" : (q6 == 3 ? "correct" : "incorrect")
      },
      q7_quiz_2: {
        selected: q7,
        correct_answer: "2",
        result: q7 === null ? "unattempted" : (q7 == 2 ? "correct" : "incorrect")
      },
      q8_quiz_3: {
        selected: q8,
        correct_answer: "5",
        result: q8 === null ? "unattempted" : (q8 == 5 ? "correct" : "incorrect")
      },
      q9_quiz_4: {
        selected: q9,
        correct_answer: "3",
        result: q9 === null ? "unattempted" : (q9 == 3 ? "correct" : "incorrect")
      },
      q10_quiz_5: {
        selected: q10,
        correct_answer: "8",
        result: q10 === null ? "unattempted" : (q10 == 8 ? "correct" : "incorrect")
      }
    }
  };
}

function toBase64(str) {
  return btoa(unescape(encodeURIComponent(str)));
}

question_hash = {
  q1; q2; q3; q4; q5; q6; q7; q8; q9; q10;
  return toBase64(JSON.stringify(buildQuestionResults()));
}

html`<div style="font-family: monospace; font-size: 0.85em; background: #f5f5f5; padding: 12px; border-radius: 6px; word-break: break-all; border: 1px solid #ddd; user-select: all; cursor: pointer;" onclick="navigator.clipboard.writeText(this.innerText)">
  ${question_hash}
</div>
<p style="margin-top: 8px; font-size: 0.9em; color: #555;">
  Click the box to copy to clipboard.
</p>`

Exercise Hash

Click the button below to generate your exercise submission code. This hash encodes your work on the graded code exercises in this activity.

You must have attempted the graded exercises before clicking — clicking generates a snapshot of your current results. If you have completed the activity over multiple sessions, please go back through and hit the Run Code button on each graded exercise before generating the hash below, to ensure your most recent results are recorded.

Summary

Main Takeaways

The probability of an event \(A\) is a measure of its likelihood and is denoted \(\mathbb{P}[A]\). Every probability must be between 0 and 1.
If \(A\) and \(B\) are independent events, then \(\mathbb{P}[A \text{ and } B] = \mathbb{P}[A] \cdot \mathbb{P}[B]\).
A binomial experiment satisfies: (1) \(n\) repeated trials, (2) each trial has two possible outcomes, and (3) trials are independent with constant probability of success \(p\).
If \(X\) counts successes in a binomial experiment with \(n\) trials and success probability \(p\):
- \(\mathbb{P}[X = k] \approx\) dbinom(k, n, p) — for exactly \(k\) successes
- \(\mathbb{P}[X \leq k] \approx\) pbinom(k, n, p) — for at most \(k\) successes
- Draw a picture to help you see how to use pbinom() and/or dbinom() to calculate probabilities. These two functions are sufficient to handle any binomial probability scenario; the challenge is identifying how to combine them.
The expected number of successes is \(\mathbb{E}[X] = n \cdot p\).
The standard deviation of number of successes is \(s_X = \sqrt{n \cdot p \cdot (1 - p)}\).

Looking Ahead

The next activity introduces the normal distribution — a continuous probability distribution that underpins much of classical statistical inference. Our focus will be on learning to compute probabilities and percentiles from this important distribution.