MAT 300: Introduction and What to Expect

Dr. Gilbert

August 13, 2024

What Are We Here For?

What Are We Here For?

What Are We Here For?

What Are We Here For?

What Are We Here For?

What Are We Here For?

What Are We Here For?

What Are We Here For?

Syllabus

Major Highlights from the Syllabus: I’ll ask you to read the syllabus, but the most important items are on the following slides.

Instructor and Office Hours

  • Instructor: Dr. Adam Gilbert

    • e-mail address: a.gilbert1@snhu.edu

    • Office: Robert Frost Hall, Room 311

    • Office Hours:

      • Tuesdays 9am - 11am
      • Thursdays 2pm - 3pm
      • Fridays 10am - noon

Required Resources

First and foremost…everything is free!

Grading Scheme

Grade Item Value
Participation 10%
Homework (~3) 10%
Competition Assignments (~6) 40%
Project 40%

Explanations of Grade Items

  • Participation: Come to class and contribute actively
  • Homework: These assignments explore topics or concepts more deeply. We’ll have about three.
  • Competition Assignments: We’ll learn regression by doing regression. We have six assignments associated with a modeling competition on Door Dash delivery times.
  • Project: A final course project spanning our last three weeks together.

Competition?

  • Hosted at Kaggle
  • Closed to only students in our course
  • You’ll need a free account – link in Slack
  • Your grade is not tied to your finishing place
  • The competition aspect is friendly – please keep it that way.
  • Students say the competition is motivating and a great source of learning – one of the most valuable parts of the course.

Brightspace

  • Assignments
  • Gradebook
  • Go to the webpage for everything else

Course Webpage

I’ve built a webpage to organize our course content.

  • Syllabus

  • Tentative timeline

    • Truly tentative – can slow down, speed up, swap out topics, etc.
    • Links to daily companion slides and full notes (in HTML and QMD formats).
    • Links to assignments – you can see everything now.

What’s Class Like? (Part I)

  • Discussion Days - Most of our meetings…

    • Walk through the companion slides, and apply ideas to a different data set side-by-side.
    • Full, detailed notes are available on the webpage (use these how you like) – implement and experiment in class.
    • Make your notes “pretty” after class while reviewing what we did.

What’s Class Like? (Part II)

  • Workshop / Lab Days - Days to slow down and practice recent skills.
  • Lecture Days - Hopefully none after this first week, but I’ll lecture when you all want me to.

A Note on Approach to Class

  • I’m open to change in all of my courses.
  • If the structure isn’t working for you, let’s chat and see what changes we can make to improve your experience.
  • If you don’t want to tell me in person, leave an anonymous note under my office door.

My goal in this course is for all of you to learn as much about regression and statistical modeling as possible – we can’t achieve that if you don’t feel like you are benefiting from our class meetings.

A Road Map to Our Semester

We’ll be discussing a lot of material in MAT 300. Here is a very generic road map of what we will discuss. Starting now.

What Are We Doing?

  • Artificial Intelligence (AI)?
  • Data science?
  • Machine learning?
  • Statistical learning?

Background For Working With Data

  • What is data?
  • How to work with data: Don’t spend it all in one place!
  • How do we visualize and story-tell with data?
  • What if my data is messy? (Spoiler Alert: It will be!)

What Foundational Statistics Knowledge Do I Need?

  • Confidence Intervals: \(\left(\text{point estimate}\right) \pm 2\cdot \left(\text{standard error}\right)\)
  • Hypothesis Testing: \(p< \alpha\) means data are incompatible with a null (skeptical) hypothesis

Regression

  • What is regression?
  • Simple linear regression
  • Assessing model performance
  • Multiple linear regression
  • How do we deal with categorical predictors?
  • Higher-order terms
  • Accuracy versus interpretability
  • The bias-variance trade off

Advanced Ideas

  • What if my dataset has lots of features?
  • Can we penalize models to avoid overfitting?
  • What happens when relationships change?
  • Are there other classes of regression model?
  • Can we build models that predict a categorical outcome?

Homework (Part I)

Read Chapter 1 (pages 1 - 14) of the Introduction to Statistical Learning (ISLR) book, or watch the corresponding videos from the textbook authors.

  • Discusses three arenas where statistical learning is applied

    • Regression, Classification, and Unsupervised Learning
    • Our focus is regression, but knowing about all three will help you grasp what we are trying to do in our class

Homework (Part II)

Stop by my office (Robert Frost 311), say hi and let’s briefly chat about the following:

  • Why are you taking this course?
  • What do you hope to get out of it?
  • What contexts do you want us to pull data from in our course? (animal physiology, medicine, business/economics, football, etc.)

Next Time…

  • What is statistical learning in terms of regression?
  • Why try to build models (estimate \(f\))?
  • What are noise, reducible error, and irreducible error?
  • Why do prediction and interpretation compete?
  • What are parametric and non-parametric models?
  • How do I identify regression versus classification problems?
  • What is the difference between supervised and unsupervised learning?