QSO 370/570 - Predictive Analytics
Syllabus (Summer 2023)
Course Description: This course introduces the techniques of predictive modeling in a data-rich business
environment in order to predict future business outcomes and associated risks. It covers multivariate and other
techniques to implement predictive models for a variety of practical business applications.
Remote, Asynchronous Section: This particular summer section of Predictive Analytics is being offered in a fully
remote and asynchronous environment. We’ll make heavy use of Slack for interaction and collaboration throughout each
week.
Students in this course should expect to use Python and Jupyter Notebooks (via Google Colab). Students will work
through scripted analyses outside of class meetings and then will get their hands dirty, working with real data during
class time. Applications will vary according to student and instructor interests.
Accessing Course Notes
The Jupyter Notebooks for this course can be accessed via Google Colab. You are welcome to copy and utilize the notebooks as you see fit. In order to access the notebooks, complete the following steps.
- Navigate to colab.research.google.com.
- Click the button to Add to Drive, which will add Colab functionality to your Google Drive.
- Here’s a link to a shared folder with all of the course notes. You should make a copy of these files and save them to your own Google Drive – this will allow you to edit and save your changes.
- Navigate to the copies you made on your Google Drive. Double-click on the notebook you wish to open and then click on the orange CO logo at the top of the page to open with Colab.
- You are now running a live Jupyter Notebook from within Google Drive – you should treat these as your personal class notes; feel free to edit them as you see fit.
About Competition Assignments
In addition to traditional homework assignments, students will complete four assignments related to an In-Class Kaggle Competition. These assignments are designed to provide two things: (i) practice building and assessing predictive models, and (ii) practice with technical report writing. The Kaggle Competition is closed, so only students in this course can join; you will be competing with peers who are also learning about predictive modeling. The competition for the Summer 2023 semester is on predicting the selling prices of used cars. A link to the competition is posted in Brightspace.
Course Timeline and Notebooks
Below is a tentative timeline for our course. It includes Course Content and Assignments by week.
Week | Class Content | Assignments |
---|---|---|
0 | Intro to Class Video Enabling Google Colaboratory Intro to Jupyter Notebooks | Start HW 0 |
1 | Terminology Overview Notebook (Video Overview) Python for Analytics Notebook (Videos: Part I, Part II) | Slack Prompt (Wed) Slack Prompt (Sun) HW 0 Due (Sunday, May 7) Start Group Assign. 1 |
2 | Analytics Overview Notebook What is an Analytics Report? Kaggle Competition Overview | Group HW 1 Due (Tues) Slack Prompt (Wed) Slack Prompt (Sun) Enroll in Kaggle Comp Start Comp. Assign. 1 |
3 | matplotlib and seaborn Plotting Tutorial Sample End-to-End Video | Comp. Assign. 1 Due (Tues) Slack Prompt (Wed) Slack Prompt (Sun) Start Comp. Assign. 2 |
4 | Draft SOP and EDA for Zillow Data Peer Review (Slack) | Slack Prompt (Wed) Slack Prompt (Sun) Post and Peer Review Continue Comp. Assign. 2 |
5 | Evaluating Regressors Assessing Regressors (Video) Train/Test Split Explained (Video) | Comp. Assign. 2 Due (Tues) Slack Prompt (Wed) Slack Prompt (Sun) Start Group Assign. 2 |
6 | What is Classification (Video) Building Several Classifiers (Video) Assessing Classifiers (Video) Evaluating Classifiers | Group Assign. 2 Due (Tues) Slack Prompt (Wed) Slack Prompt (Sun) Start Group Assign. 3 |
7 | Linear Regression Overview Linear Regression, Part I Submitting Model Predictions to Kaggle | Slack Prompt (Wed) Slack Prompt (Sun) Start Comp. Assign. 3 |
8 | Model-Building Frameworks Linear Regression, Part II | Comp. Assign. 3 Due (Tues) Slack Prompt (Wed) Slack Prompt (Sun) Start Group Assign. 4 |
9 | Regression Trees | Group Assign. 4 Due (Tues) Slack Prompt (Wed) Slack Prompt (Sun) Start Comp. Assign. 4 |
10 | Classification Trees Classification Models and Zillow Data | Slack Prompt (Wed) Slack Prompt (Sun) Continue Comp. Assign. 4 |
11 | Logistic Regression | Comp. Assign. 4 Due (Tues) Slack Prompt (Wed) Slack Prompt (Sun) |
12+ | Advanced Module Work | Time Series, Unsupervised Learning, or Statistics and Analytics for Dissertation/Thesis |