MAT434: Competition Assignment 1
I firmly believe that the best way to learn about statistical modeling is to do statistical modeling. Every time I run a course like this one, I use a Kaggle Competition and a corresponding series of Competition Assignments to motivate students to go further and learn more about statistical modeling techniques. This is a friendly competition associated with our class (and perhaps similar classes at institutions like our own). To be clear, your ultimate course grade is not tied to your position on the leaderboard. However, students in these courses often mention that the Competition Assignments are motivating and one of the most valuable parts of the class experience. I hope you’ll all find the same!
Complete the following:
- Check the #competition-questions channel in Slack for a direct link to our Kaggle Competition for this semester.
- Peruse the competition site, reading the details for this semester’s competition. On the Overview page for the Competition, click on the black Join Competition button.
- Navigate to the Data tab on the competition site. Download the data.csv and comp.csv files. You can also download the sample_submission.csv file if you like, but you don’t need to. Move these files into the folder corresponding to your R Project / GitHub repository.
  - You may be prompted to accept the competition rules before you are allowed to download the data. The rules are to (i) learn new things, (ii) don’t cheat, (iii) apply yourself, and (iv) have fun.
- Open RStudio and use File -> Recent Projects to select and open the R Project which is managing your GitHub repository.
- Use File -> New File -> Quarto Document... to create a new Quarto notebook. Fill in the fields as usual and then click the button to create the notebook.
- In the setup chunk, add the code necessary to load the {tidyverse} and {tidymodels} libraries. You may also want to load {kableExtra} and {patchwork}.
- Also in that code cell, use read_csv("data.csv") to read the data into your notebook. Don’t forget to save the result of reading the file to a named object in R, otherwise R will immediately forget the data.
- Remove the boilerplate that appears after your setup code chunk.
- Add a new code cell and use it to print out the head() of the data you’ve just read in.
- Use the blue arrow button to render the notebook into an HTML document.
- Use the Git tab in the top right pane of RStudio to Pull, Commit, Push your new files to your remote repository at GitHub.
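For reference, here is a minimal sketch of what the code in your notebook might look like once the setup and head() steps are done. It assumes that data.csv sits in the top level of your R Project folder and that the four packages are already installed. The object name data is only a placeholder, so feel free to choose your own. The first group of lines belongs in the setup chunk, and the final line belongs in its own chunk further down the notebook.

```r
# --- setup chunk ---
# Load the modeling libraries used throughout the course
library(tidyverse)
library(tidymodels)

# Optional extras for nicer tables and for combining plots
library(kableExtra)
library(patchwork)

# Read the data and store it in a named object.
# `data` is a placeholder name; without the assignment, R would print
# the data once and then forget it.
data <- read_csv("data.csv")

# --- a separate, later chunk ---
# Print the first six rows of the data
head(data)
```

If read_csv() reports that the file cannot be found, the most likely cause is that data.csv is not in the same folder as your R Project.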
Stop by my office if you have any questions or need help.