August 19, 2025
File -> New File -> Quarto Document... using the menus.
We work in Quarto documents, which allow a mixture of code and text in a single document.
Quarto documents consist largely of the following components:
A YAML header (fenced by ---)
We’ll use very simple YAML headers in this course, but you are welcome to explore more complex document customization if you like.
Our YAML header will generally look like the following:
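As a minimal sketch only: the header below uses the settings named on this slide (title, author, format, theme). The title and author values are placeholders, and `flatly` is just one assumed choice of theme, not necessarily the one used in this course.

```yaml
---
title: "Austin Housing Data"   # placeholder title
author: "Your Name"            # placeholder author
format: html                   # html by default; docx or pdf are alternatives
theme: flatly                  # assumed example theme
---
```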
You can actually just use the title, author, and format settings if you prefer.
format: We’ll use html by default, but you can use docx to output a Word document or pdf to output a PDF file.
theme: The available document themes are listed in the Quarto documentation.
Markdown formatting in this course is optional, but using markdown can make your documents look quite nice. The following are the most common pieces of markdown you might find a use for.
this is code
Within a Quarto document, R code is run inside of a code chunk like the one below:
By default, R can do basic calculations
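A small sketch of what the body of such a chunk might contain. In the .qmd file itself, a chunk opens with three backticks followed by {r} and closes with three backticks. The particular calculations here are illustrative choices, not ones taken from the course materials.

```r
# Basic arithmetic; each line prints its result when the chunk is run
2 + 3      # addition
7 - 4      # subtraction
6 * 5      # multiplication
10 / 4     # division
2^3        # exponentiation
sqrt(16)   # square root
```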
Run library(tidyverse) in a code chunk to load the tidyverse.
We run R code by holding Ctrl (Cmd on a Mac) and hitting Enter or Return.
We store items in variables using the arrow operator (<-). For example, running x <- 2 in a code chunk stores the value 2 in x.
Running x in a code chunk would then print 2.
We can read data from a csv file using read_csv("file_path").
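The two ideas above can be sketched together as follows. This is an illustration under assumptions: `toy_path` and `toy_data` are hypothetical names, and the two rows written to the temporary file are made-up values, not rows from the Austin housing data. In practice you would pass the path to the real csv file instead.

```r
library(tidyverse)  # read_csv() comes from readr, which the tidyverse loads

# Store the value 2 in x, then print it
x <- 2
x

# Write a tiny made-up csv to a temporary file so the example is self-contained
toy_path <- tempfile(fileext = ".csv")
writeLines(
  c("id,city,priceRange",
    "1,austin,250000-350000",
    "2,pflugerville,350000-450000"),
  toy_path
)

# Read the csv into a data frame (a tibble) and print it
toy_data <- read_csv(toy_path)
toy_data
```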
A data dictionary appears below:
id is a row number (unique identifier)
description is a free-form text field, describing the property (unique identifier…for our course)
city, homeType, hasSpa, and priceRange are all categorical variables
latitude, longitude, lotSizeSqFt, avgSchoolRating, and MedianStudentsPerTeacher are all numerical variables
garageSpaces, yearBuilt, numOfPatioAndPorchFeatures, numOfBathrooms, and numOfBedrooms could be treated as either numerical or categorical variables
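To illustrate what treating a variable as "either numerical or categorical" could mean, here is a hedged sketch on a toy data frame. The five values of numOfBedrooms are invented for the example, and `toy_homes`, `bedroom_mean`, and `bedroom_counts` are hypothetical names.

```r
library(tidyverse)

# Made-up data: five homes with invented bedroom counts
toy_homes <- tibble(
  id = 1:5,
  numOfBedrooms = c(3, 4, 3, 2, 4)
)

# Numerical treatment: summarize with an average
bedroom_mean <- toy_homes %>%
  summarize(mean_bedrooms = mean(numOfBedrooms))
bedroom_mean

# Categorical treatment: count how many homes fall in each category
bedroom_counts <- toy_homes %>%
  count(numOfBedrooms)
bedroom_counts
```

Both treatments use the same column. Which one fits depends on the question being asked.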
\(\bigstar\) Work with the people next to you to come up with some questions which would be interesting to investigate with our Austin housing data.
\(\bigstar\) What questions did we come up with?
\(\bigstar\) Take a few minutes to write those questions into your Day2to5_AustinHousingData.qmd file. Render your notebook to make sure everything looks the way you intended. Try some markdown formatting to improve the structure and readability of your notebook.
\(\bigstar\) Work with the people next to you to decide which of your questions are just about your sample data and which of your questions are about the entire population.
\(\bigstar\) Update your notebook to include two subsections – one with sample-level questions and the other with population-level questions. When finished, you should have two versions of every one of the questions you initially wrote down.
\(\bigstar\) What is the main difference in phrasing between descriptive (sample-level) questions and inferential (population-level) questions?
\(\bigstar\) If we are going to use our available data to answer inferential (population-level) questions, then what assumption(s) are we making?
\(\bigstar\) Can both types of question (descriptive and inferential) be answered simply by calculating summary statistics from our sample data? Why or why not?
\(\bigstar\) Without using R code just yet, describe what you would need to do in order to answer each of your descriptive questions. Add those descriptions to your notebook.
Render your notebook and make sure that the sections we’ve updated look as you intended them to.
We didn’t really use any R today, but we’ll pick up where we left off next time and actually use R to answer the descriptive questions we’ve posed.
Homework: Complete and submit the Topic 3 notebook at least 30 minutes before Monday’s class meeting. That notebook will give you many of the tools we’ll need for Monday.
Question: Moving forward, should we continue using slide decks like this one?