September 10, 2024
We’ll interact with R via RStudio in MAT300.
We’ll be using Quarto Documents/Notebooks for everything we do.
Open RStudio
Create a new project by navigating to File -> New Project
Name the project MAT300; you’ll include all of your notebooks for our class in this project
Now that you are in your new project space, create a new Quarto Document by navigating to File -> New File -> Quarto Document
There are three main dialects in R
Base-R
data.table (speed)
Tidy-R / tidyverse (readability and consistency)
Note: "Dialect" just refers to how we choose to write R code and which functions we prioritize; dialects can be (and often are) mixed.
install.packages("PACKAGE_NAME") to install a package
library(PACKAGE_NAME) to load the package
\(\bigstar\) Install the {tidyverse} and load it in your Quarto Notebook
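As a minimal sketch of the install-then-load pattern above: installation is a one-time download, so it is left commented out here on the assumption that you run it once from the Console, while the library() call goes in the notebook itself.

```r
# Run once (typically in the Console) to download the package:
# install.packages("tidyverse")

# Run at the top of each notebook or new R session to make the
# package's functions available.
library(tidyverse)
```

Note that install.packages() takes the package name in quotes, while library() accepts it unquoted.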
ASIDE: We store objects in R with the arrow operator (<-)
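A quick illustration of storing objects with the arrow operator. The object names and values below are made up for this example and are not part of the course materials.

```r
# Store a single value under a (hypothetical) name.
class_size <- 24

# Store a vector of values.
nightly_prices <- c(120, 85.5, 240)

# Typing an object's name prints its contents; stored objects can be
# passed to functions like any other value.
class_size
mean(nightly_prices)
```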
Reading Data: To read data, we use a function of the form read_*().
This requires {tidyverse} (or at least {readr}) to be loaded.
Other file types may need other packages ({readxl} or {haven} are common).
\(\bigstar\) Read this airbnb dataset into your Quarto Notebook from this link
I’ll post the link in Slack
Note: This AirBnB Europe data was uploaded to Kaggle by Dipesh Khemani. The original dataset can be found here.
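Since the real link is shared separately, here is a sketch of the read_*() pattern using a stand-in file. The temporary CSV, its column names, and its values are all invented for illustration; with the course data you would pass the posted link (in quotes) to read_csv() instead of csv_path.

```r
library(readr)  # also loaded by library(tidyverse)

# Stand-in for the course data: write a tiny CSV to a temporary file.
# Column names and values here are made up.
csv_path <- tempfile(fileext = ".csv")
writeLines(
  c("city,price,room_type",
    "Amsterdam,194.0,Private room",
    "Paris,296.2,Entire home/apt"),
  csv_path
)

# read_csv() accepts a file path or a URL and returns a tibble.
# Store the result with <- so it can be used later.
airbnb_demo <- read_csv(csv_path, show_col_types = FALSE)
airbnb_demo
```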
head() to view the first six rows
glimpse() to view dimensions and data types
skim() from {skimr} for much more detail
\(\bigstar\) Try these functions on your airbnb data
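For reference, here is roughly what those calls look like, using the built-in mtcars data frame as a stand-in for the airbnb data. The skim() call is guarded because {skimr} is a separate package that may not be installed yet.

```r
library(dplyr)  # provides glimpse(); also loaded by library(tidyverse)

head(mtcars)     # first six rows
glimpse(mtcars)  # dimensions, column names, types, and first few values

# skim() lives in {skimr}, so only call it if that package is installed.
if (requireNamespace("skimr", quietly = TRUE)) {
  print(skimr::skim(mtcars))
}
```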
Pipes (%>% or |>) make code more readable and allow chaining of functions together
\(\bigstar\) Rewrite the functions you used to explore your data with pipes
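To make the readability claim concrete, here is a small comparison of a nested call and the same steps written with a pipe. It uses the built-in mtcars data as a stand-in and the native pipe, which assumes R 4.1 or later; with {tidyverse} loaded, %>% could be swapped in here.

```r
# Nested call: must be read from the inside out.
nested <- head(subset(mtcars, cyl == 4), 3)

# Same steps with the native pipe: read top to bottom as
# "take mtcars, and then keep 4-cylinder cars, and then take 3 rows".
piped <- mtcars |>
  subset(cyl == 4) |>
  head(3)

piped
```

The pipe passes the result on its left as the first argument of the function on its right, so both versions compute the same thing.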
filter() to return only desired records
select() to return only desired columns
summarize() to compute summaries on a table
group_by() to create groups in a table
mutate() to create new columns or change existing ones
\(\bigstar\) How might we use these functions? Write down some questions that could be answered using the functions described above. Start with a couple very simple questions and then work up to questions whose answers might be more complex to find.
\(\bigstar\) We’ll try answering some of those questions now!
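As one possible pattern for going from a question to code with filter(), select(), mutate(), group_by(), and summarize(), here is a sketch on a tiny made-up table. The listings data, its column names, and the two questions are assumptions for illustration; the real airbnb columns will differ.

```r
library(dplyr)  # also loaded by library(tidyverse)

# A made-up table standing in for the airbnb data.
listings <- tibble(
  city      = c("Amsterdam", "Amsterdam", "Paris", "Paris", "Rome"),
  room_type = c("Private room", "Entire home/apt", "Private room",
                "Entire home/apt", "Private room"),
  price     = c(120, 300, 150, 280, 90),
  nights    = c(2, 3, 1, 4, 2)
)

# Simple question: which listings cost more than 100 per night?
listings |>
  filter(price > 100) |>
  select(city, price)

# More complex question: what is the average total stay cost per city?
avg_cost <- listings |>
  mutate(total_cost = price * nights) |>  # new column
  group_by(city) |>                       # one group per city
  summarize(mean_total = mean(total_cost))

avg_cost
```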
penguins data frame, and then
Note: the penguins data frame is not permanently altered here
Now the change is permanent because we’ve stored the result
Notice the use of the arrow operator (<-)
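The difference between a temporary and a permanent change can be sketched as follows. The birds data frame is a made-up stand-in for penguins, and the unit conversion is just an illustrative transformation, not the one from the original slide.

```r
library(dplyr)  # also loaded by library(tidyverse)

# Stand-in for the penguins data frame; values are made up.
birds <- tibble(
  species     = c("Adelie", "Gentoo"),
  body_mass_g = c(3700, 5000)
)

# The result is printed but not stored, so birds is unchanged.
birds |>
  mutate(body_mass_kg = body_mass_g / 1000)
cols_before <- ncol(birds)

# Storing the result with <- overwrites birds and keeps the new column.
birds <- birds |>
  mutate(body_mass_kg = body_mass_g / 1000)
cols_after <- ncol(birds)

c(cols_before, cols_after)
```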
Be careful overwriting existing objects – think about whether you:
Reminder: You have a fully complete (and documented) notebook using the mpg data on the course webpage. Note that this data is different from the airbnb data you worked with today.
Use this time to continue playing with the airbnb pricing data
Save your QMD file
Use the blue render button to convert your markdown document into a beautiful HTML document and enjoy the fruits of your labor!
Write down and answer additional interesting questions that might use functionality discussed in this slide deck – start simple and then build up to questions that might be more complex
Document your work by including text descriptions alongside the code chunks
Don’t worry if your document looks quite plain for now; we’ll have a full class meeting devoted to using markdown syntax in Quarto effectively
Data Visualization