Quiz 18 Instructions
Please complete the following questions and submit a file named Quiz18.R to Gradescope for autograding.
Remember:
Do not rename external data files or edit them in any way. In other words, don’t modify Urbana Police Incidents Data (2020-2023).csv. If you do, your code won’t work properly on my version of that data set.
Do not use absolute paths in your script. Instead, call setwd() interactively in the console; if a setwd() call ends up in your script, remove it or comment it out before you submit. The directory structure of your machine is not the same as the one on Gradescope’s virtual machines.
Do not destroy or overwrite any variables in your program. I check them only after I have run your entire program from start to finish.
Check to make sure you do not have any syntax errors. Code that doesn’t run will get a very bad grade.
Make sure to name your submission Quiz18.R.
Don’t forget to use library(tidyverse) at the beginning of Quiz18.R.
Tip: before submitting, it might help to clear all the objects from your workspace and then source your file. This will often uncover bugs.
Data Source:
The Urbana Open Data website (Urbana Police Incidents Dataset Link) contains police incident data from 1988 until February 2023, which is when the Urbana Police began using a new computer system.
In this quiz, we use only the data from 2020 to 2023, stored in Urbana Police Incidents Data (2020-2023).csv, and concentrate only on the columns related to date and time.
Question 1
- [1 pt] Read in the CSV data as a tibble named incidents and select only the columns from DATE.OCCURRED to TIME.ARRIVED (8 columns).
- [2 pts] Select the appropriate lubridate function to convert the variables DATE.OCCURRED, DATE.REPORTED, and DATE.ARRIVED from character to date type.
- [2 pts] Parse the day of the month on which the incidents occurred. Save the result as a vector named DAY.
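The three techniques named in Question 1 (selecting a contiguous range of columns, converting text to dates, and extracting the day of the month) can be sketched on a small toy tibble. Everything here is hypothetical: the column names (start_date, start_time) and the month/day/year format are assumptions for illustration, so check how the dates actually look in the CSV before choosing a lubridate parser.

```r
library(tidyverse)
library(lubridate)  # attached by recent tidyverse versions; loaded explicitly to be safe

# Hypothetical toy data; the real file's column names and date format may differ.
toy <- tibble(
  id         = 1:3,
  start_date = c("01/15/2021", "02/28/2022", "12/31/2022"),
  start_time = c("08:30:00", "23:59:59", "00:00:00"),
  note       = c("a", "b", "c")
)

# Keep a contiguous range of columns with the from:to selection syntax.
toy_sel <- toy %>% select(start_date:start_time)

# mdy() fits month/day/year text only; ymd() or dmy() fit other orderings.
toy_sel <- toy_sel %>% mutate(start_date = mdy(start_date))

# day() returns the day of the month for each date.
toy_day <- day(toy_sel$start_date)
```

The parser name mirrors the order of the components in the text (mdy for month-day-year, ymd for year-month-day, and so on), which is why inspecting a few raw values first is worthwhile.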
Question 2
[3 pts] Please
- combine DATE.OCCURRED and TIME.OCCURRED as DATETIME.OCCURRED;
- combine DATE.REPORTED and TIME.REPORTED as DATETIME.REPORTED;
- combine DATE.ARRIVED and TIME.ARRIVED as DATETIME.ARRIVED.
[Hint: We can combine the date “2022-01-04” and time “16:00:00” in the following way:
ymd_hms(paste("2022-01-04", "16:00:00"), tz = "America/Chicago")
]
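The hint works on single strings; the same idea applies column-wise inside mutate(). The following is a minimal sketch on toy data, assuming the date column has already been converted to Date and the time column holds "HH:MM:SS" text. The column names (start_date, start_time, start_datetime) are hypothetical.

```r
library(tidyverse)
library(lubridate)

# Hypothetical toy data: parsed dates plus times stored as "HH:MM:SS" text.
toy <- tibble(
  start_date = ymd(c("2022-01-04", "2022-07-04")),
  start_time = c("16:00:00", "09:15:30")
)

# paste() converts each Date to "YYYY-MM-DD" text and joins it to the time
# with a space; ymd_hms() then parses the result in the given time zone.
toy <- toy %>%
  mutate(start_datetime = ymd_hms(paste(start_date, start_time),
                                  tz = "America/Chicago"))
```

If the time strings in a data set lacked seconds, ymd_hm() would be the matching parser instead.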
Question 3
- [2 pts] Identify the top 5 incidents with the largest time difference between the time reported and the time occurred (i.e. DATETIME.REPORTED - DATETIME.OCCURRED or difftime(DATETIME.REPORTED, DATETIME.OCCURRED)). Please make sure to use as.duration() to convert the difftime object to a Duration object for easier interpretation. Name this difference variable report_occur_diff, and save the subset as a tibble named top5_rodiff.
[Hint: You may use the arrange() and head() or slice_head() functions.]
- [2 pts] Identify the top 5 incidents with the largest time difference between the time arrived and the time reported (i.e. DATETIME.ARRIVED - DATETIME.REPORTED or difftime(DATETIME.ARRIVED, DATETIME.REPORTED)). Name this difference variable arrive_report_diff, and save the subset as a tibble named top5_ardiff.
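The pattern behind both parts of Question 3 (subtract two date-times, convert to a Duration, sort in descending order, keep the first few rows) can be sketched on toy data. The tibble, its column names (occurred, reported, gap), and the choice of two rows instead of five are all hypothetical. Sorting on as.numeric(gap), the underlying seconds, is one conservative choice made for this sketch and not the only way to order the rows.

```r
library(tidyverse)
library(lubridate)

# Hypothetical toy data with two date-time columns.
toy <- tibble(
  id       = 1:4,
  occurred = ymd_hms(c("2022-01-01 10:00:00", "2022-01-02 12:00:00",
                       "2022-01-03 08:00:00", "2022-01-04 09:00:00"),
                     tz = "America/Chicago"),
  reported = ymd_hms(c("2022-01-01 10:30:00", "2022-01-05 12:00:00",
                       "2022-01-03 09:00:00", "2022-01-04 09:05:00"),
                     tz = "America/Chicago")
)

top2_gap <- toy %>%
  # Subtracting POSIXct columns gives a difftime; as.duration() converts it
  # to a lubridate Duration, stored in seconds and printed readably.
  mutate(gap = as.duration(reported - occurred)) %>%
  # Largest gaps first, ordered by the number of seconds in each Duration.
  arrange(desc(as.numeric(gap))) %>%
  # Keep the first two rows of the sorted tibble.
  slice_head(n = 2)
```

In this toy data the gaps are 30 minutes, 3 days, 1 hour, and 5 minutes, so the rows kept are ids 2 and 3, in that order.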