Question
1. Import the data file nyc2clt_flights.csv as a tibble object and name it nyc2clt. Read the warning message, and use mutate() to remove the first column.
2. Change the data types of nyc2clt so that year, month, and day are integer vectors and carrier is a factor.
3. Add a new column named date to nyc2clt. The new column should be a string containing the departure date in the form of year-month-day (e.g., 2013-1-1).
4. First, run the following code (include it in your script!) to generate a dataset without NAs: nyc2clt_clean <- nyc2clt %>% filter(!is.na(air_time)) Then, using nyc2clt_clean, create a summary table with four columns named carrier, count, mean_delay, and sd_delay. It should be a data frame (tibble) named delay_stats showing the number of flights, mean arrival delay, and standard deviation of arrival delay per carrier. Hint: Use group_by() and summarise(). You probably want to use mean(), sd(), and n() (which simply counts the number of occurrences).
5. Create a data frame named early that contains all flights that arrived early (i.e., arr_delay < 0) from nyc2clt, not from nyc2clt_clean, and then arrange them in ascending order of arr_delay (so the earliest arrivals are on top). The early data frame should contain only the following 6 columns: carrier, flight, air_time, dep_delay, arr_delay, and origin.
(( NOT ABLE TO UPDATE THE DATASET))
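Since the dataset itself is not attached, the following is only a rough sketch of how the five parts could be approached with dplyr and readr. It runs against a tiny made-up CSV (the toy_csv file and its rows are invented stand-ins, not the real nyc2clt_flights.csv), and it assumes the warning in part 1 is about an unnamed first column, which readr names X1 or ...1 depending on the version. With the real file, read_csv("nyc2clt_flights.csv") would replace the toy import.

```r
library(dplyr)
library(readr)

# Toy stand-in for nyc2clt_flights.csv: made-up rows, NOT the real data.
# The empty first header mimics a CSV that was saved with row names.
toy_csv <- tempfile(fileext = ".csv")
writeLines(c(
  ",year,month,day,carrier,flight,air_time,dep_delay,arr_delay,origin",
  "1,2013,1,1,US,1,85,-3,-10,JFK",
  "2,2013,1,1,US,2,90,5,12,LGA",
  "3,2013,1,2,EV,3,NA,NA,NA,EWR",
  "4,2013,1,2,EV,4,80,0,-4,EWR",
  "5,2013,1,3,EV,5,88,20,30,EWR"
), toy_csv)

# 1. Import as a tibble, then drop the first column with mutate().
#    The column is looked up by position because its auto-generated
#    name differs between readr versions.
nyc2clt <- read_csv(toy_csv)
first_col <- names(nyc2clt)[1]
nyc2clt <- nyc2clt %>% mutate(!!first_col := NULL)

# 2. Convert year, month, day to integer and carrier to factor.
nyc2clt <- nyc2clt %>%
  mutate(
    year = as.integer(year),
    month = as.integer(month),
    day = as.integer(day),
    carrier = as.factor(carrier)
  )

# 3. Add the departure date as a string such as "2013-1-1".
nyc2clt <- nyc2clt %>%
  mutate(date = paste(year, month, day, sep = "-"))

# 4. Remove rows with missing air_time, then summarise per carrier.
nyc2clt_clean <- nyc2clt %>% filter(!is.na(air_time))

delay_stats <- nyc2clt_clean %>%
  group_by(carrier) %>%
  summarise(
    count = n(),
    mean_delay = mean(arr_delay),
    sd_delay = sd(arr_delay)
  )

# 5. Flights that arrived early, most-early first, six columns only.
#    filter() also drops rows where arr_delay is NA.
early <- nyc2clt %>%
  filter(arr_delay < 0) %>%
  arrange(arr_delay) %>%
  select(carrier, flight, air_time, dep_delay, arr_delay, origin)
```

If arr_delay in the real data still contains NAs after the air_time filter, mean() and sd() would return NA for that carrier; passing na.rm = TRUE to both is one way to handle that.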