Question
In this project you will be working with R, SQL, and Python in the same document. We will use the data sets airlines and flights from the package nycflights13.
Knit the empty file (to make sure everything is working)
Consistently knit the file each time you answer a question
In R, install the packages nycflights13, sqldf, and tidyverse, then load all data sets and packages. Take your time to understand the data sets.
```{r message=FALSE, warning=FALSE}
# Install code here - comment the code when you finish installation
# Load all packages here
# Load the data here
```
Question 1: List the names of the airlines that fly to ATL airport together with their average arrival delays, sorted from the smallest to the largest average arrival delay. Use the data frames flights and airlines.
We shall solve this question using R, SQL, and Python.
R solution
You can use the R command ```inner_join()``` to join two tables.
```{r}
# Code here
```
SQL solution
Write your SQL query inside the function sqldf(). For example, sqldf("select * from relig_income") lists the data frame relig_income.
```{r, warning=FALSE, message=FALSE }
# code here
```
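As a rough illustration of the query shape this question asks for (join, filter, group, sort), here is a sketch that runs the SQL through Python's built-in sqlite3 module instead of sqldf(). It assumes the same SQL would be accepted by sqldf(), which uses SQLite by default. The rows are a tiny invented stand-in that only borrows the column names carrier, dest, arr_delay, and name from nycflights13; they are not real flight records.

```python
import sqlite3

# Toy stand-in tables: invented rows, column names borrowed from nycflights13.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE airlines (carrier TEXT, name TEXT)")
conn.execute("CREATE TABLE flights (carrier TEXT, dest TEXT, arr_delay REAL)")
conn.executemany(
    "INSERT INTO airlines VALUES (?, ?)",
    [("AA", "Airline A"), ("BB", "Airline B")],
)
conn.executemany(
    "INSERT INTO flights VALUES (?, ?, ?)",
    [
        ("AA", "ATL", 10.0),
        ("AA", "ATL", 20.0),
        ("BB", "ATL", 5.0),
        ("BB", "ATL", None),   # missing delay; AVG() ignores NULLs
        ("BB", "BOS", 100.0),  # other destination, filtered out by WHERE
    ],
)

# Join flights to airline names, keep ATL arrivals, average per airline,
# and sort from the smallest to the largest average.
query = """
    SELECT a.name, AVG(f.arr_delay) AS avg_arr_delay
    FROM flights AS f
    INNER JOIN airlines AS a ON f.carrier = a.carrier
    WHERE f.dest = 'ATL'
    GROUP BY a.name
    ORDER BY avg_arr_delay ASC
"""
rows = conn.execute(query).fetchall()
conn.close()
print(rows)
```

The query string is the part that carries over: passed to sqldf() with the real flights and airlines data frames loaded, it should need no structural change.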
Python solution
Use [df.join()](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.join.html) from pandas.
```{python}
# load python libraries
# load data
# code here
```
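One detail of df.join() is easy to miss: it matches a column of the calling frame against the index of the other frame, so the lookup table has to be indexed by the join key first. The sketch below shows that pattern on a small invented stand-in (the column names come from nycflights13, the rows do not), followed by the group-and-sort step. It is one possible approach, not the only one.

```python
import pandas as pd

# Toy stand-in frames: invented rows, column names borrowed from nycflights13.
flights = pd.DataFrame({
    "carrier": ["AA", "AA", "BB", "BB", "BB"],
    "dest": ["ATL", "ATL", "ATL", "ATL", "BOS"],
    "arr_delay": [10.0, 20.0, 5.0, None, 100.0],
})
airlines = pd.DataFrame({
    "carrier": ["AA", "BB"],
    "name": ["Airline A", "Airline B"],
})

# Keep only flights that arrive at ATL.
atl = flights[flights["dest"] == "ATL"]

# join() matches the caller's `on` column against the other frame's index,
# so airlines is indexed by carrier before joining.
joined = atl.join(airlines.set_index("carrier"), on="carrier", how="inner")

# mean() skips missing delays by default; sort from smallest to largest.
avg_delay = (
    joined.groupby("name")["arr_delay"]
    .mean()
    .sort_values()
    .reset_index(name="avg_arr_delay")
)
print(avg_delay)
```

With the real data, only the two DataFrame definitions would change; the filter, join, and aggregation lines stay the same.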
Question 2: Draw a boxplot of the departure delays by airline name for flights whose destination is ATL airport. Solve this question using R and Python.
R solution
Use ggplot
```{r, warning=FALSE, message=FALSE }
# code here
```
Python solution
You can use [pandas.DataFrame.boxplot()](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.boxplot.html) or [seaborn.boxplot()](https://seaborn.pydata.org/generated/seaborn.boxplot.html). You can either import the data from the R package into Python or use the Python data package [nycflights13](https://pypi.org/project/nycflights13/).
```{python}
# import python libraries
# data
# Boxplot
```
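For orientation, a grouped boxplot with pandas could look roughly like the sketch below. It assumes the flights have already been filtered to ATL and joined to the airline names, and it uses a handful of invented rows in place of the real data. The output file name is a placeholder chosen for this example.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the script runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Toy stand-in: invented rows, already joined so each flight carries the airline name.
atl = pd.DataFrame({
    "name": ["Airline A"] * 4 + ["Airline B"] * 4,
    "dest": ["ATL"] * 8,
    "dep_delay": [-3.0, 0.0, 12.0, 45.0, -5.0, 2.0, 8.0, None],
})

# Drop flights with no recorded departure delay before plotting.
plot_data = atl.dropna(subset=["dep_delay"])

fig, ax = plt.subplots(figsize=(6, 4))
# One box per airline name.
plot_data.boxplot(column="dep_delay", by="name", ax=ax)
ax.set_xlabel("Airline")
ax.set_ylabel("Departure delay (minutes)")
ax.set_title("Departure delays for flights to ATL")
fig.suptitle("")  # clear the automatic "Boxplot grouped by name" title
fig.savefig("atl_dep_delay_boxplot.png")  # placeholder file name
```

Inside a knitted document the figure is usually shown with plt.show() rather than saved, and the backend line can be dropped.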
Question 3: For each airline, 1) find the month in which the average departure delay is the highest in the year, and 2) make a visualization to show the results. Solve this question using your preferred language, R or Python.
```{}
# Do not forget to specify which language you are using.
# load libraries
# load data
# code here to find the months
# code here to make the visualization
```
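If Python is the chosen language, the "worst month per airline" step can be sketched as two group-by passes: first average the delay per carrier and month, then pick the month with the largest average within each carrier. The rows below are invented for illustration and only reuse the nycflights13 column names carrier, month, and dep_delay.

```python
import pandas as pd

# Toy stand-in: invented rows, column names borrowed from nycflights13.
flights = pd.DataFrame({
    "carrier": ["AA", "AA", "AA", "AA", "BB", "BB", "BB", "BB"],
    "month": [1, 1, 7, 7, 1, 1, 12, 12],
    "dep_delay": [5.0, 15.0, 30.0, 50.0, 20.0, 40.0, 2.0, None],
})

# Pass 1: average departure delay per (carrier, month); mean() skips missing values.
monthly = (
    flights.groupby(["carrier", "month"], as_index=False)["dep_delay"]
    .mean()
    .rename(columns={"dep_delay": "avg_dep_delay"})
)

# Pass 2: for each carrier, keep the row whose monthly average is the largest.
worst = monthly.loc[monthly.groupby("carrier")["avg_dep_delay"].idxmax()]
worst = worst.reset_index(drop=True)
print(worst)
```

For the visualization, one simple option is a bar chart such as worst.plot.bar(x="carrier", y="avg_dep_delay"), with the month added as a label. Joining worst to airlines beforehand would show airline names instead of carrier codes.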