Question
You will be working with data on Airbnb listings in Edinburgh, Scotland. The data has the following variables:
id - the listing's ID number
price - the price, in GBP, for one night stay
neighbourhood - the neighbourhood the listing is located in
accommodates - number of people the listing accommodates
bathrooms - the listing's number of bathrooms
bedrooms - the listing's number of bedrooms
beds - the listing's number of beds (which can be different than the number of bedrooms)
review_scores_rating - the average customer rating of the property (ranges from 0 to 100)
number_of_reviews - the number of reviews included in the rating
listing_url - the URL for the listing
Part 1: Load the data and packages
Part 2: Explore the data
Explore the data using 5 functions: dim(), str(), colnames(), head() and tail(). Change the data types as appropriate.
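As a rough illustration of the five functions and a type conversion, here is a minimal sketch. It assumes a small made-up data frame named listings (the real file name and values are not given here), with price deliberately stored as text so there is something to convert.

```r
# Toy stand-in for the real listings data; every value is invented.
listings <- data.frame(
  id = c(101, 102, 103, 104),
  price = c("55", "120", "80", "200"),   # stored as text on purpose
  neighbourhood = c("Leith", "Old Town", "Leith", "New Town"),
  accommodates = c(2, 4, 3, 6),
  stringsAsFactors = FALSE
)

dim(listings)        # number of rows and columns
str(listings)        # type of each column
colnames(listings)   # variable names
head(listings, 2)    # first rows
tail(listings, 2)    # last rows

# Convert columns whose stored type does not match their meaning.
listings$price <- as.numeric(listings$price)
listings$accommodates <- as.integer(listings$accommodates)
```

Which columns need converting in the real data depends on what str() reports, so treat the two conversions above as examples only.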
Part 3: Handling missing values
Identify missing values
Check the data for missing values. Hint: Pipe the data frame into the function summarise_all(~sum(is.na(.))).
You should set missing values for neighbourhood to "Unknown" and delete all records with missing values for any of the other variables. Write code to accomplish this objective in the code block below.
Hint: Use ifelse to assign new values to neighbourhood and use na.omit to remove all rows with missing values for any variable.
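The two hints can be combined roughly as follows. This is a sketch on an invented data frame, assuming dplyr is installed and that neighbourhood is a character column (with a factor column, ifelse would return the underlying integer codes instead of the labels).

```r
library(dplyr)

# Toy data with missing values; every value is invented.
listings <- data.frame(
  id = 1:5,
  price = c(55, NA, 80, 200, 95),
  neighbourhood = c("Leith", NA, "Old Town", NA, "New Town"),
  beds = c(1, 2, NA, 3, 2),
  stringsAsFactors = FALSE
)

# Count the missing values in each column, as suggested in the hint.
listings %>% summarise_all(~ sum(is.na(.)))

cleaned <- listings %>%
  # Recode missing neighbourhoods first so those rows survive na.omit().
  mutate(neighbourhood = ifelse(is.na(neighbourhood), "Unknown", neighbourhood)) %>%
  # Drop rows that still have a missing value in any other column.
  na.omit()
```

The order matters: running na.omit() before the recode would also delete the rows whose only missing value is neighbourhood.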
Question 1
For this question, you will explore the price of Airbnb listings, including how price varies across neighbourhood.
Part 1
Use dplyr to compute the mean, median, standard deviation, max, and min for price.
Part 2
Explore how price differs across neighbourhood by repeating Part 1 but grouping by neighbourhood.
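One possible shape for Parts 1 and 2 is sketched below on invented data, assuming dplyr is installed. The helper price_stats is a hypothetical name introduced only to avoid writing the same summarise() call twice.

```r
library(dplyr)

# Toy data; every value is invented.
listings <- data.frame(
  neighbourhood = c("Leith", "Leith", "Old Town", "Old Town", "Old Town"),
  price = c(50, 70, 100, 150, 200)
)

# Hypothetical helper: the five summary statistics for price.
price_stats <- function(df) {
  df %>%
    summarise(
      mean_price = mean(price),
      median_price = median(price),
      sd_price = sd(price),
      max_price = max(price),
      min_price = min(price)
    )
}

overall <- price_stats(listings)            # one row for all listings
by_area <- listings %>%
  group_by(neighbourhood) %>%
  price_stats()                             # one row per neighbourhood
```

Because summarise() respects grouping, the same helper returns one row overall and one row per neighbourhood once the data are grouped.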
Part 3
Use ggplot2 to create data visualizations showing how price differs across neighbourhood. Hint: use coord_flip() to make the plot more readable.
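A boxplot per neighbourhood is one reasonable choice here, though not the only one. The sketch below uses invented data and assumes ggplot2 is installed; ordering the neighbourhoods by median price is an optional extra, not something the question asks for.

```r
library(ggplot2)

# Toy data; every value is invented.
listings <- data.frame(
  neighbourhood = rep(c("Leith", "Old Town", "New Town"), each = 4),
  price = c(45, 60, 75, 90, 80, 110, 150, 400, 70, 95, 120, 160)
)

p <- ggplot(
  listings,
  # Order neighbourhoods by median price so the boxes are easy to compare.
  aes(x = reorder(neighbourhood, price, FUN = median), y = price)
) +
  geom_boxplot() +
  coord_flip() +   # put neighbourhood names on the vertical axis
  labs(x = "Neighbourhood", y = "Price per night (GBP)")

p
```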
Part 4
Comment on your findings from Part 2 and 3: what did you learn about how price differs across neighbourhood?
Question 2
For this question, you will explore how the price of Airbnb listings relates to how many people the listing accommodates.
Part 1
Create a histogram for accommodates to understand the variation in the data. Be sure to experiment with the number of bins.
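To show what experimenting with bins can look like, the sketch below draws the same invented data twice, once with a small number of bins and once with a bin width of one person. It assumes ggplot2 is installed.

```r
library(ggplot2)

# Toy data; every value is invented.
listings <- data.frame(
  accommodates = c(1, 2, 2, 2, 3, 4, 4, 4, 4, 5, 6, 6, 8, 10, 16)
)

# Coarse view: five bins across the whole range.
p_coarse <- ggplot(listings, aes(x = accommodates)) +
  geom_histogram(bins = 5)

# Fine view: one bin per whole number of guests.
p_fine <- ggplot(listings, aes(x = accommodates)) +
  geom_histogram(binwidth = 1)

p_coarse
p_fine
```

Since accommodates is a count, a bin width of 1 keeps each bar tied to a single value, while fewer bins give a smoother overall shape.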
Part 2
Comment on your findings from Part 1 - what did you learn about the distribution of the number of people a listing can accommodate?
Part 3
Compute the correlation between price and accommodates.
Part 4
Create a visualization showing the relation between price and accommodates.
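Parts 3 and 4 could be approached along these lines. The data are invented and ggplot2 is assumed to be installed; cor() computes the Pearson correlation by default.

```r
library(ggplot2)

# Toy data; every value is invented.
listings <- data.frame(
  price = c(40, 55, 60, 90, 85, 120, 150, 300),
  accommodates = c(1, 2, 2, 4, 4, 5, 6, 8)
)

# Pearson correlation between price and accommodates.
r <- cor(listings$price, listings$accommodates)
r

# Scatter plot; partial transparency helps when points overlap.
p <- ggplot(listings, aes(x = accommodates, y = price)) +
  geom_point(alpha = 0.6) +
  labs(x = "Number of guests accommodated", y = "Price per night (GBP)")

p
```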
Part 5
Comment on your findings from Part 3 and 4: what did you learn about how a listing's price changes with the number of people the listing accommodates?
Question 3
For this question, you will explore how the number of people a listing accommodates is related to the listing's number of beds and its price.
Part 1
Create a visualization showing the relation between accommodates and beds. Add a reference line to the graphic using geom_abline().
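A natural reference line is the one where accommodates equals beds, that is, slope 1 and intercept 0. The sketch below assumes that choice, uses invented data, and requires ggplot2.

```r
library(ggplot2)

# Toy data; every value is invented.
listings <- data.frame(
  accommodates = c(2, 2, 4, 4, 6, 6, 8),
  beds = c(1, 2, 2, 4, 3, 6, 5)
)

p <- ggplot(listings, aes(x = beds, y = accommodates)) +
  geom_point(alpha = 0.6) +
  # Reference line accommodates = beds; points above it have
  # more guests than beds.
  geom_abline(intercept = 0, slope = 1, linetype = "dashed") +
  labs(x = "Number of beds", y = "Number of guests accommodated")

p
```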
Part 2
Comment on your findings from Part 1 - are there always enough beds for the number of people the listing claims it accommodates?
Part 3
Create a new variable called short_on_beds that equals "Not enough beds" when accommodates > beds and "Enough beds" otherwise.
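The definition above translates almost directly into a mutate() call with ifelse(). A minimal sketch on invented data, assuming dplyr is installed:

```r
library(dplyr)

# Toy data; every value is invented.
listings <- data.frame(
  accommodates = c(2, 4, 6, 2),
  beds = c(2, 2, 3, 3)
)

listings <- listings %>%
  mutate(
    short_on_beds = ifelse(accommodates > beds, "Not enough beds", "Enough beds")
  )

listings
```

Note that a listing with exactly as many beds as guests falls into "Enough beds", because the comparison is strictly greater than.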
Part 4
Create a visualization to show how the relation between price and accommodates changes with the variable short_on_beds. Be sure to include a trend line.
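One way to do this is to colour the points by short_on_beds and fit one trend line per group. The sketch below assumes a linear trend via geom_smooth(method = "lm"), which is a modelling choice rather than a requirement of the question; the data are invented and ggplot2 is assumed to be installed.

```r
library(ggplot2)

# Toy data; every value is invented.
listings <- data.frame(
  price = c(50, 80, 120, 160, 45, 70, 95, 130),
  accommodates = c(2, 4, 6, 8, 2, 4, 6, 8),
  short_on_beds = rep(c("Enough beds", "Not enough beds"), each = 4)
)

p <- ggplot(
  listings,
  aes(x = accommodates, y = price, colour = short_on_beds)
) +
  geom_point() +
  # Mapping colour to short_on_beds gives one fitted line per group.
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  labs(
    x = "Number of guests accommodated",
    y = "Price per night (GBP)",
    colour = NULL
  )

p
```

Comparing the slopes of the two lines shows whether price rises faster with capacity in one group than in the other.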
Part 5
Comment on your findings from Part 4 - how does the relationship between accommodates and price change when there are not enough beds to sleep all guests (i.e., short_on_beds equals "Not enough beds")?
Question 4
For this question, you will explore the relation between a listing's price per night and its average rating (i.e., review_scores_rating), including how the relation changes according to the size of the listing (you will create a size variable using the accommodates variable).
Part 1
Compute the correlation between price and review_scores_rating.
Part 2
Create a graphic showing the relationship between price and review_scores_rating.
Part 3
Comment on your findings from Part 1 and 2 - based on your preliminary analysis, is there a relation between price and review_scores_rating?
Part 4
You remembered that there is a variable number_of_reviews that gives the number of reviews included in the listing's average rating (i.e., review_scores_rating). You wonder if your preliminary analysis would change after removing listings with fewer than 20 ratings, which might have unreliable ratings.
Compute the correlation between price and review_scores_rating after removing listings with fewer than 20 ratings.
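Removing listings with fewer than 20 ratings means keeping those with number_of_reviews of 20 or more. A sketch on invented data, assuming dplyr is installed:

```r
library(dplyr)

# Toy data; every value is invented.
listings <- data.frame(
  price = c(60, 75, 90, 120, 200, 85),
  review_scores_rating = c(100, 92, 95, 97, 60, 94),
  number_of_reviews = c(3, 25, 20, 150, 19, 60)
)

# Keep listings with at least 20 ratings (a listing with exactly 20 stays).
reliable <- listings %>% filter(number_of_reviews >= 20)

# Correlation before and after the filter, for comparison.
r_all <- cor(listings$price, listings$review_scores_rating)
r_reliable <- cor(reliable$price, reliable$review_scores_rating)
```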
Part 5
Create a graphic showing the relation between price and review_scores_rating after removing listings with fewer than 20 ratings.
Part 6
Comment on your findings from Part 4 and 5.
Part 7
You are not ready to draw any final conclusions about the relation between price and review_scores_rating. You wonder if the size of the listing changes the relation.
First, add a variable big that equals "Big" if accommodates > 7 and "Small" otherwise. Then, add a graphic showing how the relationship between price and review_scores_rating changes according to big. Be sure to remove listings with fewer than 20 ratings.
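The steps above can be sketched as a filter, a mutate, and a faceted scatter plot. Faceting by big and adding a linear trend line are presentation choices assumed here, not requirements; the data are invented, and dplyr and ggplot2 are assumed to be installed.

```r
library(dplyr)
library(ggplot2)

# Toy data; every value is invented.
listings <- data.frame(
  price = c(60, 80, 95, 130, 250, 310, 400, 90, 280),
  accommodates = c(2, 2, 4, 4, 8, 10, 12, 3, 9),
  review_scores_rating = c(92, 96, 90, 95, 94, 97, 91, 100, 80),
  number_of_reviews = c(30, 45, 22, 80, 25, 60, 33, 4, 10)
)

plot_data <- listings %>%
  filter(number_of_reviews >= 20) %>%                       # drop thinly rated listings
  mutate(big = ifelse(accommodates > 7, "Big", "Small"))    # size label

p <- ggplot(plot_data, aes(x = review_scores_rating, y = price)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  facet_wrap(~ big) +   # one panel per size group
  labs(x = "Average rating", y = "Price per night (GBP)")

p
```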
Part 8
Comment on your findings from Part 7.
Part 9
You wonder if your findings from Part 7 are robust. Experiment with different ways of defining a "Big" listing based on accommodates.
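One way to experiment is to treat the cutoff as a parameter and recompute the within-group correlation for several values. The helper cor_by_cutoff and the cutoffs 3, 5, and 7 are hypothetical choices made for this sketch; the data are invented and dplyr is assumed to be installed.

```r
library(dplyr)

# Toy data; every value is invented.
listings <- data.frame(
  price = c(50, 65, 80, 95, 120, 150, 210, 260, 320, 400),
  accommodates = c(1, 2, 2, 3, 4, 5, 6, 8, 10, 12),
  review_scores_rating = c(90, 94, 88, 97, 92, 95, 89, 96, 93, 98),
  number_of_reviews = c(40, 22, 35, 60, 28, 75, 31, 45, 21, 50)
)

# Hypothetical helper: correlation of price and rating within each
# size group, for a given definition of "Big".
cor_by_cutoff <- function(df, cutoff) {
  df %>%
    filter(number_of_reviews >= 20) %>%
    mutate(big = ifelse(accommodates > cutoff, "Big", "Small")) %>%
    group_by(big) %>%
    summarise(
      n = n(),
      r = cor(price, review_scores_rating),
      .groups = "drop"
    ) %>%
    mutate(threshold = cutoff)
}

# Stack the results for several cutoffs into one table.
results <- bind_rows(lapply(c(3, 5, 7), cor_by_cutoff, df = listings))
results
```

If the sign and rough size of r within each group stay similar across thresholds, the pattern is less likely to be an artifact of one particular cutoff. Keeping the group sizes (n) in the table helps flag correlations based on very few listings.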
Part 10
Is the pattern you identified in Part 6 and 7 robust?
Question 5
What is interesting to learn or explore in this data set? For this question, you are required to write your own question and try to answer it by writing R code.
Part 1
Write your question.
Part 2
Write code to answer the question you asked in Part 1.
Part 3
Interpret your findings from Part 2 - what is the answer to your question from Part 1?