Question
Please answer the following coding question AND EXPLAIN WHAT YOU DID IN ORDER TO GET GOOD FEEDBACK! I need to understand what you did in order to understand the topic :)
DOWN BELOW IS THE REFERENCE/DATA NEEDED TO COMPLETE THESE QUESTIONS. ALL THE DATA AND INFORMATION YOU NEED IS HERE!:
the dataset you will be working with: https://1drv.ms/x/s!AtfXPbdjkmO7oJpxpD6shqkKNDERpQ?e=U1ORwc
another way to view the data file: https://drive.google.com/file/d/1e_c1Qsi648BSTGSq78aRbbfwUc5gD5fP/view?usp=sharing
All substantial questions need explanations. You do not have to explain simple things like "how many rows are there in the data", but if you make a plot of global temperature, you should explain what you see there!
------------------------------------------------------------------------------------------------------------------------------------------------------------
2. Model AirBnB Price: Your second task is to analyze the Bangkok AirBnB listing price (variable price). The data originates from Inside Airbnb (https://insideairbnb.com/get-the-data/), but use the version on Canvas (airbnb-bangkok-listings.csv). You need to work with several sorts of categorical variables, including those that contain far too many categories that are too small. You are also asked to do log-transforms and interpret the results.
2.1 Load and clean:
1. Load data. I recommend selecting only the variables you need below: bedrooms, price, and accommodates. You may return here and change the variable selection as needed. Even better, check out the usecols argument for read_csv. Do the basic checks.
2. Do the basic data cleaning: (a) convert price to numeric. Hint: you may want to fix its string representation first! (b) remove entries with missing or invalid price, bedrooms, and other variables you need below.
3. Analyze the distribution of price. Does it look normal? Does it look like something else? Does it suggest you should make a log-transformation? Hint: consult lecture notes Section 2.1.8 Feature Transformations.
4. Convert the number of bedrooms into another variable with a limited number of categories only, such as 1, 2, 3, 4+, and use these categories in the models below.
2.2 Model:
1. Run a linear regression where you explain the listing price with the number of bedrooms, where bedrooms uses the categories you made above. Interpret the results, including R2. Hint: if 1-BR is the reference category, the effect for 2BR should be 1500 (but it depends on exactly how you cleaned the data).
2. Now repeat the process with a model where you analyze log price instead of price. Interpret the results. Which model behaves better in the sense of R2? Hint: if you cleaned the data the same way as me, you should see R2 = 0.203. For the following tasks, use either log(price) or price, depending on your answer here.
3. Finally, add two more variables to the model: room type and accommodates. While room type contains only three values, accommodates contains many different categories. Recode accommodates as "1", "2", "3", "4 and more". Run this model. Interpret and comment on the more interesting/important results. Do not forget to mention the relevant reference categories and what R2 shows.
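For 2.1 (steps 1 and 2), one possible way to load and clean the data is sketched here. This is only an illustration under assumptions: the in-memory CSV is a hypothetical stand-in for airbnb-bangkok-listings.csv, the price strings are assumed to look like "$1,500.00", the room_type column name is assumed, and treating non-positive prices as invalid is a judgment call (it is needed if you take logs later). The helper name load_and_clean is made up for this example.

```python
import io
import pandas as pd

# Hypothetical stand-in for airbnb-bangkok-listings.csv (made-up rows).
RAW_CSV = io.StringIO(
    'id,price,bedrooms,accommodates,room_type\n'
    '1,"$1,500.00",1,2,Entire home/apt\n'
    '2,$800.00,,1,Private room\n'
    '3,,2,4,Entire home/apt\n'
    '4,$0.00,1,2,Private room\n'
    '5,"$12,000.00",3,6,Entire home/apt\n'
)

# Assumed column names; adjust to what the real file contains.
COLS = ["price", "bedrooms", "accommodates", "room_type"]


def load_and_clean(source):
    """Read only the needed columns and return rows with usable values."""
    # usecols keeps just the variables used in the later models.
    df = pd.read_csv(source, usecols=COLS)
    # Strip "$" and thousands separators, then convert to numbers;
    # anything that still cannot be parsed becomes NaN.
    price = df["price"].astype(str).str.replace(r"[$,]", "", regex=True)
    df["price"] = pd.to_numeric(price, errors="coerce")
    # Drop rows with any missing value in the model variables.
    df = df.dropna(subset=COLS)
    # Assumption: a price of zero or less is invalid.
    df = df[df["price"] > 0]
    return df.reset_index(drop=True)


listings = load_and_clean(RAW_CSV)
print(listings.shape)   # basic check: rows and columns left
print(listings.dtypes)  # basic check: price should now be numeric
```

With the real file you would pass its path instead of RAW_CSV. It is worth printing how many rows each cleaning step removes, since the hints in 2.2 depend on exactly how the data were cleaned.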
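For 2.1 (step 3), a numeric check can complement a histogram when judging whether price looks normal. The sketch below uses synthetic right-skewed numbers, not the Bangkok data, purely to show the pattern to look for: a mean well above the median and a large positive skewness for price, with both largely disappearing after taking logs.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Synthetic right-skewed "prices" for illustration only (NOT the real data).
price = pd.Series(rng.lognormal(mean=7.0, sigma=0.8, size=5000), name="price")
log_price = np.log(price)

# Compare centre and asymmetry before and after the log-transform.
summary = pd.DataFrame(
    {
        "mean": [price.mean(), log_price.mean()],
        "median": [price.median(), log_price.median()],
        "skew": [price.skew(), log_price.skew()],
    },
    index=["price", "log(price)"],
)
print(summary.round(2))
```

If the real price column shows the same pattern (long right tail, skewness near zero after logs), that suggests the log-transformation asked about in the task. Plotting price.plot.hist(bins=50) next to the same plot for log_price shows the same thing visually.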
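For 2.1 (step 4) and 2.2 (step 1), the following sketch shows one way to build the bedroom categories and fit the price model with 1-BR as the reference category. The data are toy values invented for this example, and the regression is done with plain NumPy least squares on dummy variables rather than with any particular course library. It assumes bedrooms is at least 1 after cleaning; a value of 0 would need its own category.

```python
import numpy as np
import pandas as pd

# Toy data with made-up values, only to show the mechanics.
df = pd.DataFrame(
    {
        "bedrooms": [1, 1, 1, 2, 2, 3, 4, 6],
        "price": [1000, 1200, 1100, 2000, 2400, 4000, 6000, 8000],
    }
)

# Cap at 4 so that 4, 5, 6, ... all fall into a single "4+" category.
capped = df["bedrooms"].clip(upper=4).astype(int)
df["br_cat"] = capped.map({1: "1", 2: "2", 3: "3", 4: "4+"})

# Dummy-code the categories and drop "1" so it becomes the reference.
X = pd.get_dummies(df["br_cat"], prefix="br", dtype=float).drop(columns="br_1")
X.insert(0, "intercept", 1.0)
y = df["price"].to_numpy(dtype=float)

# Ordinary least squares: minimise the squared residuals of y on X.
beta, *_ = np.linalg.lstsq(X.to_numpy(), y, rcond=None)
coefs = pd.Series(beta, index=X.columns)

# R2 = 1 - residual sum of squares / total sum of squares.
fitted = X.to_numpy() @ beta
r2 = 1 - ((y - fitted) ** 2).sum() / ((y - y.mean()) ** 2).sum()

print(coefs.round(1))
print(round(r2, 3))
```

Reading the output: the intercept is the average price of the reference category (1-BR), and each other coefficient is the difference in average price between that category and 1-BR. R2 is the share of price variation explained by the bedroom categories alone. A formula interface such as statsmodels' smf.ols("price ~ C(br_cat)", data=df) should give the same coefficients, along with standard errors.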
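For 2.2 (steps 2 and 3), the sketch below fits a log(price) model with bedroom category, room type, and recoded accommodates, then converts the coefficients into percentage differences. Everything here is synthetic: the data are generated with made-up effects (0.40 for a second bedroom, -0.50 for a shared room), so the numbers say nothing about the Bangkok listings. The column names br_cat and acc_cat are hypothetical.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
n = 2000

# Synthetic listings for illustration only (NOT the real data).
df = pd.DataFrame(
    {
        "br_cat": rng.choice(["1", "2", "3", "4+"], size=n),
        "room_type": rng.choice(
            ["Entire home/apt", "Private room", "Shared room"], size=n
        ),
        "accommodates": rng.integers(1, 9, size=n),  # values 1..8
    }
)

# Recode accommodates into "1", "2", "3", "4 and more".
df["acc_cat"] = df["accommodates"].clip(upper=4).map(
    {1: "1", 2: "2", 3: "3", 4: "4 and more"}
)

# Made-up data-generating process for log(price).
df["log_price"] = (
    7.0
    + 0.40 * (df["br_cat"] == "2")
    - 0.50 * (df["room_type"] == "Shared room")
    + rng.normal(0, 0.3, size=n)
)

# drop_first=True drops the first (sorted) level of each variable, so the
# reference categories are br_cat "1", "Entire home/apt", and acc_cat "1".
X = pd.get_dummies(
    df[["br_cat", "room_type", "acc_cat"]], drop_first=True, dtype=float
)
X.insert(0, "intercept", 1.0)
y = df["log_price"].to_numpy()

beta, *_ = np.linalg.lstsq(X.to_numpy(), y, rcond=None)
coefs = pd.Series(beta, index=X.columns)

resid = y - X.to_numpy() @ beta
r2 = 1 - (resid ** 2).sum() / ((y - y.mean()) ** 2).sum()

# In a log model, exp(b) - 1 is the relative price difference
# versus the reference category, holding the other variables fixed.
pct = (np.exp(coefs.drop("intercept")) - 1) * 100
print(pct.round(1))
print(round(r2, 3))
```

When interpreting the real results, state the reference categories explicitly, since every coefficient is a comparison against them. For small coefficients, 100 times the coefficient is a reasonable approximation of the percentage difference; for larger ones, use exp(b) - 1 as above.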