
Question


Python and statistics!!

Please answer the following with code AND EXPLAIN WHAT YOU DID IN ORDER TO GET GOOD FEEDBACK! I need to understand what you did in order to understand the topic :)

the dataset you will be working with: https://1drv.ms/x/s!AtfXPbdjkmO7oJpxpD6shqkKNDERpQ?e=U1ORwc

another way to view the data file: https://drive.google.com/file/d/1e_c1Qsi648BSTGSq78aRbbfwUc5gD5fP/view?usp=sharing

All substantial questions need explanations. You do not have to explain the simple things like "how many rows are there in the data", but if you make a plot of global temperature, you should explain what you see there!

------------------------------------------------------------------------------------------------------------------------------------------------------------

2. Model AirBnB Price: Your second task is to analyze the Bangkok AirBnB listing price (variable price). The data originates from Inside Airbnb (https://insideairbnb.com/get-the-data/), but use the version on Canvas (airbnb-bangkok-listings.csv). You need to work with several sorts of categorical variables, including those that contain far too many categories that are too small. You are also asked to do log-transforms and interpret the results.

2.1 Load and clean:

1. Load data. I recommend selecting only the variables you need below: bedrooms, price, and accommodates. You may return here again and change the variable selection as you need. Even better, check out the usecols argument for read_csv. Do the basic checks.

2. Do the basic data cleaning: (a) convert price to numeric. Hint: you may want to fix its string representation first! (b) remove entries with missing or invalid price, bedrooms, and other variables you need below

3. Analyze the distribution of price. Does it look normal? Does it look like something else? Does it suggest you should make a log-transformation? Hint: consult lecture notes Section 2.1.8 Feature Transformations.

4. Convert the number of bedrooms into another variable with a limited number of categories only, such as 1, 2, 3, 4+, and use these categories in the models below.

2.2 Model:

1. Run a linear regression where you explain the listing price with the number of bedrooms, where bedrooms uses the categories you made above. Interpret the results, including R2. Hint: if 1-BR is the reference category, the effect for 2BR should be 1500 (but it depends on how exactly you cleaned the data).

2. Now repeat the process with the model where you analyze log price instead of price. Interpret the results. Which model behaves better in the sense of R2? Hint: if you cleaned the data the same way as me, you should see R2 = 0.203.

For the following tasks, use either log(price) or price, depending on your answer here:

3. Finally we just add two more variables to the model: room type & accommodates. While room type only contains three values, the other two contain many different categories. Recode these as accommodates: "1", "2", "3", "4 and more". Run this model. Interpret and comment on the more interesting/important results. Do not forget to mention what the relevant reference categories are, and what R2 shows.
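The workflow described in 2.1 and 2.2 could be sketched roughly as follows. This is only a minimal illustration under stated assumptions, not the course's reference solution: the toy rows are made up (they are not the Canvas file), the price strings are assumed to look like "$1,500.00", the column name room_type and the rule "price > 0 and bedrooms >= 1" are assumptions, and clean_listings, top_code and fit_ols are hypothetical helper names. The regression is done with plain least squares in NumPy rather than whichever regression library the course uses.

```python
import numpy as np
import pandas as pd

# Made-up toy rows standing in for airbnb-bangkok-listings.csv (NOT real data).
# With the real file one would start from something like:
#   pd.read_csv("airbnb-bangkok-listings.csv",
#               usecols=["price", "bedrooms", "accommodates", "room_type"])
raw = pd.DataFrame({
    "price": ["$1,000.00", "$1,400.00", "$600.00", "$2,000.00", "$2,400.00",
              "$3,000.00", "$8,000.00", None, "$900.00"],
    "bedrooms": [1, 1, 1, 2, 2, 3, 5, 2, np.nan],
    "accommodates": [1, 2, 1, 3, 4, 6, 10, 4, 2],
    "room_type": ["Private room", "Entire home/apt", "Shared room",
                  "Entire home/apt", "Entire home/apt", "Entire home/apt",
                  "Entire home/apt", "Entire home/apt", "Private room"],
})


def clean_listings(df):
    """Turn price strings into numbers and drop rows that cannot be used."""
    out = df.copy()
    # Strip "$" and "," (assumed format), then coerce anything unparsable to NaN.
    out["price"] = pd.to_numeric(
        out["price"].astype(str).str.replace(r"[$,]", "", regex=True),
        errors="coerce",
    )
    out = out.dropna(subset=["price", "bedrooms", "accommodates", "room_type"])
    # Assumption: non-positive prices and listings with < 1 bedroom are invalid.
    return out[(out["price"] > 0) & (out["bedrooms"] >= 1)].copy()


def top_code(series, cap):
    """Recode counts as "1", "2", ..., with everything >= cap labelled "<cap>+"."""
    capped = series.astype(int).clip(upper=cap)
    return capped.astype(str).where(capped < cap, f"{cap}+")


def fit_ols(y, factors):
    """Least squares on dummy-coded factors; returns (coefficients, R2).

    The first level of each factor in sorted order is the reference category.
    """
    X = pd.get_dummies(factors, drop_first=True, dtype=float)
    X.insert(0, "Intercept", 1.0)
    yv = y.to_numpy(dtype=float)
    beta, *_ = np.linalg.lstsq(X.to_numpy(), yv, rcond=None)
    resid = yv - X.to_numpy() @ beta
    r2 = 1.0 - (resid @ resid) / np.sum((yv - yv.mean()) ** 2)
    return pd.Series(beta, index=X.columns), r2


clean = clean_listings(raw)
clean["bedrooms_cat"] = top_code(clean["bedrooms"], 4)
clean["accommodates_cat"] = top_code(clean["accommodates"], 4)
clean["log_price"] = np.log(clean["price"])

# Compare skewness before and after the log transform (a histogram also helps).
print("skew price:", clean["price"].skew(), "skew log:", clean["log_price"].skew())

# Model 1: price on bedroom categories; model 2: log(price) on the same.
coef_price, r2_price = fit_ols(clean["price"], clean[["bedrooms_cat"]])
coef_log, r2_log = fit_ols(clean["log_price"], clean[["bedrooms_cat"]])
# Model 3: add room type and recoded accommodates. The toy set is far too
# small for these coefficients to mean anything; this only shows the call.
coef_full, r2_full = fit_ols(
    clean["log_price"], clean[["bedrooms_cat", "room_type", "accommodates_cat"]]
)
print(coef_price, r2_price, coef_log, r2_log, r2_full, sep="\n")
```

In the price model each coefficient is the difference in mean price between that bedroom category and the reference category (1 bedroom here). In the log model a coefficient b means prices are about exp(b) times those of the reference category, so 100 * (exp(b) - 1) gives an approximate percentage difference. On the real data the numbers will differ from this toy output, and they depend on the cleaning choices made.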
