
Question


Write Python code for any given CSV dataset with 20-25 columns and around 2,000 rows. Include an explanation and suggest alternatives; keep it simple to understand and write.
#### 1. Data Understanding (5 marks)
a. Read the dataset (tab, csv, xls, txt, inbuilt dataset). How many rows and columns does it have, and what are the variable types (continuous, categorical, etc.)? (1 mark)
b. Calculate the five-point summary for numerical variables. (1 mark)
c. Summarize observations for categorical variables: number of categories and % of observations in each category. (1 mark)
d. Check for defects in the data such as missing values, nulls, outliers, etc. (2 marks)
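
Parts 1a to 1d could be covered by a single helper along these lines. This is a minimal sketch using pandas, not a definitive answer: the function name `understand_data` and the file name in the comment are hypothetical, and the 1.5 × IQR rule is only one common convention for flagging outliers.

```python
import pandas as pd


def understand_data(df: pd.DataFrame) -> dict:
    """Collect the summaries asked for in parts 1a-1d (illustrative helper)."""
    numeric = df.select_dtypes(include="number")
    categorical = df.select_dtypes(exclude="number")

    # 1b. Five-point summary: min, Q1, median, Q3, max for each numeric column.
    five_point = numeric.quantile([0, 0.25, 0.5, 0.75, 1.0]).T
    five_point.columns = ["min", "q1", "median", "q3", "max"]

    # 1c. Percentage of (non-missing) observations in each category.
    category_share = {
        col: (categorical[col].value_counts(normalize=True) * 100).round(2)
        for col in categorical.columns
    }

    # 1d. Outliers by the 1.5 * IQR rule (an assumed convention, not the only one).
    iqr = five_point["q3"] - five_point["q1"]
    lower = five_point["q1"] - 1.5 * iqr
    upper = five_point["q3"] + 1.5 * iqr
    outliers = ((numeric < lower) | (numeric > upper)).sum()

    return {
        "shape": df.shape,              # 1a. (rows, columns)
        "dtypes": df.dtypes,            # 1a. variable types
        "five_point": five_point,
        "category_share": category_share,
        "missing": df.isna().sum(),     # 1d. missing values per column
        "outliers": outliers,
    }


# Usage (hypothetical file name):
# summary = understand_data(pd.read_csv("data.csv"))
```

As an alternative, `df.describe()` and `df.info()` give much of the same information with less control over the layout.
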
#### 2. Data Preparation (15 marks)
a. Fix the defects found above and do appropriate treatment if any. (5 marks)
b. Visualize the data using relevant plots. Which variables are highly correlated with the target variable? (5 marks)
c. Do you want to exclude some variables from the model based on this analysis? What other actions will you take? (2 marks)
d. Split dataset into train and test (70:30). Are both train and test representative of the overall data? How would you ascertain this statistically? (3 marks)
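
One way to approach parts 2a, 2b and 2d is sketched below, assuming pandas, SciPy and scikit-learn are installed. The helper names are hypothetical, and the treatment choices (median/mode imputation, capping at the IQR fences, a two-sample Kolmogorov-Smirnov test to compare train and test) are common defaults rather than the required answer.

```python
import pandas as pd
from scipy.stats import ks_2samp
from sklearn.model_selection import train_test_split


def clean(df: pd.DataFrame) -> pd.DataFrame:
    """2a. Impute missing values and cap numeric outliers at the IQR fences."""
    out = df.copy()
    num_cols = out.select_dtypes(include="number").columns
    cat_cols = out.columns.difference(num_cols)

    for col in num_cols:
        out[col] = out[col].fillna(out[col].median())
        q1, q3 = out[col].quantile([0.25, 0.75])
        iqr = q3 - q1
        out[col] = out[col].clip(q1 - 1.5 * iqr, q3 + 1.5 * iqr)
    for col in cat_cols:
        # Assumes the column has at least one non-missing value.
        out[col] = out[col].fillna(out[col].mode().iloc[0])
    return out


def target_correlations(df: pd.DataFrame, target: str) -> pd.Series:
    """2b. Absolute Pearson correlation of each numeric column with the target."""
    corr = df.select_dtypes(include="number").corr()[target].drop(target)
    return corr.abs().sort_values(ascending=False)


def split_and_compare(df: pd.DataFrame, seed: int = 42):
    """2d. 70:30 split, then a two-sample KS test per numeric column."""
    train, test = train_test_split(df, test_size=0.30, random_state=seed)
    p_values = {
        col: ks_2samp(train[col], test[col]).pvalue
        for col in df.select_dtypes(include="number").columns
    }
    return train, test, pd.Series(p_values, name="ks_p_value")
```

A large p-value for a column means the test found no evidence that its train and test distributions differ; for categorical columns a chi-square test on the category counts plays the same role. Plots are left out to keep the sketch short; `DataFrame.hist` and `DataFrame.boxplot` (both need matplotlib) are a reasonable starting point for part 2b.
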
#### 3. Model Building (20 marks)
a. Fit a base model and observe the overall R-squared, RMSE and MAPE values of the model. Please comment on whether it is good or not. (5 marks)
b. Check for multi-collinearity and treat the same. (3 marks)
c. How would you improve the model? Write clearly the changes that you will make before re-fitting the model. Fit the final model. (6 marks)
d. Write down a business interpretation/explanation of the model: which variables affect the target the most, and what is the relationship? Feel free to use charts or graphs to explain. (4 marks)
e. What changes from the base model had the most effect on model performance? (2 marks)
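
For parts 3a to 3c, a base model, its metrics and a multicollinearity check might look like the sketch below. It assumes a continuous target, numeric predictors (categoricals already encoded, for example with `pd.get_dummies(..., drop_first=True)`), and a linear regression as the base model; none of that is stated in the question. All helper names and the VIF threshold of 10 are illustrative choices.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import (
    mean_absolute_percentage_error,
    mean_squared_error,
    r2_score,
)


def evaluate(model, X, y) -> dict:
    """3a. R-squared, RMSE and MAPE (MAPE is a fraction, not a percentage)."""
    pred = model.predict(X)
    return {
        "r2": r2_score(y, pred),
        "rmse": float(np.sqrt(mean_squared_error(y, pred))),
        "mape": mean_absolute_percentage_error(y, pred),
    }


def fit_and_score(X_train, y_train, X_test, y_test):
    """3a. Fit an ordinary least squares model and score it on the test set."""
    model = LinearRegression().fit(X_train, y_train)
    return model, evaluate(model, X_test, y_test)


def vif(X: pd.DataFrame) -> pd.Series:
    """3b. Variance inflation factors.

    Uses the identity that VIF_j is the j-th diagonal entry of the inverse
    of the predictors' correlation matrix. Perfectly collinear columns make
    that matrix singular, so remove exact duplicates first.
    """
    inv_corr = np.linalg.inv(X.corr().to_numpy())
    return pd.Series(np.diag(inv_corr), index=X.columns, name="vif")


def drop_high_vif(X: pd.DataFrame, threshold: float = 10.0) -> pd.DataFrame:
    """3b/3c. Drop the highest-VIF column until every VIF is below threshold."""
    X = X.copy()
    while X.shape[1] > 1:
        scores = vif(X)
        if scores.max() < threshold:
            break
        X = X.drop(columns=scores.idxmax())
    return X
```

Refitting with `fit_and_score` on the reduced predictors and comparing the two metric dictionaries gives a direct answer to part 3e. Alternatives to dropping columns include ridge or lasso regression, which handle correlated predictors through regularization; `statsmodels` also offers `variance_inflation_factor` and a fuller regression summary for the interpretation in part 3d.
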
