Question
Merging and Cleaning (20 Points)
The first objective is to combine those files and stack them as three large files, one for each time period. Run basic EDA and descriptive statistics on some columns and clean any obvious outliers from each time period. Make sure that no more than 1% of the data are removed from within each time period in this process. Clearly write the details of outlier detection and descriptive analysis.
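One way to approach the stacking and the 1% removal limit is sketched below in Python with pandas. This is only an illustration under assumptions: the column names (`speed`, `mileage`), the synthetic data, the robust z-score rule, and the cutoff of 5 are all hypothetical choices that the assignment does not prescribe. The sketch scores each row by its largest median/MAD-based deviation and drops only the most extreme flagged rows, never more than the allowed fraction.

```python
import numpy as np
import pandas as pd

def stack_period(frames):
    """Stack the files of one time period into a single DataFrame."""
    return pd.concat(frames, ignore_index=True)

def trim_outliers(df, cols, max_frac=0.01, z_cut=5.0):
    """Drop rows with extreme values, removing at most max_frac of the rows.

    Uses a robust z-score (median and MAD) per column; z_cut is an assumed
    threshold, not something the assignment specifies.
    """
    med = df[cols].median()
    mad = (df[cols] - med).abs().median()
    scale = (1.4826 * mad).replace(0, np.nan)  # avoid dividing by zero
    # Largest robust z-score across the chosen columns, per row.
    score = ((df[cols] - med).abs() / scale).max(axis=1).fillna(0.0)
    flagged = score[score > z_cut]
    budget = int(np.floor(max_frac * len(df)))  # hard cap on removals
    drop_idx = flagged.nlargest(budget).index
    return df.drop(index=drop_idx), len(drop_idx) / len(df)

# Synthetic stand-ins for two files from one time period (hypothetical columns).
rng = np.random.default_rng(0)
file_a = pd.DataFrame({"speed": rng.normal(60, 10, 500),
                       "mileage": rng.normal(30000, 5000, 500)})
file_b = pd.DataFrame({"speed": rng.normal(60, 10, 500),
                       "mileage": rng.normal(30000, 5000, 500)})
file_a.loc[[0, 1, 2], "speed"] = 1e6  # inject three obvious outliers

period = stack_period([file_a, file_b])
print(period[["speed", "mileage"]].describe())  # basic descriptive statistics

clean, removed_frac = trim_outliers(period, ["speed", "mileage"])
print(f"removed {removed_frac:.2%} of rows")
```

In a real run, the list passed to `stack_period` would come from reading the actual files of each period, and the same two functions would be applied to each of the three periods separately so that the 1% limit is enforced per period.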
Analysis (60 Points)
This section is further broken down into two parts:
Part A: (30 points)
There are several numeric columns listed in the datasets. Use the tools of dimension reduction learnt during the course and condense the number of columns to a smaller dimension for each time period separately.
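The assignment does not say which dimension reduction technique the course covered. The sketch below assumes principal component analysis (PCA) via scikit-learn, with columns standardized first and enough components kept to explain about 90% of the variance. The 90% target and the synthetic six-column matrix are assumptions made for illustration only.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

def reduce_dimensions(X, var_target=0.90):
    """Standardize the numeric columns, then keep the fewest principal
    components whose cumulative explained variance exceeds var_target."""
    X_std = StandardScaler().fit_transform(X)
    pca = PCA(n_components=var_target, svd_solver="full")
    scores = pca.fit_transform(X_std)
    return scores, pca

# Synthetic stand-in for one period: six correlated numeric columns
# generated from two hidden factors plus a little noise.
rng = np.random.default_rng(1)
latent = rng.normal(size=(300, 2))
mixing = rng.normal(size=(2, 6))
X = latent @ mixing + 0.05 * rng.normal(size=(300, 6))

scores, pca = reduce_dimensions(X)
print("components kept:", scores.shape[1])
print("cumulative variance:", pca.explained_variance_ratio_.cumsum())
```

Standardizing first matters when the columns are on very different scales, since otherwise the largest-valued column would dominate the components. The function would be fitted once per time period, as the prompt asks for each period to be reduced separately.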
Use the reduced dimensions to perform "grouping" of similar vehicles. Keep the number of groups between 5 and 8 for each time period. Clearly define the groups based on their characteristics by running descriptive analytics on each group. Now compare the groups across the three time periods and point out any vehicles that jumped from one group to another over time. Also explain, in your own words, what that jump means.
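A possible sketch of the grouping and the group-to-group comparison follows. It assumes k-means clustering with the number of groups chosen by silhouette score within the required range of 5 to 8; the prompt does not name a clustering method, so this is one option among several. The vehicle IDs and group names in the second half are hypothetical. Note that raw cluster numbers are arbitrary in each period, so groups need to be given descriptive names (from per-group statistics) before a "jump" between periods is meaningful.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

def cluster_period(scores, k_range=range(5, 9), seed=0):
    """Fit k-means for each k in 5..8 and keep the k with the best silhouette."""
    best = None
    for k in k_range:
        model = KMeans(n_clusters=k, n_init=10, random_state=seed)
        labels = model.fit_predict(scores)
        sil = silhouette_score(scores, labels)
        if best is None or sil > best[0]:
            best = (sil, k, labels)
    return best[1], best[2]

def find_jumps(groups_before, groups_after):
    """Vehicles present in both periods whose named group changed."""
    both = pd.concat(
        [groups_before.rename("before"), groups_after.rename("after")],
        axis=1, join="inner",
    )
    return both[both["before"] != both["after"]]

# Synthetic reduced-dimension scores for one period: six blobs of 50 points.
rng = np.random.default_rng(2)
centers = rng.normal(scale=10, size=(6, 2))
points = np.repeat(centers, 50, axis=0) + rng.normal(size=(300, 2))

k, labels = cluster_period(points)
print("chosen k:", k)
# Per-group descriptive statistics, used to characterize and name each group.
print(pd.DataFrame(points, columns=["pc1", "pc2"]).groupby(labels).mean())

# Hypothetical named groups per vehicle in two periods.
period_1 = pd.Series({"V1": "compact", "V2": "compact", "V3": "truck"})
period_2 = pd.Series({"V1": "compact", "V2": "midsize", "V4": "truck"})
jumps = find_jumps(period_1, period_2)
print(jumps)
```

Only vehicles that appear in both periods can be compared, which is why the sketch uses an inner join. A jump then reads as a change in the vehicle's characteristics relative to the rest of the fleet between the two periods.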