Question
Due 20 May. Please give correct code with a detailed explanation; I want to understand it, not just see the code.
The goal of this homework assignment is to explore the KMeans algorithm using the given dataset, airlines.csv. Throughout this assignment, you will perform various tasks including data description, data preprocessing, exploratory data analysis, and determining the optimal number of clusters using KMeans.
Tasks
Data Description
The provided raw data is in the airlines.csv file.
The description of the raw data is as follows:
id: Unique ID
balance: Number of miles eligible for award travel
qualmiles: Number of miles counted as qualifying for Topflight status.
ccmiles: Number of miles earned with freq. flyer credit card in the past months:
ccmiles: Number of miles earned with Rewards credit card in the past months:
ccmiles: Number of miles earned with Small Business credit card in the past months:
: under
:
:
:
: over
bonusmiles: Number of miles earned from nonflight bonus transactions in the past months.
bonustrans: Number of nonflight bonus transactions in the past months.
flightmilesmo: Number of flight miles in the past months.
flighttrans: Number of flight transactions in the past months.
dayssinceenrolled: Number of days since enrolled in flier program.
award: Whether that person had an award flight (free flight) or not.
Check for Missing Values
Perform data preprocessing to check for any missing values in the dataset.
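A minimal sketch of the missing-value check, assuming the file is loaded with pandas. The helper name `missing_value_report` and the tiny demo frame are made up for illustration; they are not part of the assignment.

```python
import pandas as pd


def missing_value_report(df: pd.DataFrame) -> pd.DataFrame:
    """Return the count and percentage of missing values for each column."""
    counts = df.isna().sum()          # number of NaN/None cells per column
    percent = df.isna().mean() * 100  # share of rows that are missing
    return pd.DataFrame({"missing": counts, "percent": percent})


# Made-up data so the sketch runs without airlines.csv.
demo = pd.DataFrame({"balance": [100.0, None, 300.0, 400.0],
                     "award": [0, 1, 1, 0]})
report = missing_value_report(demo)
print(report)
```

On the real data the call would be `missing_value_report(pd.read_csv("airlines.csv"))`. If every count is zero, no imputation or row dropping is needed before clustering.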
Analyze Features
Create histograms to understand the distribution of different features in the dataset.
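One way to draw the histograms is pandas' built-in `DataFrame.hist`, which creates one panel per numeric column. The function name `plot_histograms` and the choice of 30 bins are assumptions for this sketch.

```python
import matplotlib.pyplot as plt
import pandas as pd


def plot_histograms(df: pd.DataFrame, bins: int = 30):
    """Draw one histogram per numeric column and return the grid of axes."""
    numeric = df.select_dtypes(include="number")  # skip text columns
    axes = numeric.hist(bins=bins, figsize=(12, 8))
    plt.tight_layout()
    return axes
```

Calling `plot_histograms(df)` followed by `plt.show()` displays the grid. Long right tails in the mileage columns would be a reason to scale the data before running KMeans.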
Calculate Percentage of Customers with/without Award
Find the percentage of customers who do not have an award flight and those who do have an award flight.
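The percentage split can be sketched with `value_counts(normalize=True)`. This assumes the award column is named `award` and is coded 0 (no award flight) and 1 (award flight); check the actual header before running it.

```python
import pandas as pd


def award_percentages(df: pd.DataFrame, column: str = "award") -> pd.Series:
    """Percentage of rows for each distinct value in the award column."""
    # normalize=True returns fractions that sum to 1; multiply to get percent.
    return df[column].value_counts(normalize=True).mul(100).round(2)


# Made-up data: three customers without an award, one with.
demo = pd.DataFrame({"award": [0, 0, 0, 1]})
shares = award_percentages(demo)
print(shares)
```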
Correlation Analysis
Find which feature is correlated with the balance feature.
Draw a correlation heatmap to visualize the correlations among different features.
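A sketch of both correlation steps, assuming Pearson correlation and a target column named `balance`. The heatmap here uses plain matplotlib; `seaborn.heatmap` is a common alternative. Both function names are made up for illustration.

```python
import matplotlib.pyplot as plt
import pandas as pd


def correlations_with(df: pd.DataFrame, target: str = "balance") -> pd.Series:
    """Correlation of every numeric column with the target, strongest first."""
    corr = df.select_dtypes(include="number").corr()
    # Drop the target's correlation with itself (always 1), then rank by size.
    return corr[target].drop(target).sort_values(key=abs, ascending=False)


def plot_correlation_heatmap(df: pd.DataFrame):
    """Draw the full correlation matrix as a colour-coded grid."""
    corr = df.select_dtypes(include="number").corr()
    fig, ax = plt.subplots(figsize=(8, 6))
    image = ax.imshow(corr.to_numpy(), cmap="coolwarm", vmin=-1, vmax=1)
    ax.set_xticks(range(len(corr.columns)))
    ax.set_xticklabels(corr.columns, rotation=90)
    ax.set_yticks(range(len(corr.index)))
    ax.set_yticklabels(corr.index)
    fig.colorbar(image, ax=ax, label="Pearson correlation")
    fig.tight_layout()
    return ax
```

The first entry of `correlations_with(df)` is the feature most strongly correlated with balance; the heatmap shows the same numbers for every pair of features at once.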
Plotting
Plot the relationship between frequent flying bonuses and nonflight bonus transactions.
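The task does not say which columns to plot. The sketch below assumes it means `bonusmiles` against `bonustrans`; swap in other column names if your reading of the task differs.

```python
import matplotlib.pyplot as plt
import pandas as pd


def plot_bonus_relationship(df: pd.DataFrame,
                            x: str = "bonustrans",
                            y: str = "bonusmiles"):
    """Scatter plot of two columns; the default column names are assumptions."""
    fig, ax = plt.subplots(figsize=(8, 5))
    # Small, semi-transparent points keep dense regions readable.
    ax.scatter(df[x], df[y], alpha=0.4, s=12)
    ax.set_xlabel(x)
    ax.set_ylabel(y)
    ax.set_title(f"{y} vs. {x}")
    return ax
```

Call `plot_bonus_relationship(df)` and then `plt.show()` to display the figure.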
Determining Optimal Number of Clusters
Apply MinMaxScaler to normalize the data.
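MinMaxScaler rescales each column to the range 0 to 1 using (x - min) / (max - min), so that columns with large values such as balance do not dominate the distance calculations in KMeans. The sketch below assumes all remaining columns are numeric and drops `id`, since an identifier carries no information for clustering; that choice is an assumption, not something the assignment states.

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler


def scale_features(df: pd.DataFrame, drop=("id",)) -> pd.DataFrame:
    """Rescale every remaining column to [0, 1] and keep the column names."""
    features = df.drop(columns=[c for c in drop if c in df.columns])
    scaled = MinMaxScaler().fit_transform(features)  # returns a NumPy array
    return pd.DataFrame(scaled, columns=features.columns, index=features.index)


# Made-up data so the sketch runs without airlines.csv.
demo = pd.DataFrame({"id": [1, 2, 3],
                     "balance": [0, 50, 100],
                     "bonustrans": [2, 4, 6]})
scaled_demo = scale_features(demo)
print(scaled_demo)
```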
Use the Elbow Method and Silhouette Score to find the ideal number of clusters for the KMeans algorithm.
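Both criteria can be computed in one loop over candidate values of k. Inertia is the within-cluster sum of squared distances; the "elbow" is the k after which it stops dropping sharply. The silhouette score ranges from -1 to 1, and higher means better-separated clusters. The range 2 to 10, `n_init=10`, and `random_state=42` are assumptions for this sketch, as are the function names.

```python
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score


def evaluate_k(X, k_values=range(2, 11), random_state=42):
    """Fit KMeans for each k and record inertia and silhouette score."""
    results = []
    for k in k_values:
        model = KMeans(n_clusters=k, n_init=10, random_state=random_state)
        labels = model.fit_predict(X)
        results.append({
            "k": k,
            "inertia": model.inertia_,                # elbow method input
            "silhouette": silhouette_score(X, labels),
        })
    return results


def best_k_by_silhouette(results):
    """Return the k with the highest silhouette score."""
    return max(results, key=lambda row: row["silhouette"])["k"]
```

Pass the MinMax-scaled data as `X`. Plotting `inertia` against `k` gives the elbow curve; the elbow is read off by eye, so it is worth comparing it with `best_k_by_silhouette(results)` rather than relying on either criterion alone.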
This is a short part of the information in the airlines.csv file:
id,balance,qualmiles,ccmiles,ccmiles,ccmiles,bonusmiles,bonustrans,flightmilesmo,flighttrans,dayssinceenroll,award
the file has lines in total with similar information, I attached a photo also