Question
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans

# Sample data (replace this with your actual data)
data = {
    "Passenger Count": [...],
    "Price Category CodeLow Fare": [True, True, True, False, False],
    "Price Category CodeOther": [False, False, False, True, True],
    "GEO SummaryDomestic": [True, True, True, False, False],
    "GEO SummaryInternational": [False, False, False, True, True],
    "Cluster": [...],
}

# Create a DataFrame
df = pd.DataFrame(data)

# Extract features for clustering (you can select specific columns based on your requirement)
features = df[["Passenger Count", "Price Category CodeLow Fare", "Price Category CodeOther", "GEO SummaryDomestic", "GEO SummaryInternational"]]

# Fit K-means clustering model
kmeans = KMeans(n_clusters=...)  # Change the number of clusters as per your analysis
kmeans.fit(features)

# Add cluster labels to the DataFrame
df["Cluster"] = kmeans.labels_

# Visualize the clusters for all pairs of features
fig, axs = plt.subplots(..., ..., figsize=(..., ...))

# Flatten the axs array for easy iteration
axs = axs.flatten()

# Initialize a counter for the subplot index
subplot_index = 0

# Plot each pair of features with color-coded clusters
for i, feature1 in enumerate(features.columns):
    for j, feature2 in enumerate(features.columns):
        if i < j:  # This ensures that each pair is plotted only once
            ax = axs[subplot_index]
            ax.scatter(df[feature1], df[feature2], c=df["Cluster"], cmap='viridis', s=..., alpha=...)
            ax.set_xlabel(feature1)
            ax.set_ylabel(feature2)
            subplot_index += 1  # Increment the subplot index

# Plot cluster centroids for each pair of features
for cluster in range(kmeans.n_clusters):
    subplot_index = 0  # Reset the subplot index for centroids
    for i, feature1 in enumerate(features.columns):
        for j, feature2 in enumerate(features.columns):
            if i < j:  # This ensures that each pair is plotted only once
                ax = axs[subplot_index]
                ax.scatter(kmeans.cluster_centers_[cluster, i], kmeans.cluster_centers_[cluster, j], c='red', marker='x', s=..., label=f'Cluster {cluster}')
                subplot_index += 1  # Increment the subplot index

plt.tight_layout()
plt.show()

For the above code, I am getting the error below:

"name": "IndexError",
"message": "index ... is out of bounds for axis ... with size ...",
"stack":
IndexError                                Traceback (most recent call last)
Cell In[...], line ...
      for j, feature2 in enumerate(features.columns):
          if i < j:  # This ensures that each pair is plotted only once
              ax = axs[subplot_index]
              ax.scatter(df[feature1], df[feature2], c=df["Cluster"], cmap='viridis', s=..., alpha=...)
              ax.set_xlabel(feature1)
IndexError: index ... is out of bounds for axis ... with size ...

Kindly resolve this. I am also sharing the DataFrame information:

Dimensions of DataFrame (rows, columns): (..., ...)
Column labels: Index(['Passenger Count', 'Price Category CodeLow Fare',
       'Price Category CodeOther', 'GEO SummaryDomestic',
       'GEO SummaryInternational', 'Cluster'],
      dtype='object')

Step by Step Solution
Step: 1
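A likely cause of the IndexError, sketched under assumptions: the nested loops plot every pair with i < j, and five feature columns give 10 such pairs, so the flattened axs array needs at least 10 entries. The arguments passed to plt.subplots are not visible in the question, so the 2 x 2 grid below (assumed_rows, assumed_cols) is purely hypothetical and only illustrates the mismatch. The check uses the standard library only.

```python
from itertools import combinations
from math import ceil

feature_cols = [
    "Passenger Count",
    "Price Category CodeLow Fare",
    "Price Category CodeOther",
    "GEO SummaryDomestic",
    "GEO SummaryInternational",
]

# Unordered pairs, matching the `if i < j` test in the nested loops.
pairs = list(combinations(feature_cols, 2))
n_pairs = len(pairs)  # 5 * 4 / 2 = 10

# Hypothetical grid shape: the real plt.subplots arguments are not shown in the question.
assumed_rows, assumed_cols = 2, 2
n_axes = assumed_rows * assumed_cols

print(f"pairs to plot: {n_pairs}, axes available: {n_axes}")
if n_pairs > n_axes:
    # The first index past the end of the flattened axes array is n_axes.
    print(f"axs[{n_axes}] would raise IndexError")

# Smallest number of rows that fits every pair for a chosen column count.
n_cols = 3
n_rows = ceil(n_pairs / n_cols)
print(f"a {n_rows} x {n_cols} grid (or larger) has room for all pairs")
```

If the number of axes in your own call to plt.subplots is smaller than the number of pairs, the index in the error message should equal the size it reports.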
Step: 2
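One way to avoid the IndexError is to derive the subplot grid from the number of feature pairs instead of hard-coding it. The sketch below is not the original poster's code: plot_cluster_pairs is a hypothetical helper, the "Passenger Count" values are made up for illustration, and n_clusters=2, n_cols=3, the marker sizes and the figure size are all assumptions you should adjust. It also draws the centroids in the same loop as the points, so no second index counter is needed.

```python
from itertools import combinations
from math import ceil

import matplotlib.pyplot as plt
import pandas as pd
from sklearn.cluster import KMeans


def plot_cluster_pairs(df, feature_cols, n_clusters=2, n_cols=3):
    """Scatter every pair of features, colored by K-means cluster, with centroids."""
    features = df[feature_cols].astype(float)  # booleans become 0.0 / 1.0
    kmeans = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit(features)
    labels = kmeans.labels_

    # Each unordered pair of column positions appears exactly once.
    pairs = list(combinations(range(len(feature_cols)), 2))

    # Size the grid from the number of pairs so every pair has an axis.
    n_rows = ceil(len(pairs) / n_cols)
    fig, axs = plt.subplots(n_rows, n_cols, figsize=(5 * n_cols, 4 * n_rows), squeeze=False)
    axs = axs.flatten()

    for ax, (i, j) in zip(axs, pairs):
        ax.scatter(features.iloc[:, i], features.iloc[:, j],
                   c=labels, cmap="viridis", s=50, alpha=0.7)
        # Centroid coordinates restricted to this pair of features.
        ax.scatter(kmeans.cluster_centers_[:, i], kmeans.cluster_centers_[:, j],
                   c="red", marker="x", s=100, label="Centroids")
        ax.set_xlabel(feature_cols[i])
        ax.set_ylabel(feature_cols[j])
        ax.legend()

    # Hide axes left over when the grid has more cells than pairs.
    for ax in axs[len(pairs):]:
        ax.set_visible(False)

    fig.tight_layout()
    return fig, axs, labels


# Hypothetical sample data: the "Passenger Count" values are invented for this example.
sample = pd.DataFrame({
    "Passenger Count": [120, 135, 150, 900, 1100],
    "Price Category CodeLow Fare": [True, True, True, False, False],
    "Price Category CodeOther": [False, False, False, True, True],
    "GEO SummaryDomestic": [True, True, True, False, False],
    "GEO SummaryInternational": [False, False, False, True, True],
})
feature_cols = list(sample.columns)

fig, axs, labels = plot_cluster_pairs(sample, feature_cols, n_clusters=2)
sample["Cluster"] = labels  # add the labels only after the feature list is fixed
```

Call plt.show() afterwards when running as a script; in a notebook the figure is usually rendered automatically. With five features the helper creates a 4 x 3 grid, fills 10 axes and hides the remaining 2.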