K-means clustering with scikit-learn and matplotlib: plotting every pair of five features on a 2 x 3 subplot grid raises "IndexError: index 6 is out of bounds for axis 0 with size 6". The full code, traceback, and DataFrame information are given in the question below.


Question


Posted on Aug 18, 2024


import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans

# Sample data (replace this with your actual data)
data = {
    "Passenger Count": [27271, 29131, 5415, 35156, 34090],
    "Price Category Code_Low Fare": [True, True, True, False, False],
    "Price Category Code_Other": [False, False, False, True, True],
    "GEO Summary_Domestic": [True, True, True, False, False],
    "GEO Summary_International": [False, False, False, True, True],
    "Cluster": [0, 0, 0, 3, 3]
}

# Create a DataFrame
df = pd.DataFrame(data)

# Extract features for clustering (you can select specific columns based on your requirement)
features = df[["Passenger Count", "Price Category Code_Low Fare", "Price Category Code_Other", "GEO Summary_Domestic", "GEO Summary_International"]]

# Fit K-means clustering model
kmeans = KMeans(n_clusters=4)  # Change the number of clusters as per your analysis
kmeans.fit(features)

# Add cluster labels to the DataFrame
df['Cluster'] = kmeans.labels_

# Visualize the clusters for all pairs of features
fig, axs = plt.subplots(2, 3, figsize=(18, 12))

# Flatten the axs array for easy iteration
axs = axs.flatten()

# Initialize a counter for the subplot index
subplot_index = 0

# Plot each pair of features with color-coded clusters
for i, feature1 in enumerate(features.columns):
    for j, feature2 in enumerate(features.columns):
        if i < j:  # This ensures that each pair is plotted only once
            ax = axs[subplot_index]
            ax.scatter(df[feature1], df[feature2], c=df['Cluster'], cmap='viridis', s=50, alpha=0.7)
            ax.set_xlabel(feature1)
            ax.set_ylabel(feature2)
            subplot_index += 1  # Increment the subplot index

# Plot cluster centroids for each pair of features
for cluster in range(kmeans.n_clusters):
    subplot_index = 0  # Reset the subplot index for centroids
    for i, feature1 in enumerate(features.columns):
        for j, feature2 in enumerate(features.columns):
            if i < j:  # This ensures that each pair is plotted only once
                ax = axs[subplot_index]
                ax.scatter(kmeans.cluster_centers_[cluster][i], kmeans.cluster_centers_[cluster][j], c='red', marker='x', s=200, label=f'Cluster {cluster}')
                subplot_index += 1  # Increment the subplot index

plt.tight_layout()
plt.show()

For the above code I am getting the below error:

{
"name": "IndexError",
"message": "index 6 is out of bounds for axis 0 with size 6",
"stack": "---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
Cell In[178], line 42
     40     for j, feature2 in enumerate(features.columns):
     41         if i < j:  # This ensures that each pair is plotted only once
---> 42             ax = axs[subplot_index]
     43             ax.scatter(df[feature1], df[feature2], c=df['Cluster'], cmap='viridis', s=50, alpha=0.7)
     44             ax.set_xlabel(feature1)

IndexError: index 6 is out of bounds for axis 0 with size 6"
}

Kindly resolve this. I am also sharing the DataFrame information:

Dimensions of DataFrame (rows, columns): (5, 6)
Column labels: Index(['Passenger Count', 'Price Category Code_Low Fare', 'Price Category Code_Other', 'GEO Summary_Domestic', 'GEO Summary_International', 'Cluster'], dtype='object')
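The traceback follows from a count mismatch: five feature columns give 10 unordered pairs with i < j, while plt.subplots(2, 3) creates only 6 axes, so the seventh pair asks for axs[6]. The sketch below is one possible way to size the grid from the number of pairs instead of hard-coding it. The helper names feature_pairs, grid_shape, and plot_cluster_pairs are illustrative and not part of the original code, and the plotting function assumes that centers is an array such as kmeans.cluster_centers_, fitted on the same columns in the same order.

```python
from itertools import combinations
from math import ceil


def feature_pairs(columns):
    """Return every unordered pair of column names, each pair once (i < j)."""
    return list(combinations(columns, 2))


def grid_shape(n_plots, ncols=3):
    """Return (nrows, ncols) for a grid with at least n_plots cells."""
    return ceil(n_plots / ncols), ncols


def plot_cluster_pairs(df, feature_names, labels, centers, ncols=3):
    """Scatter every feature pair, colored by cluster, with centroids overlaid.

    Assumes `centers` has one row per cluster and one column per feature,
    in the same order as `feature_names`.
    """
    # Imported here so the two helpers above work without matplotlib installed.
    import matplotlib.pyplot as plt

    pairs = feature_pairs(feature_names)
    nrows, ncols = grid_shape(len(pairs), ncols)
    # squeeze=False always returns a 2-D array of axes, so flatten() is safe.
    fig, axs = plt.subplots(nrows, ncols, figsize=(6 * ncols, 4 * nrows), squeeze=False)
    axs = axs.flatten()

    position = {name: k for k, name in enumerate(feature_names)}
    for ax, (f1, f2) in zip(axs, pairs):
        ax.scatter(df[f1], df[f2], c=labels, cmap="viridis", s=50, alpha=0.7)
        # All centroids for this pair are drawn in a single call.
        ax.scatter(centers[:, position[f1]], centers[:, position[f2]],
                   c="red", marker="x", s=200, label="Centroids")
        ax.set_xlabel(f1)
        ax.set_ylabel(f2)
        ax.legend()

    # Hide grid cells left over after the last pair.
    for ax in axs[len(pairs):]:
        ax.set_visible(False)

    fig.tight_layout()
    return fig


columns = ["Passenger Count", "Price Category Code_Low Fare", "Price Category Code_Other",
           "GEO Summary_Domestic", "GEO Summary_International"]
pairs = feature_pairs(columns)
print(len(pairs))              # 10 pairs, but plt.subplots(2, 3) only creates 6 axes
print(grid_shape(len(pairs)))  # (4, 3)
```

With the DataFrame and model from the question, a call along the lines of plot_cluster_pairs(df, list(features.columns), kmeans.labels_, kmeans.cluster_centers_) followed by plt.show() should draw all 10 panels. A smaller change that keeps the original loops would be to replace plt.subplots(2, 3, ...) with a grid of at least 10 cells, for example plt.subplots(2, 5, figsize=(30, 12)).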
