
Question

I have a dataset named Air Traffic Passenger Statistics with the following columns:
Index (integer)
Activity Period (integer)
Operating Airline (string)
Operating Airline IATA Code (string)
Published Airline (string)
Published Airline IATA Code (string)
GEO Summary (string)
GEO Region (string)
Activity Type Code (string)
Price Category Code (string)
Terminal (string + integer, e.g. Terminal 1)
Boarding Area (char, e.g. B)
Passenger Count (integer)
Adjusted Activity Type Code (string + integer)
Adjusted Passenger Count (integer)
Year (integer, e.g. 2005)
Month (string, i.e. the month name)
Now I want to perform clustering on this dataset using any four different clustering methods/algorithms, preferably recent ones that give the best results. First, select the 4-5 features that are most appropriate and justify the choice with a mathematical evaluation expression showing why they were chosen over the other columns. Second, find the number of clusters using the elbow method, or whichever method you consider best for this purpose (do not use a cluster count of 2; take more than that), and plot the result in a well-defined, labelled graph. Then perform the clustering with the four algorithms and, for each one, draw a scatter plot showing the cluster formation, the data points, and the centroids. Create a high-level graph with proper labelling. Lastly, compute the Silhouette Score, the Calinski-Harabasz index, and the Davies-Bouldin index. Create a table comparing the four algorithms, with one row per algorithm and one column per score, and explain and justify why a particular algorithm is the best.
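For reference, the three scores named above are commonly defined as follows. The notation is my own and is not taken from the dataset or from any particular answer: n points, k clusters C_1..C_k with centroids c_q and sizes n_q, and overall mean c.

```latex
% Silhouette: a(i) = mean distance from x_i to the other points in its own cluster,
% b(i) = smallest mean distance from x_i to the points of any other cluster.
s(i) = \frac{b(i) - a(i)}{\max\{a(i),\, b(i)\}}, \qquad
S = \frac{1}{n} \sum_{i=1}^{n} s(i)

% Calinski-Harabasz: ratio of between-cluster to within-cluster dispersion.
\mathrm{CH} = \frac{\operatorname{tr}(B_k) / (k - 1)}{\operatorname{tr}(W_k) / (n - k)}, \quad
B_k = \sum_{q=1}^{k} n_q (c_q - c)(c_q - c)^{\top}, \quad
W_k = \sum_{q=1}^{k} \sum_{x \in C_q} (x - c_q)(x - c_q)^{\top}

% Davies-Bouldin: s_i = mean distance of the points in C_i to c_i,
% d_{ij} = distance between centroids c_i and c_j.
\mathrm{DB} = \frac{1}{k} \sum_{i=1}^{k} \max_{j \neq i} \frac{s_i + s_j}{d_{ij}}
```

A higher Silhouette Score (range -1 to 1) and a higher Calinski-Harabasz index indicate better-separated clusters, while a lower Davies-Bouldin index is better, so the comparison table has to be read with that direction in mind for each column.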
Please perform high-level clustering for all four algorithms. The clustering code available on the internet is simple and I have already implemented it; what I need is a more enhanced version of all four algorithms that produces better clustering and output.
Note 1: Please do not copy and paste online code or AI/GPT code. Write your own logic and enhance all four algorithms.
Note 2: Provide all the explanations, all the code, the table, and its output.
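As a starting point, the workflow described above (choose k by the elbow method, fit four algorithms, tabulate the three scores) might be sketched as follows. This is only a minimal sketch under several assumptions: the real dataset is not available here, so `make_blobs` generates a synthetic stand-in with four numeric features; the four algorithms (KMeans, Ward agglomerative, Gaussian mixture, BIRCH) are illustrative picks, not a claim about which are best; and the helper `elbow_k`, which takes the k with the largest second difference of the inertia curve, is a hypothetical name and just one simple heuristic for locating the elbow.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering, Birch, KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import (calinski_harabasz_score, davies_bouldin_score,
                             silhouette_score)
from sklearn.mixture import GaussianMixture
from sklearn.preprocessing import StandardScaler

# ASSUMPTION: synthetic stand-in for 4 selected numeric features of the real data.
X_raw, _ = make_blobs(n_samples=600, centers=4, n_features=4, random_state=0)
X = StandardScaler().fit_transform(X_raw)  # put all features on the same scale

# Elbow method: within-cluster sum of squares (inertia) for k = 3..8.
ks = list(range(3, 9))
inertias = [KMeans(n_clusters=k, n_init=10, random_state=0).fit(X).inertia_
            for k in ks]


def elbow_k(ks, inertias):
    """Hypothetical helper: k with the largest second difference of inertia."""
    curvature = [inertias[i - 1] - 2 * inertias[i] + inertias[i + 1]
                 for i in range(1, len(inertias) - 1)]
    return ks[1 + int(np.argmax(curvature))]  # interior k values only


k = elbow_k(ks, inertias)

# Four illustrative algorithms, all fitted with the same number of clusters.
models = {
    "KMeans": KMeans(n_clusters=k, n_init=10, random_state=0),
    "Agglomerative (Ward)": AgglomerativeClustering(n_clusters=k, linkage="ward"),
    "Gaussian Mixture": GaussianMixture(n_components=k, random_state=0),
    "BIRCH": Birch(n_clusters=k),
}

rows = []
for name, model in models.items():
    labels = model.fit_predict(X)
    rows.append((name,
                 silhouette_score(X, labels),
                 calinski_harabasz_score(X, labels),
                 davies_bouldin_score(X, labels)))

# Comparison table: one row per algorithm, one column per score.
print(f"chosen k = {k}")
print(f"{'Algorithm':<22}{'Silhouette':>12}{'Calinski-Harabasz':>20}{'Davies-Bouldin':>16}")
for name, sil, ch, db in rows:
    print(f"{name:<22}{sil:>12.3f}{ch:>20.1f}{db:>16.3f}")
```

On the real data, the categorical columns would first need encoding and the chosen features would replace `X_raw`. The elbow curve (`ks` against `inertias`) and the per-algorithm scatter plots could then be drawn with a plotting library such as matplotlib, typically after projecting the features to two dimensions.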
