Question
Task: Clustering
Using Titanic train.csv dataset - same as previous HWs:
[15pt] Prepare the dataset for analysis. In this analysis, we use "pclass", "fare", "age", "sex", "embarked" as input variables. Please perform necessary cleaning and transformation.
[45pt] Implement two clustering algorithms, K-means and hierarchical clustering, and add the cluster labels from each to the dataset as a new column. Please set the number of clusters to 3 for both algorithms.
[40pt] For the K-means algorithm, make an elbow plot using K from 1 to 20 and calculate the WCSS (within-cluster sum of squares) for each K. Please show the plot and answer the question "which K should we choose". You can modify the code I provided in class; there is no need to write your own from scratch.
Step by Step Solution
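The three tasks above could be approached roughly as follows. This is a minimal sketch, not the instructor's class code (which is not shown here). It assumes pandas and scikit-learn are installed, and that the column names match the task once lowercased (Kaggle's train.csv spells them "Pclass", "Fare", and so on). The cleaning policy (median for missing "age" and "fare", mode for missing "embarked"), the one-hot encoding, the standardization, Ward linkage, and the helper names prepare, add_cluster_labels, wcss_curve, and plot_elbow are all illustrative choices rather than requirements of the assignment.

```python
import pandas as pd
from sklearn.cluster import AgglomerativeClustering, KMeans
from sklearn.preprocessing import StandardScaler

FEATURES = ["pclass", "fare", "age", "sex", "embarked"]


def prepare(df):
    """Return (cleaned DataFrame, scaled feature matrix) for the five inputs."""
    df = df.copy()
    df.columns = df.columns.str.lower()  # assumption: names differ only by case
    # Assumed cleaning policy: median for numeric gaps, mode for "embarked".
    for col in ["age", "fare"]:
        df[col] = df[col].fillna(df[col].median())
    df["embarked"] = df["embarked"].fillna(df["embarked"].mode()[0])
    # One-hot encode the categorical columns so Euclidean distance is defined.
    X = pd.get_dummies(df[FEATURES], columns=["sex", "embarked"], dtype=float)
    # Standardize so that "fare" does not dominate the distance computation.
    X_scaled = StandardScaler().fit_transform(X)
    return df, X_scaled


def add_cluster_labels(df, X, n_clusters=3, seed=0):
    """Fit both algorithms and attach their labels as two new columns."""
    df = df.copy()
    kmeans = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    df["kmeans_cluster"] = kmeans.fit_predict(X)
    # Ward linkage is one common choice; the task does not name a linkage.
    hier = AgglomerativeClustering(n_clusters=n_clusters, linkage="ward")
    df["hier_cluster"] = hier.fit_predict(X)
    return df


def wcss_curve(X, k_values=range(1, 21), seed=0):
    """WCSS for each K; scikit-learn exposes it as KMeans.inertia_."""
    return [
        KMeans(n_clusters=k, n_init=10, random_state=seed).fit(X).inertia_
        for k in k_values
    ]


def plot_elbow(wcss, k_values=range(1, 21)):
    """Draw the elbow plot (matplotlib is imported only when plotting)."""
    import matplotlib.pyplot as plt

    plt.plot(list(k_values), wcss, marker="o")
    plt.xlabel("Number of clusters K")
    plt.ylabel("WCSS")
    plt.title("Elbow plot for K-means")
    plt.show()
```

With the homework file, this might be used as: clean, X = prepare(pd.read_csv("train.csv")), then labelled = add_cluster_labels(clean, X), then plot_elbow(wcss_curve(X)). The K to choose is the one at the "elbow" of the resulting plot, the point after which adding clusters reduces WCSS only slightly. That value has to be read from the actual plot, so none is asserted here.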