Question
Using the provided dataset that represents the Titanic disaster, code both an unsupervised clustering algorithm to describe the data and a simple supervised classification prediction to determine who might survive.
This is the code I have thus far:
#import library
import matplotlib.pyplot as plt
from matplotlib import style
style.use('ggplot')
import numpy as np
import pandas as pd
from sklearn import datasets
from sklearn import preprocessing
from sklearn.cluster import KMeans
from sklearn.preprocessing import LabelEncoder
from mpl_toolkits.mplot3d import Axes3D
%matplotlib inline
In[21]:
#import file
df = pd.read_excel('titanic.xls')
df.head()
Out[21]:
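|   | pclass | survived | name | sex | age | sibsp | parch | ticket | fare | cabin | embarked | boat | body | home.dest |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 1 | Allen, Miss. Elisabeth Walton | female | 29.0000 | 0 | 0 | 24160 | 211.3375 | B5 | S | 2 | NaN | St Louis, MO |
| 1 | 1 | 1 | Allison, Master. Hudson Trevor | male | 0.9167 | 1 | 2 | 113781 | 151.5500 | C22 C26 | S | 11 | NaN | Montreal, PQ / Chesterville, ON |
| 2 | 1 | 0 | Allison, Miss. Helen Loraine | female | 2.0000 | 1 | 2 | 113781 | 151.5500 | C22 C26 | S | NaN | NaN | Montreal, PQ / Chesterville, ON |
| 3 | 1 | 0 | Allison, Mr. Hudson Joshua Creighton | male | 30.0000 | 1 | 2 | 113781 | 151.5500 | C22 C26 | S | NaN | 135.0 | Montreal, PQ / Chesterville, ON |
| 4 | 1 | 0 | Allison, Mrs. Hudson J C (Bessie Waldo Daniels) | female | 25.0000 | 1 | 2 | 113781 | 151.5500 | C22 C26 | S | NaN | NaN | Montreal, PQ / Chesterville, ON |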
In[22]:
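#cleaning up data
# Note: the result of this expression is not assigned, so df itself is unchanged by it
df.apply(pd.to_numeric, errors='coerce').fillna(0).astype(int)
# Replace missing values with 0 in place
df.fillna(0, inplace=True)
df.head()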
Out[22]:
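|   | pclass | survived | name | sex | age | sibsp | parch | ticket | fare | cabin | embarked | boat | body | home.dest |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 1 | Allen, Miss. Elisabeth Walton | female | 29.0000 | 0 | 0 | 24160 | 211.3375 | B5 | S | 2 | 0.0 | St Louis, MO |
| 1 | 1 | 1 | Allison, Master. Hudson Trevor | male | 0.9167 | 1 | 2 | 113781 | 151.5500 | C22 C26 | S | 11 | 0.0 | Montreal, PQ / Chesterville, ON |
| 2 | 1 | 0 | Allison, Miss. Helen Loraine | female | 2.0000 | 1 | 2 | 113781 | 151.5500 | C22 C26 | S | 0 | 0.0 | Montreal, PQ / Chesterville, ON |
| 3 | 1 | 0 | Allison, Mr. Hudson Joshua Creighton | male | 30.0000 | 1 | 2 | 113781 | 151.5500 | C22 C26 | S | 0 | 135.0 | Montreal, PQ / Chesterville, ON |
| 4 | 1 | 0 | Allison, Mrs. Hudson J C (Bessie Waldo Daniels) | female | 25.0000 | 1 | 2 | 113781 | 151.5500 | C22 C26 | S | 0 | 0.0 | Montreal, PQ / Chesterville, ON |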
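The cleaning cell above leaves text columns such as sex in place, and its `df.apply(pd.to_numeric, ...)` result is never assigned. One possible way to get a purely numeric feature table is sketched below. It is not part of the original notebook: the column selection in `FEATURES`, the helper name `prepare_features`, the 0/1 encoding of sex, and the use of median imputation instead of `fillna(0)` are all assumptions made for illustration. The sample rows are the five rows shown in Out[21].

```python
import pandas as pd

# Assumed feature selection; other columns (name, ticket, cabin, ...) are left out.
FEATURES = ["pclass", "sex", "age", "sibsp", "parch", "fare"]


def prepare_features(df: pd.DataFrame) -> pd.DataFrame:
    """Return a numeric copy of the assumed feature columns (hypothetical helper)."""
    X = df[FEATURES].copy()
    # Encode sex as 0/1; any unexpected value becomes NaN and is imputed below.
    X["sex"] = X["sex"].map({"female": 0, "male": 1})
    # Coerce numeric-looking columns; unparseable entries become NaN.
    X["age"] = pd.to_numeric(X["age"], errors="coerce")
    X["fare"] = pd.to_numeric(X["fare"], errors="coerce")
    # Assumption: fill gaps with the column median rather than 0.
    X = X.fillna(X.median())
    return X.astype(float)


# The five rows displayed in Out[21], restricted to the columns used here.
sample = pd.DataFrame({
    "pclass": [1, 1, 1, 1, 1],
    "survived": [1, 1, 0, 0, 0],
    "sex": ["female", "male", "female", "male", "female"],
    "age": [29.0, 0.9167, 2.0, 30.0, 25.0],
    "sibsp": [0, 1, 1, 1, 1],
    "parch": [0, 2, 2, 2, 2],
    "fare": [211.3375, 151.55, 151.55, 151.55, 151.55],
})

features = prepare_features(sample)
print(features)
```

On the full frame the same call would be `prepare_features(df)`. Note that once `df.fillna(0, inplace=True)` has run, missing ages are already 0 and the median step no longer sees them, so this sketch would need to be applied before that line.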
In[23]:
print(df)
      pclass  survived                                             name  \
0          1         1                    Allen, Miss. Elisabeth Walton
1          1         1                   Allison, Master. Hudson Trevor
2          1         0                     Allison, Miss. Helen Loraine
3          1         0             Allison, Mr. Hudson Joshua Creighton
4          1         0  Allison, Mrs. Hudson J C (Bessie Waldo Daniels)
...      ...       ...                                              ...
1304       3         0                             Zabour, Miss. Hileni
1305       3         0                            Zabour, Miss. Thamine
1306       3         0                        Zakarian, Mr. Mapriededer
1307       3         0                              Zakarian, Mr. Ortin
1308       3         0                               Zimmerman, Mr. Leo

         sex      age  sibsp  parch  ticket      fare    cabin  embarked  boat  \
0     female  29.0000      0      0   24160  211.3375       B5         S     2
1       male   0.9167      1      2  113781  151.5500  C22 C26         S    11
2     female   2.0000      1      2  113781  151.5500  C22 C26         S     0
3       male  30.0000      1      2  113781  151.5500  C22 C26         S     0
4     female  25.0000      1      2  113781  151.5500  C22 C26         S     0
...      ...      ...    ...    ...     ...       ...      ...       ...   ...
1304  female  14.5000      1      0    2665   14.4542        0         C     0
1305  female   0.0000      1      0    2665   14.4542        0         C     0
1306    male  26.5000      0      0    2656    7.2250        0         C     0
1307    male  27.0000      0      0    2670    7.2250        0         C     0
1308    male  29.0000      0      0  315082    7.8750        0         S     0

       body                        home.dest
0       0.0                     St Louis, MO
1       0.0  Montreal, PQ / Chesterville, ON
2       0.0  Montreal, PQ / Chesterville, ON
3     135.0  Montreal, PQ / Chesterville, ON
4       0.0  Montreal, PQ / Chesterville, ON
...     ...                              ...
1304  328.0                                0
1305    0.0                                0
1306  304.0                                0
1307    0.0                                0
1308    0.0                                0

[1309 rows x 14 columns]
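The two parts the question asks for could be approached roughly as follows. This is a minimal sketch, not a definitive solution: the feature list, the helper names (`encode`, `cluster_passengers`, `fit_survival_classifier`), the choice of k = 2 clusters, and the use of logistic regression as the "simple" classifier are all assumptions. The demo frame contains only the ten rows visible in the `print(df)` output above, so any score it produces says nothing about the full dataset.

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Assumed feature selection (not given in the question).
FEATURES = ["pclass", "sex", "age", "sibsp", "parch", "fare"]


def encode(df):
    """Numeric feature matrix; sex is mapped to 0/1 (hypothetical helper)."""
    X = df[FEATURES].copy()
    X["sex"] = X["sex"].map({"female": 0, "male": 1})
    return X.astype(float)


def cluster_passengers(X, k=2, seed=0):
    """Unsupervised part: scale the features, then assign KMeans cluster labels."""
    model = make_pipeline(
        StandardScaler(),
        KMeans(n_clusters=k, n_init=10, random_state=seed),
    )
    return model.fit_predict(X)


def fit_survival_classifier(X, y, test_size=0.25, seed=0):
    """Supervised part: stratified hold-out split plus logistic regression."""
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=test_size, random_state=seed, stratify=y
    )
    clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
    clf.fit(X_tr, y_tr)
    return clf, clf.score(X_te, y_te)  # accuracy on the held-out rows


# The ten rows visible in the print(df) output (after fillna(0), so a
# missing age appears as 0.0 here).
sample = pd.DataFrame({
    "pclass": [1, 1, 1, 1, 1, 3, 3, 3, 3, 3],
    "survived": [1, 1, 0, 0, 0, 0, 0, 0, 0, 0],
    "sex": ["female", "male", "female", "male", "female",
            "female", "female", "male", "male", "male"],
    "age": [29.0, 0.9167, 2.0, 30.0, 25.0, 14.5, 0.0, 26.5, 27.0, 29.0],
    "sibsp": [0, 1, 1, 1, 1, 1, 1, 0, 0, 0],
    "parch": [0, 2, 2, 2, 2, 0, 0, 0, 0, 0],
    "fare": [211.3375, 151.55, 151.55, 151.55, 151.55,
             14.4542, 14.4542, 7.225, 7.225, 7.875],
})

X = encode(sample)
y = sample["survived"]

labels = cluster_passengers(X, k=2)
# Compare cluster membership with the known outcome to describe the clusters.
print(pd.crosstab(labels, y, rownames=["cluster"]))

# test_size=0.5 only because the demo frame is tiny; the default is 0.25.
clf, accuracy = fit_survival_classifier(X, y, test_size=0.5)
print("held-out accuracy:", accuracy)
```

With the real data, `encode(df)` and `df["survived"]` would replace the sample frame. Scaling before KMeans is a design choice here: without it, fare would dominate the Euclidean distances because its range is much larger than that of pclass or sibsp.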