Answered step by step
Verified Expert Solution
Question
1 Approved Answer
To begin, load the Titanic dataset from Seaborn and focus on the relevant variables. The target variable for this analysis is 'survived', which indicates if
To begin, load the Titanic dataset from Seaborn and focus on the relevant variables. The target variable for this analysis is 'survived', which indicates if a passenger survived or died The other variables will serve as input features.
In a Markdown cell, identify the categorical and numerical features by examining the possible values each feature can assume. Remember, there are four categorical and four numerical features.
Next, partition the data into training and testing sets, ensuring to stratify based on the target variable to maintain the distribution. Extract of the data for testing. The resulting DataFrames should be named dftrain and dftest. Print the dimensions of these datasets.
Using the training data, calculate the probability of each target label, Pyk
For the categorical features in the training data, calculate and show Pxjyk the probability of each feature given the target label.
Identify and display the means and standard deviations of the continuous features. The probability of observing any value for these features is assumed to follow the Normal Distribution:
PxjykNxjxj
where j and j are calculated from each feature j and target label yk
Construct a prediction function that calculates:
Pykxiargmaxk in Pyktimes jfPxjiyk
where yk means Died and yk means Survived. This function should calculate the above expression for k and k and return both predictions for each instance considered.
def calculateclasspredictionsdata returnnormedprobs False:
predictedprobsperinstance npemptyshape
## Construct function here
predictedprobsperinstance npvstackpredictedprobsperinstance, nparrayprobdied, probsurvived
returnpredictedprobsperinstance
Using your function, calculate the predictions from the training data and then compute the confusion matrix and accuracy score.
Repeat the prediction using the test data and calculate the confusion matrix and accuracy score for this set as well.
Step by Step Solution
There are 3 Steps involved in it
Step: 1
Get Instant Access to Expert-Tailored Solutions
See step-by-step solutions with expert insights and AI powered tools for academic success
Step: 2
Step: 3
Ace Your Homework with AI
Get the answers you need in no time with our AI-driven, step-by-step assistance
Get Started