Answered step by step
Verified Expert Solution
Link Copied!

Question

1 Approved Answer

To begin, load the Titanic dataset from Seaborn and focus on the relevant variables. The target variable for this analysis is 'survived', which indicates if

To begin, load the Titanic dataset from Seaborn and focus on the relevant variables. The target variable for this analysis is 'survived', which indicates if a passenger survived (1) or died (0). The other variables will serve as input features.
In a Markdown cell, identify the categorical and numerical features by examining the possible values each feature can assume. Remember, there are four categorical and four numerical features.
Next, partition the data into training and testing sets, ensuring to stratify based on the target variable to maintain the distribution. Extract 20% of the data for testing. The resulting DataFrames should be named df_train and df_test. Print the dimensions of these datasets.
Using the training data, calculate the probability of each target label, P(yk).
For the categorical features in the training data, calculate and show P(xjyk), the probability of each feature given the target label.
Identify and display the means and standard deviations of the continuous features. The probability of observing any value for these features is assumed to follow the Normal Distribution:
P(xjyk)=N(xj,xj)
where j and j are calculated from each feature j and target label yk.
Construct a prediction function that calculates:
P(ykx(i))=argmaxk in {0,1}[P(yk)\times jfP(xj(i)yk)]
where yk=0 means Died and yk=1 means Survived. This function should calculate the above expression for k=0 and k=1, and return both predictions for each instance considered.
def calculate_class_predictions(data, return_normed_probs= False):
predicted_probs_per_instance = np.empty(shape=(0,2))
## Construct function here
predicted_probs_per_instance = np.vstack([predicted_probs_per_instance, np.array(prob_died, prob_survived)])
return(predicted_probs_per_instance)
Using your function, calculate the predictions from the training data and then compute the confusion matrix and accuracy score.
Repeat the prediction using the test data and calculate the confusion matrix and accuracy score for this set as well.

Step by Step Solution

There are 3 Steps involved in it

Step: 1

blur-text-image

Get Instant Access to Expert-Tailored Solutions

See step-by-step solutions with expert insights and AI powered tools for academic success

Step: 2

blur-text-image

Step: 3

blur-text-image

Ace Your Homework with AI

Get the answers you need in no time with our AI-driven, step-by-step assistance

Get Started

Recommended Textbook for

MySQL/PHP Database Applications

Authors: Jay Greenspan, Brad Bulger

1st Edition

978-0764535376

More Books

Students also viewed these Databases questions

Question

What does stickiest refer to in regard to social media

Answered: 1 week ago

Question

2. Are my sources up to date?

Answered: 1 week ago