
Question


(5 pt) The following correlation matrix was created from the Sleep-In-Mammals dataset. There are several strong or medium-strength relationships between these features. Look at Exposure (rated from 1 to 5, with 5 being most exposed) and its correlation with Gestation and TotalSleep. Which of the following statements is most correct?

Animals that are most exposed (those that do not sleep in dens) tend to need more sleep.

Animals that are most exposed tend to have longer gestation periods, keeping their young inside the mother longer for protection.

ANS:

Given the following code:

from pandas import Series, DataFrame

tests = ['Test 1', 'Midterm', 'Test 2', 'Final']

student1 = Series([80, 92, 95, 83], index=tests)

student2 = Series([92, 88, 99, 80], index=tests)

student3 = Series([87, 91, 85, 93], index=tests)

tests_df = DataFrame({'S1': student1, 'S2': student2, 'S3': student3})

(6 pts) What command will give the mean score of each of the exams using one DataFrame function call?

ANS:

(2 pts) Display the results

ANS:
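As a minimal sketch of one way to approach this, assuming the exam names and column labels are meant to be strings (the quoting is an assumption), the frame has one row per exam and one column per student, so averaging across columns yields one mean per exam. The variable name `exam_means` is introduced here for illustration only.

```python
from pandas import Series, DataFrame

# Exam names form the row index; each student is one column.
tests = ["Test 1", "Midterm", "Test 2", "Final"]
student1 = Series([80, 92, 95, 83], index=tests)
student2 = Series([92, 88, 99, 80], index=tests)
student3 = Series([87, 91, 85, 93], index=tests)
tests_df = DataFrame({"S1": student1, "S2": student2, "S3": student3})

# axis=1 averages across the columns (students), giving one value per row (exam).
exam_means = tests_df.mean(axis=1)

# Display the results.
print(exam_means)
```

Calling `tests_df.mean()` with the default `axis=0` would instead return one mean per student, which is a common source of confusion with this layout.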

(16 pt) The K-means clustering of the Iris Dataset in Lab 5: PartA produced the confusion matrix shown below. The flower species are Versicolor (0), Setosa (1), and Virginica (2).

Row 0: Actual Versicolor

Row 1: Actual Setosa

Row 2: Actual Virginica

Confusion matrix (rows: Actual Class, columns: Predicted Class)

Actual \ Predicted | Versicolor | Setosa | Virginica | Total
Versicolor         | 48         | 0      | 2         | C1: 50
Setosa             | 0          | 50     | 0         | C2: 50
Virginica          | 14         | 0      | 36        | C3: 50
Total              | 62         | 50     | 38        | 150

You may leave the answers as fractions.

Please calculate the following stats:

TC1: True Versicolor =

TC2: True Setosa =

TC3: True Virginica =

FC1C2: Setosa Classified as Versicolor =

FC1C3: Virginica Classified as Versicolor =

FC2C1: Versicolor Classified as Setosa =

FC2C3: Virginica Classified as Setosa =

FC3C1: Versicolor Classified as Virginica =

FC3C2: Setosa Classified as Virginica =

Sensitivity or Recall: TPR

TC1R: True Versicolor Rate =

TC2R: True Setosa Rate =

TC3R: True Virginica Rate =

Precision: PPV

C1PV: Versicolor Predictive Value =

C2PV: Setosa Predictive Value =

C3PV: Virginica Predictive Value =

Accuracy: ACC =
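The per-class statistics requested above can be computed mechanically from the confusion matrix. The following is a minimal sketch, assuming the usual definitions (recall = diagonal count divided by the row total, precision = diagonal count divided by the column total, accuracy = sum of the diagonal divided by the grand total). The variable names are illustrative, and `Fraction` is used so results stay as exact fractions.

```python
from fractions import Fraction

labels = ["Versicolor", "Setosa", "Virginica"]

# Rows are actual classes, columns are predicted classes.
cm = [
    [48, 0, 2],    # actual Versicolor
    [0, 50, 0],    # actual Setosa
    [14, 0, 36],   # actual Virginica
]

n = len(labels)
row_totals = [sum(row) for row in cm]                            # actual counts per class
col_totals = [sum(cm[i][j] for i in range(n)) for j in range(n)]  # predicted counts per class
total = sum(row_totals)

# Recall (TPR): correctly classified members of a class / all actual members of that class.
recall = {labels[i]: Fraction(cm[i][i], row_totals[i]) for i in range(n)}

# Precision (PPV): correctly classified members of a class / everything predicted as that class.
precision = {labels[j]: Fraction(cm[j][j], col_totals[j]) for j in range(n)}

# Accuracy: all correct predictions / all samples.
accuracy = Fraction(sum(cm[i][i] for i in range(n)), total)

for name in labels:
    print(f"{name}: recall={recall[name]}, precision={precision[name]}")
print(f"accuracy={accuracy}")
```

The off-diagonal entries `cm[i][j]` are the misclassification counts: the number of samples of actual class `i` that were assigned to class `j`.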

(8 pt) Which Machine Learning algorithm uses the Expectation-Maximization (EM) algorithm? List the high-level steps for this machine-learning algorithm. Just copy the algorithm from the PDSH book PDF file.

ANS:
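One algorithm commonly described in expectation-maximization terms is k-means (Gaussian mixture models are another). The sketch below is an illustrative NumPy version, not the book's code: guess some centers, then repeat an E-step (assign each point to its nearest center) and an M-step (move each center to the mean of its assigned points) until the centers stop changing. The function name `kmeans_em`, its parameters, and the toy data are assumptions made for this example.

```python
import numpy as np

def kmeans_em(X, init_centers, max_iters=100):
    """Illustrative k-means written as alternating E and M steps."""
    X = np.asarray(X, dtype=float)
    centers = np.asarray(init_centers, dtype=float).copy()
    for _ in range(max_iters):
        # E-step: assign every point to the nearest current center.
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # M-step: move each center to the mean of the points assigned to it.
        new_centers = centers.copy()
        for j in range(len(centers)):
            members = X[labels == j]
            if len(members) > 0:  # keep the old center if a cluster is empty
                new_centers[j] = members.mean(axis=0)
        # Stop once the centers no longer move.
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers, labels

# Toy data: two well-separated groups of three points each.
X = np.array([[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]], dtype=float)
centers, labels = kmeans_em(X, init_centers=X[[0, 3]])
print(centers)
print(labels)
```

Passing the initial centers explicitly keeps the example deterministic. In practice the initial guess is usually random, and the result can depend on it.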

(6 pts) One way to reduce the number of features in a dataset for which you are performing linear regression is to use backward stepwise regression. You start out with a full model and systematically drop features one at a time from the model using a logical procedure. We used this feature reduction method along with statsmodels OLS regression in Lab 6. Which of the statistics returned from OLS was used to decide which feature to drop?

Drop the feature with the highest R-squared value.

Drop the feature with the lowest MSE value.

Drop the feature with the highest coefficient p-value.

ANS:

(6 pts) A statistical model is developed by training the machine learning algorithm using training data. In most cases, this is just a subset of all the possible data for the problem for which the model is being developed. We want to develop a model that also works well with unseen data, called test data. The models that we build can overfit or underfit the data. With this in mind, which of the following statements is false:

Using Ridge regression versus linear regression on a dataset can improve predictability by reducing overfitting.

A model that underfits the data does not predict well on the test data and does not predict well on the training data because there are too many features in the model.

Overfitting occurs when the model fits too well on the training data, and the model does not fit as well on the test data.

Eliminating some of a model's less significant features can help reduce overfitting.

ANS:
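The overfitting and regularization ideas in these statements can be explored numerically. The sketch below is an illustration under stated assumptions, not course material: it fits a degree-9 polynomial to 15 noisy points with ordinary least squares and with closed-form ridge regression, then reports training and test error for each. The target function, noise level, degree, and penalty `lam` are arbitrary choices, and the ridge penalty is applied to every coefficient including the intercept for simplicity.

```python
import numpy as np

rng = np.random.default_rng(0)

def poly_features(x, degree):
    # Columns are x**0, x**1, ..., x**degree.
    return np.vander(x, degree + 1, increasing=True)

def mse(A, w, y):
    return float(np.mean((A @ w - y) ** 2))

# Small noisy training set and a separate test set from the same process.
x_train = np.linspace(-1, 1, 15)
x_test = np.linspace(-0.95, 0.95, 50)
y_train = np.sin(np.pi * x_train) + rng.normal(scale=0.3, size=x_train.shape)
y_test = np.sin(np.pi * x_test) + rng.normal(scale=0.3, size=x_test.shape)

degree = 9  # many features relative to 15 samples, so overfitting is likely
A_train = poly_features(x_train, degree)
A_test = poly_features(x_test, degree)

# Ordinary least squares: minimizes training error only.
w_ols, *_ = np.linalg.lstsq(A_train, y_train, rcond=None)

# Ridge: adds lam * ||w||^2 to the objective, which shrinks the coefficients.
lam = 1.0
w_ridge = np.linalg.solve(
    A_train.T @ A_train + lam * np.eye(degree + 1), A_train.T @ y_train
)

print("OLS   train/test MSE:", mse(A_train, w_ols, y_train), mse(A_test, w_ols, y_test))
print("Ridge train/test MSE:", mse(A_train, w_ridge, y_train), mse(A_test, w_ridge, y_test))
print("coefficient norms (OLS, ridge):", np.linalg.norm(w_ols), np.linalg.norm(w_ridge))
```

By construction, OLS never has a higher training error than ridge on the same features, and ridge never has a larger coefficient norm. Whether ridge improves the test error depends on the data and on `lam`.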
