Question

Posted on Nov 23, 2022

1. Regarding bias and variance, which of the following statements are true? (Here ‘high’ and ‘low’ are relative to the ideal model.)

Models which underfit have a high variance.

Models which underfit have a low variance.

Models which overfit have a high bias.

Models which overfit have a low bias.
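The four statements above can be checked empirically with a small simulation. The following is a minimal sketch under stated assumptions, not part of the original question: the true function `true_f`, the noise level, the fixed input grid, and the two stand-in models (a constant-mean predictor for underfitting, a 1-nearest-neighbour predictor for overfitting) are all hypothetical choices made for illustration.

```python
import random
import statistics


def true_f(x):
    # Hypothetical "ideal model": a simple quadratic (an assumption).
    return x * x


def simulate(n_repeats=2000, noise_sd=0.3, seed=0):
    rng = random.Random(seed)
    xs = [i / 10 - 1.0 for i in range(21)]  # fixed inputs on [-1, 1]
    x0 = xs[-1]                              # evaluate predictions at x = 1.0
    underfit_preds, overfit_preds = [], []
    for _ in range(n_repeats):
        # Draw a fresh noisy training set from the same true function.
        ys = [true_f(x) + rng.gauss(0.0, noise_sd) for x in xs]
        # Underfit stand-in: ignore x and predict the mean response.
        underfit_preds.append(statistics.fmean(ys))
        # Overfit stand-in: 1-nearest-neighbour at a training point
        # returns that point's own noisy label.
        overfit_preds.append(ys[-1])

    def summarize(preds):
        bias = statistics.fmean(preds) - true_f(x0)
        return {"bias_sq": bias ** 2, "variance": statistics.pvariance(preds)}

    return {"underfit": summarize(underfit_preds),
            "overfit": summarize(overfit_preds)}


if __name__ == "__main__":
    for name, stats in simulate().items():
        print(name, stats)
```

In this setup the constant model's predictions barely move between training sets but sit far from the true value, while the 1-nearest-neighbour predictions centre on the true value but scatter with the noise.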

2. Some of the problems below are best addressed using a supervised learning algorithm and the others with an unsupervised learning algorithm. Assuming that an appropriate dataset is available for your algorithm to learn from, to which of the following could you apply supervised learning given only what is provided? (Select all that apply.)

Answer,

Given historical data of children's ages and heights, predict children's height as a function of their age.

In farming, given data on crop yields over the last 45 years, predict next year's crop yields.

Examine a large collection of emails that are known to be spam email to discover if there are sub-types of spam mail.

Given a large dataset of medical records from patients suffering from heart disease, try to learn whether there might be different clusters of such patients for which we might tailor separate treatments.
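The distinction the question turns on is whether the data comes with a response to predict. A minimal sketch, assuming made-up numbers: `fit_line` is a supervised method because it learns from (input, response) pairs, while `kmeans_1d` is unsupervised because it only sees unlabeled values and looks for groups. Both function names and all data values are hypothetical.

```python
def fit_line(xs, ys):
    # Supervised: ordinary least squares on labeled pairs (x, y).
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def kmeans_1d(values, n_iter=10):
    # Unsupervised: two-cluster k-means on unlabeled numbers.
    centers = [min(values), max(values)]
    for _ in range(n_iter):
        groups = [[], []]
        for v in values:
            nearest = 0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1
            groups[nearest].append(v)
        # Keep the old center if a cluster happens to be empty.
        centers = [sum(g) / len(g) if g else c
                   for g, c in zip(groups, centers)]
    return sorted(centers)


if __name__ == "__main__":
    # Made-up ages (years) and heights (cm): a response is available.
    slope, intercept = fit_line([2, 4, 6, 8], [86, 100, 114, 128])
    print("predicted height at age 10:", slope * 10 + intercept)
    # Made-up unlabeled measurements: no response, only structure to find.
    print("cluster centers:", kmeans_1d([1.0, 1.2, 0.8, 5.0, 5.2, 4.8]))
```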

3. Consider the Wage Data described on pages 1 and 2 of the ISLR textbook. Suppose we developed a model to predict Wage solely using the trends visible in Figure 1.1 based on an employee's Age, Year, and Education Level. What is the most likely order for the predicted wages for the following individuals (from highest to lowest)?

A. Age 70, Year 2006, Education Level 3

B. Age 50, Year 2008, Education Level 4

C. Age 30, Year 2004, Education Level 4

Options:

A,B,C

B,A,C

A,C,B

C,A,B

C,B,A

B,C,A

4. According to the K-nearest neighbors method, ______________

when K=1, the KNN training error is 0

a model with K=4 is more flexible than one with K=2.

in classification problems, as model flexibility increases, the training error rate consistently increases.

as K rises, the statistical method becomes more flexible
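How K affects training error can be seen on a toy data set. This is a minimal sketch, assuming a hypothetical one-dimensional data set (`XS`, `LABELS`) with distinct inputs and a hand-written classifier. It is not a library implementation, and distance ties are broken by index purely for determinism.

```python
from collections import Counter

# Hypothetical toy data: distinct 1-D inputs with binary labels.
XS = [0, 1, 2, 3, 4, 5, 6, 7]
LABELS = [0, 0, 1, 0, 1, 1, 0, 1]


def knn_predict(x, k):
    # Rank training points by distance to x (ties broken by index),
    # then take a majority vote among the k closest labels.
    order = sorted(range(len(XS)), key=lambda i: (abs(XS[i] - x), i))
    votes = Counter(LABELS[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def training_error(k):
    # Fraction of training points that the classifier gets wrong.
    wrong = sum(knn_predict(x, k) != y for x, y in zip(XS, LABELS))
    return wrong / len(XS)


if __name__ == "__main__":
    for k in (1, 3):
        print("K =", k, "training error =", training_error(k))
```

With K=1 every training point is its own nearest neighbour, so the training error here is 0.0, which holds whenever no two identical inputs carry different labels. With K=3 the vote is smoothed over neighbours and the training error on this data rises to 0.375, reflecting a less flexible fit.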

5. Which of the following is a TRUE statement about the bias-variance tradeoff?

The test mean squared error (MSE) almost always takes an inverted U shape with respect to model flexibility (it increases first with increasing flexibility, then starts decreasing).

In order to minimize the expected test error (MSE), we need to select a statistical learning method that simultaneously achieves low variance and low bias.

If the true data generating process is linear (the true model f is linear), then applying a very flexible model to the training data set will always generate a lower test MSE than the linear model.

If we use a very flexible model in the training data, we guarantee lower bias in the test data, hence we always get the minimum possible test mean squared error (MSE) with the most flexible model due to lower bias in the test data.
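For reference when weighing these statements, the expected test MSE at a point decomposes into three parts. The notation below follows the ISLR textbook rather than the question itself: \hat{f} is the fitted model, (x_0, y_0) is a test observation, and \varepsilon is the irreducible error term.

```latex
E\left[\left(y_0 - \hat{f}(x_0)\right)^2\right]
  = \operatorname{Var}\left(\hat{f}(x_0)\right)
  + \left[\operatorname{Bias}\left(\hat{f}(x_0)\right)\right]^2
  + \operatorname{Var}(\varepsilon)
```

All three terms are nonnegative, so the expected test MSE cannot fall below Var(\varepsilon), and added flexibility lowers it only while the drop in squared bias outweighs the rise in variance.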

Step by Step Solution

Step: 1

The detailed answer for the above question is provided below.

Answer 1: Models which underfit have a high variance. Variance is a measure of how far a set of numbers is spread out from their average val...