Question

Predicting Housing Median Prices. The file BostonHousing.xls contains information on over 500 census tracts in Boston, where for each tract 14 variables are recorded. The last column (CAT.MEDV) was derived from MEDV, such that it obtains the value 1 if MEDV>30 and 0 otherwise. Consider the goal of predicting the median value (MEDV) of a tract, given the information in the first 13 columns.

Partition the data into training (60%) and validation (40%) sets.

a. Perform a k-NN prediction with all 13 predictors (ignore the CAT.MEDV column), trying values of k from 1 to 5. Make sure to normalize the data (click “normalize input data”). What is the best k chosen? What does it mean?

b. Predict the MEDV for a tract with the following information, using the best k:

(Copy this table with the column names to a new worksheet and then in “Score new data” choose “from worksheet.”)

c. Why is the error of the training data zero?

d. Why is the validation data error overly optimistic compared to the error rate when applying this k-NN predictor to new data?

e. If the purpose is to predict MEDV for several thousand new tracts, what would be the disadvantage of using k-NN prediction? List the operations that the algorithm goes through in order to produce each prediction.

(Shmueli 146-147)

Variable   Value
CRIM       0.2
ZN         0
INDUS      7
CHAS       0
NOX        0.538
RM         6
AGE        62
DIS        4.7
RAD        4
TAX        307
PTRATIO    21
B          360
LSTAT      10

Step by Step Solution

Step: 1

k-Nearest Neighbor prediction method: in this method, the training set is used to predict the value of ...
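The explanation above is cut off, so what follows is only a minimal sketch of how k-NN prediction generally works, not the XLMiner procedure the question asks for. It runs on synthetic stand-in data rather than BostonHousing.xls, and the helper names (zscore_params, normalize, knn_predict, rmse) are hypothetical. It partitions the data 60/40, normalizes with training-set statistics, and compares k = 1 to 5 by validation RMSE.

```python
import math
import random

def zscore_params(rows):
    """Column means and standard deviations, computed on the given rows."""
    n = len(rows)
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    # Fall back to 1.0 for a constant column to avoid dividing by zero.
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / n) or 1.0
            for c, m in zip(cols, means)]
    return means, stds

def normalize(row, means, stds):
    """Rescale one row so every predictor is on a comparable scale."""
    return [(v - m) / s for v, m, s in zip(row, means, stds)]

def knn_predict(train_x, train_y, query, k):
    """Average the targets of the k training rows closest to `query`."""
    # 1. Compute the distance from the query to every training row.
    dists = [(math.dist(x, query), y) for x, y in zip(train_x, train_y)]
    # 2. Sort by distance, 3. keep the k nearest, 4. average their targets.
    dists.sort(key=lambda pair: pair[0])
    return sum(y for _, y in dists[:k]) / k

def rmse(train_x, train_y, xs, ys, k):
    """Root mean squared error of k-NN predictions for the rows in xs."""
    errs = [(knn_predict(train_x, train_y, x, k) - y) ** 2
            for x, y in zip(xs, ys)]
    return math.sqrt(sum(errs) / len(errs))

# Synthetic stand-in data (NOT the Boston housing file): 2 predictors, 1 target.
random.seed(0)
X = [[random.uniform(0, 10), random.uniform(0, 500)] for _ in range(200)]
y = [3 * a + 0.01 * b + random.gauss(0, 1) for a, b in X]

# 60% training / 40% validation partition.
cut = int(0.6 * len(X))
means, stds = zscore_params(X[:cut])  # statistics from training rows only
train_x = [normalize(r, means, stds) for r in X[:cut]]
valid_x = [normalize(r, means, stds) for r in X[cut:]]
train_y, valid_y = y[:cut], y[cut:]

train_rmse_k1 = rmse(train_x, train_y, train_x, train_y, 1)
valid_rmse = {k: rmse(train_x, train_y, valid_x, valid_y, k) for k in range(1, 6)}
best_k = min(valid_rmse, key=valid_rmse.get)
print("training RMSE at k=1:", train_rmse_k1)
print("best k on validation:", best_k)
```

With k = 1, each training record is its own nearest neighbor, so the training error comes out as zero. Every new prediction repeats the full distance computation and sort against all training rows, which is the per-prediction cost that part e asks about.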
