Answered step by step
Verified Expert Solution
Link Copied!

Question

1 Approved Answer

7.3 Predicting Housing Median Prices. The file BostonHousing.csv contains information on over 500 census tracts in Boston, where for each tract multiple variables are recorded.

image text in transcribedimage text in transcribed

7.3 Predicting Housing Median Prices. The file BostonHousing.csv contains information

on over 500 census tracts in Boston, where for each tract multiple variables

are recorded. The last column (CAT.MEDV) was derived from MEDV, such that it

obtains the value 1 if MEDV > 30 and 0 otherwise. Consider the goal of predicting

the median value (MEDV) of a tract, given the information in the first 12 columns.

Partition the data into training (60%) and validation (40%) sets.

a. Perform a k-NN prediction with all 12 predictors (ignore the CAT.MEDV column),

trying values of k from 1 to 5. Make sure to normalize the data, and choose

function knn() from package class rather than package FNN. To make sure R is

using the class package (when both packages are loaded), use class::knn(). What

is the best k? What does it mean?

b. Predict the MEDV for a tract with the following information, using the best k:

CRIM ZN INDUS CHAS NOX RM AGE DIS RAD TAX PTRATIO LSTAT

0.2 0 7 0 0.538 6 62 4.7 4 307 21 10

c. If we used the above k-NN algorithm to score the training data, what would be

the error of the training set?

d. Why is the validation data error overly optimistic compared to the error rate when

applying this k-NN predictor to new data?

e. If the purpose is to predict MEDV for several thousands of new tracts, what would

be the disadvantage of using k-NN prediction? List the operations that the algorithm

goes through in order to produce each prediction.

...Please answer with R code explained, below is a link to the spreadsheet, should be saved to excel as csv, for the sake of R

https://docs.google.com/spreadsheets/d/1QMlmZsaijNvZ8UUituKjlCQ-IzpJ5brSop-WJvk9iZo/edit?usp=sharing

image text in transcribedimage text in transcribed
-.-----l, "i\"'D """"'\" V- \"' \""\"' '- 'V V' "\""\"' \""*" "' \""\"\""'\"" u" \"""'7 \"\"w "\""'"" inction karmO from package class rather than package FNN. To make sure R is using the class package (when both packages are loaded), use class::knn(). What is the best Is? What does it mean? b. Predict the MEDV for a tract with the following information, using the best k: CRIM ZN INDUS CHAS NOX RM AGE DIS RAD TAX PTRATIO LSTAT 0.2 0 7 0 0.538 6 62 4.7 4 307 21 10 c. If we used the above kNN algorithm to score the training data, what would be the error of the training set? (1. Why is the validation data error overly optimistic compared to the error rate when applying this kNN predictor to new data? e. If the purpose is to predict MEDV for several thousands of new tracts, what would be the disadvantage of using kNN prediction? List the operations that the algo rithm goes through in order to produce each prediction. 7.3 differences and their reason. Predicting Housing Median Prices. The file BostonHousing.csv contains infor mation on over 500 census tracts in Boston, where for each tract multiple variables are recorded. The last column (CATMEDV) was derived from MEDV, such that it obtains the value 1 if MEDV > 30 and 0 otherwise. Consider the goal of predicting the median value (MEDV) of a tract, given the information in the rst 12 columns. Partition the data into training (60%) and validation (40%) sets. 3. Perform a kNN prediction with all 12 predictors (ignore the CATMEDV col umn), trying values of is from 1 to 5. Make sure to normalize the data, and choose function [em/LO from package class rather than package FNN. To make sure R is using the class package (when both packages are loaded), use class::knn(). What is the best Is? What does it mean? b. Predict the MEDV for a tract with the following information, using the best is: CRIM ZN INDUS CHAS NOX RM AGE DIS RAD TAX PTRATIO LSTAT 0.2 0 7 0 0.538 6 62 4.7 4 307 21 10 c. If we used the above kNN algorithm to score the training data, what would be

Step by Step Solution

There are 3 Steps involved in it

Step: 1

blur-text-image

Get Instant Access to Expert-Tailored Solutions

See step-by-step solutions with expert insights and AI powered tools for academic success

Step: 2

blur-text-image

Step: 3

blur-text-image

Ace Your Homework with AI

Get the answers you need in no time with our AI-driven, step-by-step assistance

Get Started

Recommended Textbook for

Paradoxes In Mathematics

Authors: Stanley J Farlow

1st Edition

0486791734, 9780486791739

More Books

Students also viewed these Mathematics questions

Question

What factors contribute to (or cause) inventory shrinkage? 55

Answered: 1 week ago