Question

... Cosine similarity is often a good choice when dealing with sparse non-binary data. What target level would a 3-NN model using cosine similarity return for the query?

4. (Exercise 3 of Chapter 5) The predictive task in this question is to predict the level of corruption in a country based on a range of macro-economic and social features. The table below lists some countries described by the following descriptive features:

(1) "Life Exp.", the mean life expectancy at birth;
(2) "Top-10 Income", the percentage of the annual income of the country that goes to the top 10% of earners;
(3) "Infant Mort.", the number of infant deaths per 1,000 births;
(4) "Mil. Spend", the percentage of GDP spent on the military;
(5) "School Years", the mean number of years spent in school by adult females.

(Note: consider using Excel formulas to simplify your calculations; a spreadsheet of the data is uploaded to eCourse with this homework.)

The target feature is the Corruption Perception Index (CPI). The CPI measures the perceived levels of corruption in the public sector of countries and ranges from 0 (highly corrupt) to 10 (very clean).

Country ID    Life Exp.  Top-10 Income  Infant Mort.  Mil. Spend  School Years  CPI
Afghanistan   59.61      23.21          74.30         4.44         0.40         1.5171
Haiti         45.00      47.67          73.10         0.09         3.40         1.7999
Nigeria       51.30      38.23          82.60         1.07         4.10         2.4493
Egypt         70.48      26.58          19.60         1.86         5.30         2.8622
Argentina     75.77      32.30          13.30         0.76        10.10         2.9961
China         74.87      29.98          13.70         1.95         6.40         3.6356
Brazil        73.12      42.93          14.50         1.43         7.20         3.7741
Israel        81.30      28.80           3.60         6.77        12.50         5.8069
U.S.A         78.51      29.85           6.30         4.72        13.70         7.1357
Ireland       80.15      27.23           3.50         0.60        11.50         7.5360
U.K.          80.09      28.49           4.40         2.59        13.00         7.7751
Germany       80.24      22.07           3.50         1.31        12.00         8.0461
Canada        80.99      24.79           4.90         1.42        14.20         8.6725
Australia     82.09      25.40           4.20         1.86        11.50         8.8442
Sweden        81.43      22.18           2.40         1.27        12.80         9.2985
New Zealand   80.67      27.81           4.90         1.13        12.30         9.4627

We will use Russia as our query country for this question. The table below lists the descriptive features for Russia.

Country       Life Exp.  Top-10 Income  Infant Mort.  Mil. Spend  School Years  CPI
Russia        67.62      31.68          10.00         3.87        12.90         ?

(a) What value would a 3-nearest neighbor prediction model using Euclidean distance return for the CPI of Russia?

(b) What value would a weighted k-NN prediction model return for the CPI of Russia? Use k = 16 (i.e., the full dataset) and a weighting scheme of the reciprocal of the squared Euclidean distance between the neighbor and the query.

(c) The descriptive features in this dataset are of different types. For example, some are percentages, others are measured in years, and others are measured in counts per 1,000. We should always consider normalizing our data, but it is particularly important to do this when the descriptive features are measured in different units. What value would a 3-nearest neighbor prediction model using Euclidean distance return for the CPI of Russia when the descriptive features have been normalized using range normalization?

(d) What value would a weighted k-NN prediction model with k = 16 (i.e., the full dataset), using a weighting scheme of the reciprocal of the squared Euclidean distance between the neighbor and the query, return for the CPI of Russia when it is applied to the range-normalized data?

(e) The actual 2011 CPI for Russia was 2.4488. Which of the predictions made was the most accurate? Why do you think this was?
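The calculations in parts (a) to (d) can be sketched in a few lines of Python as an alternative to the spreadsheet. This is a minimal illustration under stated assumptions, not the textbook's worked solution: the names `DATA`, `RUSSIA`, `knn_predict`, `weighted_knn_predict`, and `range_normalize` are hypothetical, range normalization is assumed to rescale each feature to [0, 1] using the dataset minimum and maximum, and the rows from Brazil onward were recovered from column-wise runs in the source, so their row alignment (notably Canada versus Australia) is an assumption.

```python
import math

# Feature order: life exp., top-10 income, infant mort., mil. spend, school years.
# ASSUMPTION: rows from Brazil onward were rebuilt from column-wise runs in the
# source text, so their alignment (notably Canada vs. Australia) may be off.
DATA = {
    "Afghanistan": ([59.61, 23.21, 74.30, 4.44, 0.40], 1.5171),
    "Haiti": ([45.00, 47.67, 73.10, 0.09, 3.40], 1.7999),
    "Nigeria": ([51.30, 38.23, 82.60, 1.07, 4.10], 2.4493),
    "Egypt": ([70.48, 26.58, 19.60, 1.86, 5.30], 2.8622),
    "Argentina": ([75.77, 32.30, 13.30, 0.76, 10.10], 2.9961),
    "China": ([74.87, 29.98, 13.70, 1.95, 6.40], 3.6356),
    "Brazil": ([73.12, 42.93, 14.50, 1.43, 7.20], 3.7741),
    "Israel": ([81.30, 28.80, 3.60, 6.77, 12.50], 5.8069),
    "U.S.A": ([78.51, 29.85, 6.30, 4.72, 13.70], 7.1357),
    "Ireland": ([80.15, 27.23, 3.50, 0.60, 11.50], 7.5360),
    "U.K.": ([80.09, 28.49, 4.40, 2.59, 13.00], 7.7751),
    "Germany": ([80.24, 22.07, 3.50, 1.31, 12.00], 8.0461),
    "Canada": ([80.99, 24.79, 4.90, 1.42, 14.20], 8.6725),
    "Australia": ([82.09, 25.40, 4.20, 1.86, 11.50], 8.8442),
    "Sweden": ([81.43, 22.18, 2.40, 1.27, 12.80], 9.2985),
    "New Zealand": ([80.67, 27.81, 4.90, 1.13, 12.30], 9.4627),
}
# ASSUMPTION: the garbled "1 2.90" in the source is read as 12.90 school years.
RUSSIA = [67.62, 31.68, 10.00, 3.87, 12.90]


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(data, query, k):
    """Return (mean target of the k nearest rows, names of those rows)."""
    ranked = sorted(data.items(), key=lambda item: euclidean(item[1][0], query))
    nearest = ranked[:k]
    names = [name for name, _ in nearest]
    mean_target = sum(target for _, (_, target) in nearest) / k
    return mean_target, names


def weighted_knn_predict(data, query, k):
    """Average of the k nearest targets, weighted by 1 / squared distance.

    Assumes no row coincides exactly with the query (distance 0).
    """
    ranked = sorted(data.values(), key=lambda row: euclidean(row[0], query))[:k]
    weights = [1.0 / euclidean(features, query) ** 2 for features, _ in ranked]
    weighted_sum = sum(w * target for w, (_, target) in zip(weights, ranked))
    return weighted_sum / sum(weights)


def range_normalize(data, query):
    """Rescale every feature to [0, 1] using the dataset's min and max."""
    columns = list(zip(*(features for features, _ in data.values())))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]

    def scale(row):
        return [(v - lo) / (hi - lo) for v, lo, hi in zip(row, lows, highs)]

    normalized = {name: (scale(f), t) for name, (f, t) in data.items()}
    return normalized, scale(query)


pred_a, neighbours_a = knn_predict(DATA, RUSSIA, k=3)            # part (a)
pred_b = weighted_knn_predict(DATA, RUSSIA, k=len(DATA))         # part (b)
NORM, RUSSIA_NORM = range_normalize(DATA, RUSSIA)
pred_c, neighbours_c = knn_predict(NORM, RUSSIA_NORM, k=3)       # part (c)
pred_d = weighted_knn_predict(NORM, RUSSIA_NORM, k=len(NORM))    # part (d)

for label, value in [("a", pred_a), ("b", pred_b), ("c", pred_c), ("d", pred_d)]:
    print(f"({label}) predicted CPI: {value:.4f}")
```

For part (a), the three nearest neighbours on the raw data are Argentina, China, and the U.S.A, so the prediction is the mean of their CPI values, roughly 4.589. The results for the other parts depend on the row-alignment assumption noted above and should be checked against the spreadsheet supplied with the homework. Comparing each prediction with the actual value of 2.4488 then answers part (e).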
