Question

The PollutionNom.csv dataset provides the age-adjusted mortality rate per 100,000 people for 60 locations. Additional climate and demographic information for each location is available as well. Refer to the list below for the attributes and their summary descriptions.

Population Data Set Descriptors:

Precipitation: Average annual precipitation in inches
JanuaryF: Average January temperature in degrees F
JulyF: Average July temperature in degrees F
>65: % of population aged 65 or older
Household: Average household size
Education: Median school years completed by those over 22
Housing: % of housing units which are sound & with all facilities
Density: Population per sq. mile in urbanized areas
NonWhite: % non-white population in urbanized areas
WhiteCollar: % employed in white collar occupations
LowIncome: % of families with income < $3000
HC: Relative hydrocarbon pollution potential
NOX: Relative nitric oxides pollution potential
SO2: Relative sulphur dioxide pollution potential
Humidity: Annual average % relative humidity at 1pm
Mortality: Total age-adjusted mortality rate per 100,000

Use Weka to answer the following questions. (Always use "Use training set" option for testing).

Clustering

1) Perform SimpleKMeans clustering with default parameters (2 clusters). How would you describe the two clusters based on the attribute characteristics? Interpret how the identified clusters are different based on average attribute values. Which attributes were more important to differentiate the clusters?

2) Perform SimpleKMeans clustering with three clusters. How would you describe the three clusters based on the attribute characteristics? Discuss which subsets of the population each cluster represents.
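One way to approach the "which attributes matter" part of the clustering questions is to compare the cluster centroids attribute by attribute on a standardized scale, so that attributes measured in large units do not dominate. The sketch below is plain Python rather than Weka: it uses a hand-rolled k-means, not SimpleKMeans itself, and runs on six invented rows covering three of the attributes. None of the numbers come from PollutionNom.csv, and the helper names are hypothetical.

```python
import random
from statistics import mean, pstdev


def standardize(rows):
    """Z-score every column so attributes are on a comparable scale."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # guard against constant columns
    return [[(x - m) / s for x, m, s in zip(r, mus, sds)] for r in rows]


def kmeans(rows, k, iters=50, seed=10):
    """Plain Lloyd's k-means; a sketch, not Weka's implementation."""
    rng = random.Random(seed)
    centroids = [list(r) for r in rng.sample(rows, k)]
    assign = None
    for _ in range(iters):
        # Assign each row to its nearest centroid (squared Euclidean distance).
        new_assign = [
            min(range(k),
                key=lambda c: sum((x - m) ** 2 for x, m in zip(r, centroids[c])))
            for r in rows
        ]
        # Move each centroid to the mean of its current members.
        for c in range(k):
            members = [r for r, a in zip(rows, new_assign) if a == c]
            if members:
                centroids[c] = [mean(col) for col in zip(*members)]
        if new_assign == assign:  # assignments stopped changing
            break
        assign = new_assign
    return centroids, assign


# Invented toy data (NOT from PollutionNom.csv): two visibly different groups.
names = ["Precipitation", "Education", "SO2"]
rows = [
    [10.0, 12.0, 5.0], [12.0, 12.1, 8.0], [11.0, 11.9, 6.0],
    [45.0, 10.0, 60.0], [48.0, 10.2, 70.0], [46.0, 9.9, 65.0],
]

z = standardize(rows)
centroids, assign = kmeans(z, k=2)

# Rank attributes by the gap between the two standardized centroids.
gaps = sorted(((abs(a - b), n) for n, a, b in zip(names, *centroids)), reverse=True)
for gap, name in gaps:
    print(f"{name}: centroid gap = {gap:.2f} standard deviations")
```

Attributes with the largest gaps are the ones that separate the clusters most; in Weka the same comparison can be made by reading the centroid table in the clusterer output against each attribute's overall spread.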

Neural Networks

1) Perform neural network analysis (MultilayerPerceptron) with two hidden nodes ("hiddenLayers"=2, which in Weka specifies a single hidden layer containing two nodes). What is the overall prediction accuracy? Identify the attributes that significantly impact each of the two hidden nodes. How would you characterize these two hidden factors identified by the neural network analysis?

2) Repeat the same analysis with three hidden nodes. What is the new prediction accuracy? Interpret the confusion matrix. Why do you think the accuracy is different? Identify the attributes that significantly impact each of the three hidden nodes. How would you characterize these three hidden factors identified by the neural network analysis?
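The two numeric tasks in the neural network questions, reading accuracy off a confusion matrix and finding the inputs that weigh most on each hidden node, can be sketched in a few lines of Python. Everything in the block is an assumption for illustration: the matrix counts, the two-class layout, and the weights are invented, and the function names are hypothetical. The matrix is laid out with rows as actual classes and columns as predicted classes.

```python
def accuracy(matrix):
    """Share of instances on the diagonal (actual class == predicted class)."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


def per_class_recall(matrix):
    """For each actual class, the fraction of its instances predicted correctly."""
    return [row[i] / sum(row) if sum(row) else 0.0 for i, row in enumerate(matrix)]


def top_inputs(weights, n=2):
    """Inputs with the largest absolute weight into one hidden node."""
    return sorted(weights, key=lambda name: abs(weights[name]), reverse=True)[:n]


# Invented confusion matrix for 60 instances and two hypothetical classes.
matrix = [
    [25, 5],   # actual class A: 25 predicted A, 5 predicted B
    [3, 27],   # actual class B: 3 predicted A, 27 predicted B
]
acc = accuracy(matrix)
recall = per_class_recall(matrix)
print(f"accuracy = {acc:.3f}, recall per class = {recall}")

# Invented input weights for two hidden nodes (not real model output).
hidden_nodes = {
    "node 1": {"SO2": 2.1, "NOX": 1.7, "Education": -0.3, "Precipitation": 0.2},
    "node 2": {"SO2": 0.1, "NOX": -0.2, "Education": -1.9, "LowIncome": 1.4},
}
for node, weights in hidden_nodes.items():
    print(node, "is driven mostly by", top_inputs(weights))
```

With weights like these, one would label the first node a pollution factor and the second a socioeconomic factor. Comparing raw weight magnitudes is only meaningful when the inputs are on a common scale, so it is worth confirming that attribute normalization is switched on in the classifier options before interpreting them this way.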
