Question
CLUSTERING
You have collected a significant number of tweets from a particular geographic region of interest and are interested in developing an algorithm to illuminate differences that may exist within your data set. You know that some of the tweets are from known professional football (soccer) players, and you hope to demonstrate that a clustering technique can be used to highlight these differences within the larger data set. You've engineered quantifiable features from your data, which you intend to use to build a supervised clustering algorithm.
User ID | Feature 1 | Feature 2 | Feature 3 | Footballer (1 = Yes, 0 = No) |
001 | 8 | 22 | 62 | 1 |
002 | 15 | 51 | 85 | 0 |
003 | 9 | 44 | 121 | 0 |
004 | 8 | 51 | 136 | 0 |
005 | 8 | 20 | 93 | 1 |
006 | 15 | 64 | 124 | 0 |
007 | 14 | 56 | 101 | 0 |
008 | 5 | 10 | 80 | 1 |
009 | 5 | 18 | 73 | 1 |
0010 | 9 | 26 | 79 | 1 |
Consider the above data set. You have determined the three features that you believe have the greatest correlation with football status.
a) Perform one iteration of k-means clustering on the above data set. Show all work. Use the centroid coordinates (10, 20, 80) and (10, 50, 110), corresponding to (Feature 1, Feature 2, Feature 3), as your initial best-guess cluster centers.
b) How well did your algorithm cluster football players versus non-football players? Construct a confusion matrix and calculate the Matthews Correlation Coefficient.
c) You selected three features to use in this computation because you determined that they are the three most correlated features with football status. While adding additional features up to a certain point will enhance clustering model accuracy, adding too many features diminishes accuracy. Explain why this is true.
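The arithmetic asked for in parts a) and b) can be checked with a short script. The following is a minimal sketch, not a reference solution: it assumes squared Euclidean distance for the assignment step, and it assumes that the cluster seeded at (10, 20, 80) is the one read as "footballer" when building the confusion matrix. The function names (`kmeans_one_iteration`, `confusion_counts`, `matthews_corrcoef`) are illustrative and do not come from the question.

```python
import math

# Rows copied from the table: (user_id, feature_1, feature_2, feature_3, footballer)
DATA = [
    ("001", 8, 22, 62, 1),
    ("002", 15, 51, 85, 0),
    ("003", 9, 44, 121, 0),
    ("004", 8, 51, 136, 0),
    ("005", 8, 20, 93, 1),
    ("006", 15, 64, 124, 0),
    ("007", 14, 56, 101, 0),
    ("008", 5, 10, 80, 1),
    ("009", 5, 18, 73, 1),
    ("0010", 9, 26, 79, 1),
]

# Initial centroids given in part a), as (Feature 1, Feature 2, Feature 3).
INITIAL_CENTROIDS = [(10, 20, 80), (10, 50, 110)]


def squared_distance(p, q):
    """Squared Euclidean distance (assumed metric; the question does not name one)."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans_one_iteration(points, centroids):
    """Run one assignment step followed by one centroid-update step."""
    # Assignment: each point goes to the index of its nearest centroid.
    labels = [
        min(range(len(centroids)), key=lambda k: squared_distance(p, centroids[k]))
        for p in points
    ]
    # Update: each centroid becomes the mean of its assigned points.
    new_centroids = []
    for k, old in enumerate(centroids):
        members = [p for p, lab in zip(points, labels) if lab == k]
        if not members:
            new_centroids.append(tuple(float(v) for v in old))  # keep an empty cluster's centroid
        else:
            new_centroids.append(tuple(sum(col) / len(members) for col in zip(*members)))
    return labels, new_centroids


def confusion_counts(truth, predicted):
    """Return (TP, TN, FP, FN) with 1 as the positive class."""
    tp = sum(1 for t, p in zip(truth, predicted) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(truth, predicted) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(truth, predicted) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(truth, predicted) if t == 1 and p == 0)
    return tp, tn, fp, fn


def matthews_corrcoef(tp, tn, fp, fn):
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom  # 0.0 by convention


points = [row[1:4] for row in DATA]
truth = [row[4] for row in DATA]

labels, new_centroids = kmeans_one_iteration(points, INITIAL_CENTROIDS)

# Assumption: cluster 0 (seeded at (10, 20, 80)) is interpreted as "footballer".
predicted = [1 if lab == 0 else 0 for lab in labels]
tp, tn, fp, fn = confusion_counts(truth, predicted)
score = matthews_corrcoef(tp, tn, fp, fn)

for (user_id, *_), lab in zip(DATA, labels):
    print(f"user {user_id} -> cluster {lab + 1}")
print("updated centroids:", new_centroids)
print("TP, TN, FP, FN:", (tp, tn, fp, fn))
print("MCC:", score)
```

Because k-means itself does not name its clusters, the mapping from cluster index to "footballer" is a choice made after the fact; swapping the mapping flips the sign of the MCC without changing its magnitude.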