Question
Step 1: Use the code from Week 7 as a Starting Point
In this assignment, we will not be doing all the analysis as before, but much of the code from Week 7 can be used as a starting point. Do not be concerned with splitting the data into training and test sets. In the real world you would do that, but for this exercise it would only be an unnecessary complication.
Step 2: PCA Analysis
Use only the input variables. Do not use either of the target variables.
Use only the continuous variables. Do not use any of the flag variables.
Select at least of the continuous variables. It would be preferable if there were a theme to the variables selected.
Do a Principal Component Analysis (PCA) on the continuous variables.
Display the Scree Plot of the PCA analysis.
Using the Scree Plot, determine how many Principal Components you wish to use. Note, you must use at least two. You may decide to use more. Justify your decision. Note that there is no wrong answer. You will be graded on your reasoning, not your decision.
Print the weights of the Principal Components. Use the weights to tell a story on what the Principal Components represent.
Perform a scatter plot using the first two Principal Components. Do not color the dots. Leave them black.
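The PCA tasks above can be sketched in base R as follows. This is a minimal illustration, not the course's reference code: the data are synthetic, the four column names (loan_amount, mortgage_due, property_value, debt_to_income) are hypothetical stand-ins for whichever continuous inputs you select, and scaling the variables before PCA is an assumption rather than a stated requirement.

```r
# Synthetic stand-in data. The column names are hypothetical;
# replace df with your own continuous input variables.
set.seed(1)
n <- 300
df <- data.frame(
  loan_amount    = rlnorm(n, meanlog = 9.5,  sdlog = 0.5),
  mortgage_due   = rlnorm(n, meanlog = 11.0, sdlog = 0.6),
  property_value = rlnorm(n, meanlog = 11.4, sdlog = 0.5),
  debt_to_income = rnorm(n, mean = 34, sd = 8)
)

# PCA on centered, scaled inputs so that no variable dominates
# simply because of its units (an assumption of this sketch).
pca <- prcomp(df, center = TRUE, scale. = TRUE)

# Scree plot: variance captured by each principal component.
screeplot(pca, type = "lines", main = "Scree plot")

# Proportion of total variance explained by each component.
var_explained <- pca$sdev^2 / sum(pca$sdev^2)
print(round(var_explained, 3))

# Weights (loadings) of each variable on each component.
print(round(pca$rotation, 3))

# Scatter plot of the first two components, with black points.
scores <- as.data.frame(pca$x)
plot(scores$PC1, scores$PC2, col = "black", pch = 16,
     xlab = "PC1", ylab = "PC2", main = "First two principal components")
```

When reading the weights, variables with large loadings of the same sign on a component move together along it, which is usually the starting point for the "story" of what that component represents.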
Step 3: Cluster Analysis (Find the Number of Clusters)
Use the principal components from Step 2 for this step.
Using the methods presented in the lectures, complete a KMeans cluster analysis for N to at least N. Feel free to take the number higher.
Print a scree plot of the clusters and determine how many clusters would be optimum. Justify your decision.
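One common way to produce a scree plot for clusters is the elbow method: fit KMeans for a range of cluster counts and plot the total within-cluster sum of squares. The sketch below assumes that approach, since the lecture method is not shown here. The matrix pc_scores is a synthetic stand-in for the retained principal components, and the upper bound max_k is a hypothetical choice, not the value from the assignment.

```r
# Synthetic stand-in for the retained principal component scores.
set.seed(2)
pc_scores <- rbind(
  matrix(rnorm(200, mean = -2), ncol = 2),
  matrix(rnorm(200, mean =  0), ncol = 2),
  matrix(rnorm(200, mean =  2), ncol = 2)
)
colnames(pc_scores) <- c("PC1", "PC2")

# Hypothetical upper bound on the number of clusters to try.
max_k <- 10

# Total within-cluster sum of squares for each candidate k.
# nstart = 20 reruns KMeans from several random starts and keeps the best.
wss <- sapply(1:max_k, function(k) {
  kmeans(pc_scores, centers = k, nstart = 20)$tot.withinss
})

# Scree (elbow) plot: look for the point where the curve flattens.
plot(1:max_k, wss, type = "b",
     xlab = "Number of clusters",
     ylab = "Total within-cluster sum of squares",
     main = "Scree plot of KMeans clusters")
```

The curve always falls as clusters are added, so the justification rests on where the improvement becomes marginal rather than on the minimum value.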
Step 4: Cluster Analysis
Using the number of clusters from Step 3, perform a cluster analysis using the principal components from Step 2.
Print the number of records in each cluster.
Print the cluster center points for each cluster.
Convert the KMeans clusters into "flexclust" clusters.
Print the barplot of the cluster. Describe the clusters from the barplot.
Score the training data using the flexclust clusters. In other words, determine which cluster each record is in.
Perform a scatter plot using the first two Principal Components. Color the plot by the cluster membership.
Add a legend to the plot.
Determine if the clusters predict loan default.
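The cluster analysis tasks above might be sketched as follows, assuming the flexclust package is installed. The principal component scores and the default_flag vector are synthetic and hypothetical, and k = 3 is an arbitrary choice for illustration. Comparing default rates across clusters is one simple way to approach the last item, not necessarily the method the course expects.

```r
library(flexclust)

# Synthetic stand-ins: two principal components and a hypothetical
# 0/1 loan default flag. Replace with your own data.
set.seed(3)
n <- 300
pc_scores <- cbind(
  PC1 = c(rnorm(n / 3, -2), rnorm(n / 3, 0), rnorm(n / 3, 2)),
  PC2 = rnorm(n)
)
default_flag <- rbinom(n, 1, 0.2)

k <- 3  # assumed number of clusters for this sketch
km <- kmeans(pc_scores, centers = k, nstart = 20)

# Number of records in each cluster, and the cluster centers.
print(km$size)
print(km$centers)

# Convert the kmeans result into a flexclust object.
kf <- as.kcca(km, pc_scores, save.data = TRUE)

# Barplot of the cluster centers.
barplot(kf)

# Score the data: assign each record to its nearest cluster.
cluster_id <- predict(kf, newdata = pc_scores)

# Scatter plot of the first two components, colored by cluster.
plot(pc_scores[, "PC1"], pc_scores[, "PC2"], col = cluster_id, pch = 16,
     xlab = "PC1", ylab = "PC2", main = "Clusters on PC1 and PC2")
legend("topright", legend = paste("Cluster", 1:k), col = 1:k, pch = 16)

# Default rate within each cluster. Large differences between clusters
# would suggest that membership carries information about default.
print(tapply(default_flag, cluster_id, mean))
```

Because the flag here is random, the printed default rates will be similar across clusters; with real data, a clear spread in those rates is what would indicate predictive value.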
Step 5: Describe the Clusters Using Decision Trees
Using the original data from Step 2, predict cluster membership using a Decision Tree.
Display the Decision Tree.
Using the Decision Tree plot, describe or tell a story of each cluster. Comment on whether the clusters make sense.
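A decision tree that predicts cluster membership from the original variables could look like the sketch below, which uses the rpart package. The data and column names are synthetic and hypothetical, the cluster labels are regenerated inside the block only so that it runs on its own, and the depth limit is an assumption made to keep the tree readable.

```r
library(rpart)

# Synthetic stand-in for the original continuous inputs.
# The column names are hypothetical.
set.seed(4)
n <- 300
df <- data.frame(
  loan_amount    = rlnorm(n, meanlog = 9.5,  sdlog = 0.5),
  mortgage_due   = rlnorm(n, meanlog = 11.0, sdlog = 0.6),
  property_value = rlnorm(n, meanlog = 11.4, sdlog = 0.5),
  debt_to_income = rnorm(n, mean = 34, sd = 8)
)

# Stand-in cluster labels: KMeans on the first two principal components.
# In the assignment these would come from the earlier cluster analysis.
pca <- prcomp(df, center = TRUE, scale. = TRUE)
km <- kmeans(pca$x[, 1:2], centers = 3, nstart = 20)
df$cluster <- factor(km$cluster)

# Classification tree: cluster membership as a function of the
# original variables. maxdepth = 3 keeps the plot easy to read.
tree <- rpart(cluster ~ ., data = df, method = "class",
              control = rpart.control(maxdepth = 3))

# Display the tree with class counts at each leaf.
plot(tree, uniform = TRUE, margin = 0.1)
text(tree, use.n = TRUE, cex = 0.8)

# Share of records whose cluster the tree reproduces.
pred <- predict(tree, df, type = "class")
print(mean(pred == df$cluster))
```

Each path from the root to a leaf reads as a rule in the original units, which is what makes the tree useful for describing a cluster in plain terms.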
Step 6: Comment
Discuss how you might use these clusters in a corporate setting.