Question
HW 9

... the grader can determine how you obtained your answer. If you choose to create a Word document, please save it as a PDF before uploading to Canvas. All R code and output should be included in your submission.

NOTE: In this assignment, I have not included reminders for every command you need to use, or every library you need to load. By this point in the semester, you have enough experience to seek out this information.

1. In this problem we will use the tumor data set.
(a) Read the data in and remove the first column, which corresponds to the diagnosis. We now have a dataset with only X variables.
(b) Conduct a PCA, being sure to center and scale the variables. Provide a plot of the variation explained by each PC. How many principal components do you think should be kept? (Hint: In R, pca.out = prcomp(tumor, scale=TRUE); pca.var = pca.out$sdev**2; pve = pca.var/sum(pca.var); plot(pve); plot(cumsum(pve)))
(c) How much of the total variation in the data is explained by 2 principal components? By 3?
(d) Print the principal component loadings, which describe how all of the variables combine to form the PC variables. Interpret the first 2 principal components. (Hint: See Lab 10.4 in the book and the interpretations in the second half of Section 10.2.1 in the ISLR text. In R, pca.out$rotation)
(e) Make a biplot of the first 2 principal components and interpret. (Hint: In R, biplot(pca.out, scale=0))

2. Now, we will use K-means and hierarchical clustering on the data.
(a) How many clusters do you think we should specify? Why?
(b) Conduct K-means clustering with 2 clusters. What are the cluster means? (Hint: In R, set.seed(1); km.out = kmeans(tumor, 2, nstart=20))
(c) Make a scatterplot matrix between all variables and color the points by cluster. Describe your findings. (Hint: In R, pairs(tumor, col=km.out$cluster))
(d) Using complete linkage, perform hierarchical clustering and color the points on the scatterplot matrix as in the previous question. Is the result much different? (Hint: In R, hc.complete = hclust(dist(tumor), method="complete"); cutree(hc.complete, 2))
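The hints in problems 1 and 2 can be strung together into one script. The following is a minimal sketch, not a worked solution: the tumor file is not available here, so the built-in USArrests data (the data set used in the ISLR lab the assignment cites) stands in for it, and none of its output applies to the tumor data. The file name tumor.csv in the comment and the variable hc.cluster are assumptions introduced for illustration.

```r
# Stand-in data so the sketch runs anywhere. For the assignment, replace this
# with the tumor data, for example (file name is hypothetical):
#   tumor <- read.csv("tumor.csv")[, -1]   # 1(a): drop the diagnosis column
tumor <- USArrests

# 1(b): PCA on centered and scaled variables (prcomp centers by default).
pca.out <- prcomp(tumor, center = TRUE, scale. = TRUE)
pca.var <- pca.out$sdev^2
pve <- pca.var / sum(pca.var)   # proportion of variance explained per PC
plot(pve, type = "b", ylim = c(0, 1),
     xlab = "Principal component", ylab = "Proportion of variance explained")
plot(cumsum(pve), type = "b", ylim = c(0, 1),
     xlab = "Principal component", ylab = "Cumulative proportion")

# 1(c): cumulative variation explained by the first 2 and first 3 PCs.
print(cumsum(pve)[2:3])

# 1(d): loadings, one column per principal component.
print(pca.out$rotation)

# 1(e): biplot of the first two PCs; scale = 0 draws arrows as raw loadings.
biplot(pca.out, scale = 0)

# 2(b): K-means with 2 clusters and 20 random starts, as in the hint.
set.seed(1)
km.out <- kmeans(tumor, centers = 2, nstart = 20)
print(km.out$centers)           # cluster means

# 2(c): scatterplot matrix colored by K-means cluster.
pairs(tumor, col = km.out$cluster)

# 2(d): complete-linkage hierarchical clustering, cut into 2 clusters.
hc.complete <- hclust(dist(tumor), method = "complete")
hc.cluster <- cutree(hc.complete, k = 2)   # hc.cluster is an illustrative name
pairs(tumor, col = hc.cluster)

# Cross-tabulate the two labelings to see how much they agree.
print(table(kmeans = km.out$cluster, hclust = hc.cluster))
```

Cluster labels are arbitrary, so the cross-tabulation can show the two methods agreeing even when cluster 1 in one labeling corresponds to cluster 2 in the other. The hints cluster the unscaled variables; whether to scale before clustering is a separate choice the assignment leaves open.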