Question

[...] the grader can determine how you obtained your answer. If you choose to create a Word document, please save it as a PDF before uploading to Canvas. All R code and output should be included in your submission.

NOTE: In this assignment, I have not included reminders for every command you need to use, or every library you need to load. By this point in the semester, you have enough experience to seek out this information.

1. In this problem we will use the tumor data set.
(a) Read the data in and remove the first column, which corresponds to the diagnosis. We now have a dataset with only X variables.
(b) Conduct a PCA, being sure to center and scale the variables. Provide a plot of the variation explained by each PC. How many principal components do you think should be kept? (Hint: In R, pca.out = prcomp(tumor, scale=TRUE); pca.var = pca.out$sdev^2; pve = pca.var/sum(pca.var); plot(pve); plot(cumsum(pve)))
(c) How much of the total variation in the data is explained by 2 principal components? By 3?
(d) Print the principal component loadings, which describe how all of the variables combine to form the PC variables. Interpret the first 2 principal components. (Hint: See Lab 10.4 in the book and the interpretations in the second half of Section 10.2.1 in the ISLR text. In R, pca.out$rotation)
(e) Make a biplot of the first 2 principal components and interpret. (Hint: In R, biplot(pca.out, scale=0))

2. Now, we will use K-means and hierarchical clustering on the data.
(a) How many clusters do you think we should specify? Why?
(b) Conduct K-means clustering with 2 clusters. What are the cluster means? (Hint: In R, set.seed(1); km.out = kmeans(tumor, 2, nstart=20))
(c) Make a scatterplot matrix between all variables and color the points by cluster. Describe your findings. (Hint: In R, pairs(tumor, col=km.out$cluster))
(d) Using complete linkage, perform hierarchical clustering and color the points on the scatterplot matrix as in the previous question. Is the result much different? (Hint: In R, hc.complete = hclust(dist(tumor), method="complete"); cutree(hc.complete, 2))
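
The hints in problems 1 and 2 can be assembled into one workflow. What follows is a minimal sketch, not a definitive solution. The tumor file is not included here, so the code builds a small synthetic stand-in (`tumor_raw`, with a made-up `diagnosis` column and hypothetical variable names) only so that it runs end to end. Swap in the real data before interpreting anything; the numbers this sketch produces say nothing about the actual assignment.

```r
# ASSUMPTION: synthetic stand-in for the tumor data, which is not shown here.
# Two groups of 50 rows with four hypothetical numeric variables.
set.seed(1)
n <- 50
tumor_raw <- data.frame(
  diagnosis = rep(c("B", "M"), each = n),
  radius = c(rnorm(n, 12, 1.5), rnorm(n, 18, 2.5)),
  texture = c(rnorm(n, 18, 3), rnorm(n, 22, 3)),
  perimeter = c(rnorm(n, 78, 10), rnorm(n, 115, 18)),
  smoothness = c(rnorm(n, 0.09, 0.01), rnorm(n, 0.10, 0.01))
)

# 1(a) Drop the first column (diagnosis) so only X variables remain.
tumor <- tumor_raw[, -1]

# 1(b) PCA. prcomp() centers by default; scale. = TRUE also standardizes.
# (The hint's scale=TRUE reaches the same argument by partial matching.)
pca.out <- prcomp(tumor, scale. = TRUE)
pca.var <- pca.out$sdev^2          # variance of each principal component
pve <- pca.var / sum(pca.var)      # proportion of variance explained
plot(pve, type = "b", xlab = "Principal component",
     ylab = "Proportion of variance explained")
plot(cumsum(pve), type = "b", xlab = "Principal component",
     ylab = "Cumulative proportion explained")

# 1(c) Cumulative variation explained by the first 2 and first 3 PCs.
print(cumsum(pve)[2:3])

# 1(d) Loadings: one column per PC, one row per original variable.
print(pca.out$rotation)

# 1(e) Biplot of the first two PCs.
biplot(pca.out, scale = 0)

# 2(b) K-means with 2 clusters and 20 random starts; centers are the cluster means.
set.seed(1)
km.out <- kmeans(tumor, 2, nstart = 20)
print(km.out$centers)

# 2(c) Scatterplot matrix colored by K-means cluster.
pairs(tumor, col = km.out$cluster)

# 2(d) Complete-linkage hierarchical clustering, cut into 2 clusters.
hc.complete <- hclust(dist(tumor), method = "complete")
hc.cluster <- cutree(hc.complete, 2)
pairs(tumor, col = hc.cluster)

# Cross-tabulate the two assignments to see how far they agree.
# Cluster labels are arbitrary, so agreement may appear off the diagonal.
print(table(kmeans = km.out$cluster, hclust = hc.cluster))
```

Both `kmeans(tumor, ...)` and `dist(tumor)` work on the raw variables here, as in the hints, so variables measured on larger scales carry more weight in the distances. Clustering on `scale(tumor)` instead is a possible variation, not something the assignment asks for.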


