Predictive Analytics Question (BS-BA-309)

Q3a. Identify the nominal and ordinal variables in the below dataset. Justify your answer. (4 marks)

                         Public   Number of     Number of     New       Percent of  Student
College                  or       applications  applications  students  faculty     faculty  Graduation
Name     State  Ranking  Private  received      accepted      enrolled  with PhD    ratio    rate
A        AK     low      Private  193           146           55        76          11.9     15
B        AK     high     Public   1852          1427          928       67          10
C        AK     low      Public   146           117           89        39          9.5      39
D        AK     medium   Public   206           1598          1162      48          13.7
E        AL     high     Public   2817          1920          984       53          14.3     40
F        AL     high     Private  345           320           179       52          32.8     55
G        AL     medium   Public   1351          892           570       72          18.9     51

Q3b. Considering only two variables (Percent of faculty with PhD and Graduation rate) from the dataset in Q3a, we get the covariance matrix (the diagonal elements are the variances of the variables) as follows:

                             Percent.of.faculty.withPHD  Graduation.rate
Percent.of.faculty.withPHD   304                         91
Graduation.rate              91                          357

What is the percentage of variance explained by each of the two variables? How much variation will we lose if we drop the variable Percent of faculty with PhD to reduce the dimension from 2D to 1D? Clearly show the calculation steps. (4 marks)

Q3c. We performed Principal Component Analysis (PCA) without normalization on the two variables from Q3b and got the following results:

Importance of components:
                          PC1     PC2
Standard deviation        20.615  15.370
Proportion of Variance    0.643   0.357
Cumulative Proportion     0.643   1.000

Weightings:
                             PC1   PC2
Percent.of.faculty.withPHD   0.6   0.8
Graduation.rate              0.8   -0.6

i. Comment on how PCA has done a better job in data reduction (from 2D to 1D) than dropping any one of the original variables. (3 marks)

ii. Calculate the value of the first and third observation (i.e., for College A and C) corresponding to PC1 (Given average Percent of faculty with PhD = 69.65 and average Graduation rate = 60.60). Clearly show the calculation steps. (5
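
The arithmetic behind Q3b and Q3c can be checked with a short Python sketch. It is not an official marking scheme. It assumes that "percentage of variance explained" by a variable means its variance divided by the total variance (the sum of the diagonal of the covariance matrix), and that a PC1 score is the weighted sum of the mean-centered values, which is why the question supplies the two averages. All variable and function names below are illustrative.

```python
# Q3b: variances are the diagonal entries of the covariance matrix.
var_phd = 304.0    # Percent of faculty with PhD
var_grad = 357.0   # Graduation rate
total_var = var_phd + var_grad  # 304 + 357 = 661

# Assumed definition: share of variance = variance / total variance.
share_phd = 100 * var_phd / total_var
share_grad = 100 * var_grad / total_var
print(f"PhD share of variance:       {share_phd:.2f}%")
print(f"Grad rate share of variance: {share_grad:.2f}%")
# Dropping Percent of faculty with PhD discards that variable's share.
print(f"Variation lost by dropping PhD: {share_phd:.2f}%")

# Q3c(i): variance kept by PC1 = (standard deviation of PC1)^2 / total.
pc1_share = 100 * 20.615 ** 2 / total_var
print(f"PC1 share of variance:       {pc1_share:.2f}%")

# Q3c(ii): PC1 weightings and the averages given in the question.
W_PHD, W_GRAD = 0.6, 0.8
MEAN_PHD, MEAN_GRAD = 69.65, 60.60


def pc1_score(phd, grad):
    """PC1 score, assuming the data were centered but not scaled."""
    return W_PHD * (phd - MEAN_PHD) + W_GRAD * (grad - MEAN_GRAD)


score_a = pc1_score(76, 15)  # College A: PhD = 76, graduation rate = 15
score_c = pc1_score(39, 39)  # College C: PhD = 39, graduation rate = 39
print(f"PC1 score, College A: {score_a:.2f}")
print(f"PC1 score, College C: {score_c:.2f}")
```

Under these assumptions, PC1 keeps about 64.3% of the total variance, while keeping the single best original variable (Graduation rate) keeps about 54.0%. That comparison is the basis for the comment asked for in Q3c(i).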