Question
In this assignment, we will examine the "Pokemon" and "Credit card" datasets described in the "Dataset Descriptions" document under the "Datasets" section of the unit webpage. Before beginning the assignment, carefully read through the descriptions of these datasets and their variables.

Note that the file CreditCard.csv is large (150 MB) compared to previous datasets, so it is strongly recommended that you save this file locally and read from a file path (rather than from a URL). It is also strongly recommended that, for analyses involving cross-validation applied to this dataset, you initially use a small number of repetitions to test your code before applying the requested number of replications.

1. [21 marks] Consider the "Pokemon" dataset, which includes a variety of information on Pokémon appearing in the Pokemon series of video games. We will restrict our focus to key statistics compiled for these Pokemon that relate to their effectiveness in battle. Carry out a principal component analysis consisting of the variables:

HIT POINTS
ATTACK
SPECIAL ATTACK
DEFENSE
SPECIAL DEFENSE
SPEED
TOTAL

(Note that you are not expected to assess the assumptions underlying principal component analysis.)

(a) Consider the eigenvalues for the principal component analysis.
    i. How many principal components would you select if using the "elbow" method? (2 marks)
    ii. How many principal components would you select if attempting to account for 90% of total variation? (2 marks)
    iii. How many principal components would you select if using 1 as a cut-off? (2 marks)
    (Be sure to provide clear evidence to support your answers.)
(b) What is the value for the last eigenvalue? Thinking carefully about the variables used in the principal component analysis, why does it make sense that the last eigenvalue would have this value? (3 marks)
(c) Produce a biplot of the third and fourth principal components. What variables load similarly onto these principal components? (3 marks)
(d) What two variables load most heavily on the second principal component, and what are their percentage contributions to that principal component? (3 marks)

Now consider a linear discriminant analysis that attempts to identify the type of the Pokemon (as measured in the variable TYPE 1) using:

HIT POINTS
ATTACK
SPECIAL ATTACK
DEFENSE
SPECIAL DEFENSE
SPEED

(e) What percentage of the separation achieved between the different types of Pokemon is attributable to the first linear discriminant function? (2 marks)
(f) What is the first discriminant function? (2 marks)
(g) What is the hit rate? (2 marks)
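
The question does not name a software package, so the following is only a minimal sketch in Python with NumPy (an assumption) of the quantities parts (a), (b), (d) and (e) ask for. It runs on synthetic stand-in data, not the real Pokemon file: the six statistics are random numbers, the three groups standing in for TYPE 1 are made up, and TOTAL is assumed to be the exact sum of the other six variables. All variable names (`stats`, `labels`, `group_shift`, and so on) are hypothetical, and the printed numbers say nothing about the actual dataset.

```python
import numpy as np

# Hypothetical stand-in data, NOT the real Pokemon file: six random "battle
# statistics" for 300 rows, with three made-up groups in place of TYPE 1.
rng = np.random.default_rng(0)
n_rows = 300
labels = rng.integers(0, 3, size=n_rows)
group_shift = np.array([[0, 0, 0, 0, 0, 0],
                        [15, 10, 0, 0, -5, 5],
                        [-10, 0, 20, 5, 10, 0]], dtype=float)
stats = rng.normal(70.0, 20.0, size=(n_rows, 6)) + group_shift[labels]
total = stats.sum(axis=1, keepdims=True)  # TOTAL assumed to be the exact sum
X = np.hstack([stats, total])             # seven columns, as in the PCA

# PCA on standardized variables: eigendecomposition of the correlation matrix.
corr = np.corrcoef(X, rowvar=False)
eigenvalues, eigenvectors = np.linalg.eigh(corr)
order = np.argsort(eigenvalues)[::-1]     # largest eigenvalue first
eigenvalues = eigenvalues[order]
eigenvectors = eigenvectors[:, order]

# (a)(ii): smallest number of components whose cumulative share reaches 90%.
cumulative = np.cumsum(eigenvalues) / eigenvalues.sum()
n_for_90 = int(np.searchsorted(cumulative, 0.90)) + 1
# (a)(iii): number of eigenvalues above the cut-off of 1.
n_above_1 = int((eigenvalues > 1.0).sum())
# (d): percentage contribution of each variable to PC2 (squared unit loadings).
contrib_pc2 = 100.0 * eigenvectors[:, 1] ** 2

# LDA: eigenvalues of W^{-1} B, with W within-group and B between-group scatter.
grand_mean = stats.mean(axis=0)
W = np.zeros((6, 6))
B = np.zeros((6, 6))
for g in np.unique(labels):
    rows = stats[labels == g]
    centered = rows - rows.mean(axis=0)
    W += centered.T @ centered
    diff = (rows.mean(axis=0) - grand_mean)[:, None]
    B += len(rows) * (diff @ diff.T)
lda_eigs = np.sort(np.linalg.eigvals(np.linalg.solve(W, B)).real)[::-1]
n_functions = min(6, len(np.unique(labels)) - 1)
# (e): share of between-group separation carried by the first function.
proportion_ld1 = lda_eigs[0] / lda_eigs[:n_functions].sum()

print("eigenvalues:", np.round(eigenvalues, 4))
print("components for 90%:", n_for_90, "| eigenvalues > 1:", n_above_1)
print("PC2 contributions (%):", np.round(contrib_pc2, 2))
print("share of separation on LD1:", round(float(proportion_ld1), 4))
```

Under the stated assumption that TOTAL is an exact linear combination of the other six columns, the correlation matrix is singular, so the smallest eigenvalue in this sketch is zero up to floating-point error. The elbow method in (a)(i), the biplot in (c), the discriminant coefficients in (f) and the hit rate in (g) are not covered here; they need a plot or a fitted classifier on the real data.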