Question

1. In order to derive required drug dosages, Milner and Rougier (2014) recorded a number of variables on a cohort of 544 Kenyan donkeys. The data are available on Brightspace as Donkeys.csv.

Variable   Measurement scale
Girth      cm
Height     cm
Length     cm
Weight     kg
Age        < 2, 2-5, 5-10, 10-15, 15-20, > 20 (years)
BCS        Body condition score: 1 (emaciated) to 3 (healthy) to 5 (obese), in steps of 0.5
Sex        Female, gelding, stallion

(a) Use suitable plots to illustrate each of the variables in the donkey data set. Comment on the distributions of the variables, and on the outlying donkey. Remove the outlying donkey from the dataset.

(b) A principal components analysis is employed to reduce the dimension of the donkey data. Which variables should be used in such an analysis?

(c) Would you advise performing principal components analysis on the correlation matrix or the covariance matrix of the appropriate donkey variables? Explain your reasoning.

(d) Write your own function that could be used to apply principal components analysis to a multivariate data set. You should not use any inbuilt PCA functions that are available in R, but should derive the method from first principles and write your own code to implement the method accordingly. Your function should output objects that would be of interest to someone using your function.

(e) Set the seed in R to your student number. Randomly sample 5 values between 1 and 500, and remove the corresponding donkeys from the data set. All subsequent analyses in question 1 should be conducted on this version of the dataset. From the output of the application of your own PCA function to the appropriate variables, how many principal components are required to summarise the donkey data? Use suitable plot(s) to motivate your decision.

(f) Interpret the first column of the loadings matrix resulting from the application of your PCA function to the appropriate variables from the modified donkeys data (from 1(e)).

(g) Plot the first principal component scores of the donkeys resulting from the application of your PCA function to the appropriate variables from the modified donkeys data (from 1(e)). Why is such a plot useful in PCA? Comment on the principal component scores in the context of the available data.

(h) The jackknife is one method that could be used to validate the principal components solution. Detail in your own words how the method works. Write your own code to implement the method. Use your code to validate the results obtained from applying your PCA function to the appropriate variables from the modified donkeys data (from 1(e)).
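
As a rough illustration of the first-principles approach that parts (d) and (h) ask for, the sketch below computes principal components from the eigendecomposition of the correlation (or covariance) matrix, then refits the analysis with each observation left out in turn. It is written in Python with NumPy rather than R, so it is not an answer to the question as set. The function names `pca` and `jackknife_variances`, the returned dictionary keys, the sign convention and the synthetic data `X_demo` are all assumptions made for illustration, and the leave-one-out summary shown is only one possible reading of jackknife validation.

```python
import numpy as np


def pca(X, use_correlation=True):
    """PCA from the eigendecomposition of the sample correlation/covariance matrix.

    Hypothetical helper for illustration; no built-in PCA routine is used.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    centred = X - X.mean(axis=0)
    if use_correlation:
        # Standardising first makes the matrix below the correlation matrix.
        centred = centred / X.std(axis=0, ddof=1)
    S = centred.T @ centred / (n - 1)

    # eigh is for symmetric matrices and returns eigenvalues in ascending order.
    eigvals, eigvecs = np.linalg.eigh(S)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    # Assumed sign convention: largest-magnitude loading in each column is positive.
    p = eigvecs.shape[1]
    signs = np.sign(eigvecs[np.abs(eigvecs).argmax(axis=0), np.arange(p)])
    eigvecs = eigvecs * signs

    return {
        "variances": eigvals,                     # variance of each component
        "proportion": eigvals / eigvals.sum(),    # share of total variance
        "loadings": eigvecs,                      # columns are the components
        "scores": centred @ eigvecs,              # data projected onto components
    }


def jackknife_variances(X, use_correlation=True):
    """Refit PCA with each row removed; return mean and jackknife SE of the variances."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    loo = np.array([
        pca(np.delete(X, i, axis=0), use_correlation)["variances"]
        for i in range(n)
    ])
    mean_loo = loo.mean(axis=0)
    se = np.sqrt((n - 1) / n * ((loo - mean_loo) ** 2).sum(axis=0))
    return mean_loo, se


# Synthetic stand-in data (NOT the donkey measurements): three correlated
# columns that share a common factor, plus one independent column.
rng = np.random.default_rng(0)
base = rng.normal(size=(50, 1))
X_demo = base * np.array([1.0, 1.0, 1.0, 0.0]) + 0.5 * rng.normal(size=(50, 4))

result = pca(X_demo, use_correlation=True)
mean_loo, se = jackknife_variances(X_demo, use_correlation=True)
print("proportion of variance:", np.round(result["proportion"], 3))
print("jackknife SE of variances:", np.round(se, 3))
```

Small jackknife standard errors relative to the component variances would suggest that the solution does not hinge on any single observation. A scree plot of `result["proportion"]` is the usual companion when deciding how many components to keep.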
