Question
protein | casein | scc | conc_fed |
3.91 | 3.05 | 24 | 3 |
3.82 | 3 | 45 | 3 |
4.03 | 3.18 | 44 | 3 |
3.91 | 3.11 | 29 | 3 |
4 | 3.14 | 208 | 3 |
4.04 | 3.19 | 95 | 2 |
4.11 | 3.19 | 71 | 2 |
4.98 | 4 | 14 | 2 |
3.76 | 2.89 | 37 | 2 |
4.35 | 3.4 | 27 | 2 |
3.77 | 2.99 | 78 | 0 |
3.7 | 2.88 | 27 | 3 |
3.64 | 2.78 | 81 | 2 |
3.41 | 2.64 | 654 | 3 |
3.87 | 3.01 | 45 | 2 |
4.38 | 3.43 | 283 | 2 |
3.94 | 3.08 | 31 | 2 |
4.18 | 3.16 | 176 | 2 |
4.62 | 3.64 | 72 | 2 |
3.28 | 2.5 | 124 | 2 |
4.14 | 3.25 | 26 | 0 |
4.14 | 3.29 | 11 | 2 |
3.88 | 2.99 | 75 | 2 |
4.26 | 3.33 | 64 | 2 |
4.25 | 3.36 | 62 | 2 |
4.17 | 3.24 | 93 | 2 |
3.82 | 3 | 20 | 0 |
4.3 | 3.34 | 27 | 2 |
3.72 | 2.91 | 13 | 2 |
4 | 3.1 | 563 | 2 |
3.88 | 2.99 | 762 | 2 |
3.79 | 2.97 | 15 | 0 |
4.12 | 3.2 | 36 | 2 |
3.54 | 2.7 | 72 | 0 |
Exploratory Data Analysis (35 marks):
For each question in the EDA section, please provide the lines of R code required to produce your results.
a) Using a boxplot, a histogram and descriptive statistics (mean, min, max, median, and quantiles),
describe the distribution of the somatic cell count scc. (5 marks)
b) Using a boxplot, a histogram and descriptive statistics (mean, min, max, median, and quantiles),
describe the distribution of the log of the somatic cell count scc. (5 marks)
c) Using a boxplot, a histogram and descriptive statistics (mean, min, max, median, and quantiles),
describe the distribution of the protein level protein. (5 marks)
d) Using a boxplot, a histogram and descriptive statistics (mean, min, max, median, and quantiles),
describe the distribution of the casein level casein. (5 marks)
e) Convert the categorical variable conc_fed to a factor. Describe and illustrate the frequencies and
proportions of the categorical variable concentrate feed conc_fed. (5 marks)
f) Using descriptive statistics (mean, standard deviation, median, mad (median absolute deviation
from the median), minimum, maximum, skew and standard error) and a boxplot, describe how the log
of the somatic cell count scc varies with respect to the variable concentrate feed conc_fed. (5 marks)
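One possible way to approach parts a) to f) in base R is sketched below. It is not an official solution. It rests on several assumptions: the data frame name milk and the helpers describe_numeric, skew and group_stats are hypothetical names introduced here, only the 34 rows shown in the table above are used, "log" is taken to mean the natural logarithm, and skew is one common moment-based definition. The list of statistics in part f) resembles the output of describeBy() from the psych package, which could be used instead; its skew estimate and rounding may differ slightly from the values computed here.

```r
# Data transcribed from the 34 rows shown in the question's table.
milk <- data.frame(
  protein = c(3.91, 3.82, 4.03, 3.91, 4, 4.04, 4.11, 4.98, 3.76, 4.35, 3.77, 3.7,
              3.64, 3.41, 3.87, 4.38, 3.94, 4.18, 4.62, 3.28, 4.14, 4.14, 3.88,
              4.26, 4.25, 4.17, 3.82, 4.3, 3.72, 4, 3.88, 3.79, 4.12, 3.54),
  casein = c(3.05, 3, 3.18, 3.11, 3.14, 3.19, 3.19, 4, 2.89, 3.4, 2.99, 2.88,
             2.78, 2.64, 3.01, 3.43, 3.08, 3.16, 3.64, 2.5, 3.25, 3.29, 2.99,
             3.33, 3.36, 3.24, 3, 3.34, 2.91, 3.1, 2.99, 2.97, 3.2, 2.7),
  scc = c(24, 45, 44, 29, 208, 95, 71, 14, 37, 27, 78, 27, 81, 654, 45, 283, 31,
          176, 72, 124, 26, 11, 75, 64, 62, 93, 20, 27, 13, 563, 762, 15, 36, 72),
  conc_fed = c(3, 3, 3, 3, 3, 2, 2, 2, 2, 2, 0, 3, 2, 3, 2, 2, 2, 2, 2, 2, 0, 2,
               2, 2, 2, 2, 0, 2, 2, 2, 2, 0, 2, 0)
)

# Hypothetical helper for parts a) to d): prints the descriptive statistics
# and draws a boxplot and a histogram for one numeric variable.
describe_numeric <- function(x, label) {
  print(summary(x))   # min, 1st quartile, median, mean, 3rd quartile, max
  print(quantile(x))  # 0%, 25%, 50%, 75%, 100% quantiles
  boxplot(x, main = paste("Boxplot of", label), ylab = label)
  hist(x, main = paste("Histogram of", label), xlab = label)
  invisible(summary(x))
}

# Assumption: natural log; log10() would be an equally defensible choice.
milk$log_scc <- log(milk$scc)

describe_numeric(milk$scc, "scc")            # part a)
describe_numeric(milk$log_scc, "log(scc)")   # part b)
describe_numeric(milk$protein, "protein")    # part c)
describe_numeric(milk$casein, "casein")      # part d)

# Part e): convert conc_fed to a factor, then tabulate and plot it.
milk$conc_fed <- factor(milk$conc_fed)
freq <- table(milk$conc_fed)   # frequencies per feed level
prop <- prop.table(freq)       # proportions per feed level
print(freq)
print(round(prop, 3))
barplot(freq, main = "Concentrate feed (conc_fed)",
        xlab = "conc_fed", ylab = "Frequency")

# Part f): statistics of log(scc) within each conc_fed level.
# Moment-based sample skewness (one common definition, assumed here).
skew <- function(x) mean((x - mean(x))^3) / mean((x - mean(x))^2)^1.5

group_stats <- function(x) {
  c(n = length(x), mean = mean(x), sd = sd(x), median = median(x),
    mad = mad(x),   # stats::mad scales by 1.4826 by default
    min = min(x), max = max(x), skew = skew(x),
    se = sd(x) / sqrt(length(x)))
}

# One row per feed level, one column per statistic.
by_feed <- t(sapply(split(milk$log_scc, milk$conc_fed), group_stats))
print(round(by_feed, 3))

boxplot(log_scc ~ conc_fed, data = milk,
        main = "log(scc) by concentrate feed",
        xlab = "conc_fed", ylab = "log(scc)")
```

When describing each distribution, it helps to compare the mean with the median and to look at the whiskers and outliers of the boxplot. Comparing the output for scc with that for log(scc) shows how the log transformation changes the shape of the distribution.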