Question:
A British statistician and biologist introduced a data set which consists of 50 samples from each of two species of Iris (Iris setosa and Iris versicolor). Four features were measured from each sample: the length and the width of the sepals and petals, in centimetres. This data set is available as the Excel file "iris.xlsx" on Moodle in the "Assessment" block.
- a) Produce a histogram of the sepal length of Iris setosa and comment on the features of the distribution as shown by the histogram.
- b) Determine the five-number summary for the sepal widths of Iris setosa.
- c) Determine the five-number summary for the sepal widths of Iris versicolor.
- d) Draw a comparative box-plot for the sepal widths of the two species, ensuring that any outliers are noted and that all axes are well labelled.
- e) Comment on any similarities and differences (as shown in the comparative box-plot) in the sepal widths between the two species.
- f) Find the mean and standard deviation of the sepal width for each type of Iris in the samples.
- g) Compare the mean with the median for each type of Iris. What does each of these comparisons indicate about each distribution?
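For parts b), c), d), f) and g), the calculations might be sketched in Python as below. This is only an illustration under stated assumptions: the helper names `five_number_summary` and `describe` are hypothetical, the values in `toy_widths` are made up and are not taken from "iris.xlsx", outliers are flagged with the common 1.5 × IQR rule (the question does not say which rule to use), and the quartiles use Python's "inclusive" method, so other quartile conventions may give slightly different Q1 and Q3.

```python
import statistics


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) for a list of numbers."""
    data = sorted(values)
    # Assumption: "inclusive" quartiles; other conventions can differ slightly.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return data[0], q1, median, q3, data[-1]


def describe(values):
    """Collect the summary statistics asked for in parts b), c), f) and g)."""
    minimum, q1, median, q3, maximum = five_number_summary(values)
    iqr = q3 - q1
    # Assumption: the usual box-plot fences at 1.5 * IQR beyond the quartiles.
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    return {
        "five_number": (minimum, q1, median, q3, maximum),
        "outliers": [v for v in values if v < lower_fence or v > upper_fence],
        "mean": statistics.mean(values),
        "stdev": statistics.stdev(values),  # sample standard deviation (n - 1)
    }


# Made-up widths in centimetres, NOT the real iris measurements.
toy_widths = [2.0, 3.0, 3.1, 3.2, 3.3, 3.4, 3.5, 3.6, 4.8]
result = describe(toy_widths)
print(result)
```

With the real data, each species' sepal-width column would be passed to `describe` in place of `toy_widths`. A mean noticeably above the median suggests a right-skewed distribution, a mean below the median suggests left skew, and similar values suggest rough symmetry.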
Combining groups can change the correlation coefficient, R-squared and overall trend between variables.
- h) Draw a scatter plot and find the correlation coefficient and R-squared:
  - for the sepal width and sepal length of Iris setosa,
  - for the sepal width and sepal length of Iris versicolor, and
  - for the sepal width and sepal length of the combined samples of Iris setosa and Iris versicolor.
- i) State the effect of combining samples of Iris setosa and Iris versicolor on the correlation between the sepal width and sepal length.
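For parts h) and i), a minimal sketch of the correlation calculation is given below. The function name `pearson_r` and the two toy groups are assumptions made for illustration; the numbers are invented, not read from "iris.xlsx", and were chosen as an extreme case in which each group has a perfectly negative correlation while the combined data has a strongly positive one. The effect in the actual iris samples has to be computed from the file and may be much milder. For a simple linear regression with an intercept, R-squared equals the square of the correlation coefficient.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Invented toy groups (hypothetical values, not iris measurements).
group_a_x, group_a_y = [1, 2, 3], [3, 2, 1]
group_b_x, group_b_y = [11, 12, 13], [13, 12, 11]

r_a = pearson_r(group_a_x, group_a_y)
r_b = pearson_r(group_b_x, group_b_y)
r_combined = pearson_r(group_a_x + group_b_x, group_a_y + group_b_y)

for label, r in [("group A", r_a), ("group B", r_b), ("combined", r_combined)]:
    # R-squared of the least-squares line is r squared.
    print(f"{label}: r = {r:.3f}, R-squared = {r * r:.3f}")
```

Within each toy group the trend is downward, but the gap between the groups dominates once they are pooled, so the overall trend points upward. The same comparison on the real data shows whether pooling the two species strengthens, weakens or reverses the within-species relationship.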