Question 5

Emily walks every day, and she keeps a record of the number of miles she walks each day. The histogram and five-number summary below were created from the recorded miles for a random sample of 25 of the days Emily walked.

[Histogram: Frequency versus Miles Walked]

Minimum   Q1    Median   Q3    Maximum
1.4       2.6   3.25     3.8   7.5

On one of the 25 days in the sample, Emily walked 7.5 miles. From the histogram, it appears that the value 7.5 might be an outlier relative to the other values. Two methods are proposed for identifying an outlier in a set of data.

a. One method for identifying an outlier is to use the interquartile range (IQR). An outlier is any number that is greater than the upper quartile by at least 1.5 times the IQR or less than the lower quartile by at least 1.5 times the IQR. Does such a method identify the value of 7.5 miles as an outlier for Emily's set of data? Justify your answer.

Another method of identifying an outlier is to investigate whether there is evidence that a value might have come from a population with a mean different from the mean of the population of the other values.

Let X and Y represent random variables. X is distributed normally with mean $\mu_X$ and standard deviation $\sigma$, and Y is distributed normally with mean $\mu_Y$ and standard deviation $\sigma$. Consider 1 randomly selected value of Y and n - 1 randomly selected values of X.

b. Consider the difference $Y - \bar{X}$:

   i. In terms of $\mu_Y$ and $\mu_X$, what is the mean of the difference $Y - \bar{X}$?

   ii. In terms of n and $\sigma$, what is the standard deviation of the difference $Y - \bar{X}$?

Suppose that of the n = 25 recorded values from Emily's sample, the value of 7.5 comes from the distribution of Y and the remaining 24 values come from the distribution of X. The summary statistics for the 24 values that come from the distribution of X are given below.

n - 1 = 24     $\bar{x} = 3.171$     $s = 0.821$

c. Use the value of the potential outlier and the summary statistics of the remaining 24 values to estimate the mean and standard deviation of the difference $Y - \bar{X}$.

   The estimated mean of $Y - \bar{X}$:

   The estimated standard deviation of $Y - \bar{X}$:

Recall that a method for identifying an outlier is to investigate whether there is evidence that a value might have come from a population with a mean different from the mean of the population of the other values. The following hypotheses can be used for such an investigation.

$H_0: \mu_Y = \mu_X$
$H_a: \mu_Y \neq \mu_X$

d. Calculate the value of the test statistic used for evaluating the hypotheses.

e. The p-value for the hypothesis test described above is less than 0.0001. What conclusion can be made about the population means, and what conclusion can be made about identifying 7.5 as an outlier? Justify your answers.
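The arithmetic behind both outlier methods can be checked with a short script. This is a minimal sketch and not an official solution. It assumes that the X in the difference denotes the sample mean of the n - 1 values of X, that the sample standard deviation 0.821 stands in for the common population standard deviation, and that the test statistic is the estimated mean of the difference divided by its estimated standard deviation. All variable names are illustrative.

```python
import math

# Values taken from the question's five-number summary.
q1, q3 = 2.6, 3.8
candidate = 7.5  # the potential outlier

# Method 1: the 1.5 * IQR rule described in part (a).
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr  # about 5.6
flagged_by_iqr = candidate <= lower_fence or candidate >= upper_fence

# Method 2: difference between the single Y value and the mean of the X values.
n = 25
x_bar, s = 3.171, 0.821  # summary statistics of the remaining 24 values

# Assumption: the mean of the difference is estimated by 7.5 minus x_bar.
est_mean_diff = candidate - x_bar

# Assumption: Var(Y - Xbar) = sigma^2 + sigma^2 / (n - 1), with s estimating sigma.
est_sd_diff = s * math.sqrt(1 + 1 / (n - 1))

# Assumption: the hypothesized difference under H0 is 0.
test_stat = est_mean_diff / est_sd_diff

print(f"IQR = {iqr:.2f}, fences = ({lower_fence:.2f}, {upper_fence:.2f})")
print(f"Flagged by the IQR rule: {flagged_by_iqr}")
print(f"Estimated mean of the difference: {est_mean_diff:.3f}")
print(f"Estimated SD of the difference: {est_sd_diff:.3f}")
print(f"Test statistic: {test_stat:.2f}")
```

The script reports the computed quantities only. Interpreting them, as parts (a) and (e) ask, still requires a written justification in context.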