Question [100 points] This question covers statistical inference: the relationship between a population parameter and the sample average. Suppose you toss a coin and are interested in the probability of getting a head. Assuming the coin is not rigged, the random variable X, defined as 1 if the toss is a head and 0 if it is a tail, follows a Bernoulli distribution that takes the value 1 (head) with population probability p = 0.5.

1. [5 points] What is the probability of getting a tail? What is the expected value (population mean) of X? What about the population variance? Write down the formula first and then calculate the value.
2. [5 points] Generate a sample of N = 10 observations by randomly drawing from the coin experiment in R, i.e., {X1, X2, ..., XN}.
3. [5 points] Plot the histogram of your sample (hint: use the ggplot2 package).
4. [5 points] Estimate the sample average and the sample variance of X. Write down the formula first and then give the estimate. How does the sample average differ from the population average?
5. [10 points] Generate another sample of N = 10 and repeat 3 and 4. Compare the histograms, the sample averages, and the sample variances. Explain.
6. [10 points] Generate 100 samples of N = 10 observations and, for each sample, calculate the sample average. Then plot the sample averages (histogram/distribution of sample averages).
7. [10 points] What is the standard deviation of the sample average across these samples? What about the mean of the sample average across these samples?
8. [10 points] What is the formula for the standard deviation, as well as the mean, of the sample average?
9. [20 points] Repeat 6, 7 and 8, but with samples of N = 100 observations.
10. [10 points] How do the distribution (look at the histogram), the standard deviation, and the mean of the sample average differ between N = 10 and N = 100? How do they compare to the formula? Explain your results and the lessons learned. (Hint: The sampling distribution changes with the number of observations.)
11. [10 points] Let us use another statistic, which is equal to 0.6 regardless of the sample used. What is the sampling distribution of this second estimator? What about its sampling uncertainty (captured by the standard deviation of the estimator)? What about the expected value of the statistic? How far is the expected value from the true population parameter? Compare the sampling uncertainty of the sample average to that of this statistic. Compare the sampling expected value of the sample average to that of this new statistic. Is the new statistic "better" than the sample average at estimating the true population parameter?
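
As a possible starting point for the simulation items (2, 6, 7 and 9), here is a minimal R sketch. It is one way to set up the experiment, not the required solution. The helper names `draw_sample`, `sample_means` and `theory_sd`, the seed value, and the histogram bin width are all assumptions made for illustration. The plot is skipped if ggplot2 is not installed.

```r
set.seed(42)  # assumed seed, used only to make the run reproducible

p <- 0.5  # population probability of a head

# One sample of n coin tosses: 1 = head, 0 = tail (hypothetical helper)
draw_sample <- function(n, p = 0.5) rbinom(n, size = 1, prob = p)

# Sample averages of n_samples independent samples of size n (hypothetical helper)
sample_means <- function(n_samples, n, p = 0.5) {
  replicate(n_samples, mean(draw_sample(n, p)))
}

# Textbook standard deviation of the sample average: sqrt(p * (1 - p) / N)
theory_sd <- function(n, p = 0.5) sqrt(p * (1 - p) / n)

# A single sample of N = 10
x <- draw_sample(10, p)
x_bar <- mean(x)  # sample average
s2 <- var(x)      # sample variance (R divides by N - 1)

# 100 samples each, for N = 10 and N = 100
means_10 <- sample_means(100, 10, p)
means_100 <- sample_means(100, 100, p)

# Simulated versus theoretical moments of the sample average
summary_table <- data.frame(
  N = c(10, 100),
  mean_of_means = c(mean(means_10), mean(means_100)),
  sd_of_means = c(sd(means_10), sd(means_100)),
  theory_sd = c(theory_sd(10, p), theory_sd(100, p))
)
print(summary_table)

# Optional: histograms of the sample averages, one panel per N
if (requireNamespace("ggplot2", quietly = TRUE)) {
  plot_data <- data.frame(
    mean = c(means_10, means_100),
    N = factor(rep(c(10, 100), each = 100))
  )
  g <- ggplot2::ggplot(plot_data, ggplot2::aes(x = mean)) +
    ggplot2::geom_histogram(binwidth = 0.05) +
    ggplot2::facet_wrap(~N)
  print(g)
}
```

Because the draws are random, the simulated values in `summary_table` will change with the seed. The `theory_sd` column is the fixed benchmark to compare them against.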