Question
Assume that human height is normally distributed with mean 170 cm and standard deviation 7 cm.
(a) Simulate a small sample from the human worldwide population by drawing 100 observations from the underlying Gaussian distribution. Plot the true (underlying) distribution as well as the histogram of your sample (overlay them in the same plot). What is the average height of your sample?
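One way to approach part (a) is sketched below using only the Python standard library. The seed, the 5 cm bin width, and the 145 to 195 cm range are arbitrary illustrative choices, not part of the question. The sketch does not draw a figure: it tabulates the histogram density per bin next to the true density at the bin midpoint, which are the two curves a plotting library would overlay.

```python
import random
import statistics

MU, SIGMA, N = 170.0, 7.0, 100   # parameters given in the question
rng = random.Random(42)           # assumed seed, only for reproducibility

# Draw 100 heights from N(170, 7) and compute the sample average.
sample = [rng.gauss(MU, SIGMA) for _ in range(N)]
sample_mean = statistics.fmean(sample)

# True (underlying) distribution, used for its density function.
true_dist = statistics.NormalDist(MU, SIGMA)

# Histogram as a density so it is on the same scale as the pdf.
BIN_WIDTH = 5.0                                          # assumed bin width
left_edges = [145.0 + BIN_WIDTH * i for i in range(10)]  # covers 145..195 cm

rows = []
for left in left_edges:
    count = sum(left <= h < left + BIN_WIDTH for h in sample)
    hist_density = count / (N * BIN_WIDTH)
    pdf_at_mid = true_dist.pdf(left + BIN_WIDTH / 2)
    rows.append((left, hist_density, pdf_at_mid))

print(f"sample mean: {sample_mean:.2f} cm")
for left, hist_density, pdf_at_mid in rows:
    print(f"[{left:.0f}, {left + BIN_WIDTH:.0f})  hist={hist_density:.4f}  pdf={pdf_at_mid:.4f}")
```

Passing the `rows` columns to any plotting tool (bars for the histogram density, a line for the pdf) gives the requested overlay.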
(b) The average calculated in (a) is an estimate of the true mean of the worldwide population. Next, we would like to assess how confident we are that this estimate captures the true worldwide human height. Since we do not have the resources to sample additional individuals, we artificially sample the worldwide population by resampling (with replacement) our small sample. Doing so simulates a new sample of 100 observations. Each such "new" sample is called a "bootstrap iteration" or a "bootstrap sample."
(i) What is the probability of reselecting the exact same dataset randomly?
(ii) More generally, if we have a sample with n observations, what is the probability of resampling the exact same dataset after x iterations?
(iii) Calculate the probability in item (ii) for a dataset with 5 observations and for 1,000 iterations.
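The probabilities in part (b) can be sketched numerically under two stated assumptions that the question leaves open: the n observations are all distinct and order does not matter, so a single resample reproduces the dataset with probability n!/n^n; and "after x iterations" is read as "at least once in x independent iterations," giving 1 - (1 - n!/n^n)^x. The function names below are hypothetical.

```python
import math

def prob_same_dataset(n: int) -> float:
    """Probability that one resample (with replacement) of n distinct
    observations contains every original observation exactly once.
    There are n! favourable orderings out of n**n possible draws."""
    return math.factorial(n) / n**n

def prob_at_least_once(n: int, x: int) -> float:
    """Probability of getting the original dataset back at least once
    in x independent bootstrap iterations: 1 - (1 - p)**x.
    log1p/expm1 keep precision when p is extremely small."""
    p = prob_same_dataset(n)
    return -math.expm1(x * math.log1p(-p))

print(prob_same_dataset(100))        # single resample, n = 100
print(prob_same_dataset(5))          # single resample, n = 5
print(prob_at_least_once(5, 1000))   # n = 5, x = 1,000 iterations
```

Under these assumptions, a single resample of 5 observations succeeds with probability 120/3125 = 0.0384, so over 1,000 iterations the event is practically certain, whereas for n = 100 it remains negligible.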
(c) Resample your dataset of 100 people's heights (with replacement) for 1,000 iterations. Calculate each sample's average (i.e., your statistic of interest) and plot the distribution of averages from all iterations.
(d) Compute the standard deviation and a 95% confidence interval (CI) around your estimate. When calculating the CI, follow the two approaches described in class to obtain two distinct CIs.
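Parts (c) and (d) can be sketched as follows. The question does not say which two CI approaches were described in class; this sketch assumes they are the normal approximation (estimate ± 1.96 × bootstrap standard error) and the percentile method (2.5th and 97.5th percentiles of the bootstrap averages). The seed and the helper name `bootstrap_means` are illustrative assumptions.

```python
import random
import statistics

rng = random.Random(0)   # assumed seed, only for reproducibility

# Stand-in for the sample from part (a): 100 draws from N(170, 7).
sample = [rng.gauss(170.0, 7.0) for _ in range(100)]
estimate = statistics.fmean(sample)

def bootstrap_means(data, n_iter, rng):
    """Resample `data` with replacement n_iter times, keeping the same
    sample size, and return the average of each bootstrap sample."""
    n = len(data)
    return [statistics.fmean(rng.choices(data, k=n)) for _ in range(n_iter)]

boot = bootstrap_means(sample, 1000, rng)

# Standard deviation of the bootstrap averages, i.e. the standard error.
se = statistics.stdev(boot)

# Approach 1 (assumed): normal approximation around the estimate.
ci_normal = (estimate - 1.96 * se, estimate + 1.96 * se)

# Approach 2 (assumed): percentile interval. Splitting into 40 quantile
# groups puts the first and last cut points at 2.5% and 97.5%.
cuts = statistics.quantiles(boot, n=40)
ci_percentile = (cuts[0], cuts[-1])

print(f"estimate={estimate:.2f}  se={se:.3f}")
print(f"normal CI:     ({ci_normal[0]:.2f}, {ci_normal[1]:.2f})")
print(f"percentile CI: ({ci_percentile[0]:.2f}, {ci_percentile[1]:.2f})")
```

For part (f), the same helper can be called with `n_iter=50`; for part (e), the list of bootstrap averages would be replaced by averages of 1,000 fresh samples drawn directly from N(170, 7).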
(e) Now, suppose we have the resources to generate 1,000 genuine samples of the true underlying distribution. Simulate these samples and repeat items (c) and (d) to obtain a distribution and CIs of the sample mean. Compare the resulting distribution with the bootstrap distribution by overlaying the distribution plots.
(f) Compare the distribution of sample averages from samples of the true underlying distribution (as in (e)) to that of the bootstrap samples (as in (c)), but now with 50 bootstrap iterations instead of 1,000.
(g) Assume a multimodal distribution that is a mixture of the following three Gaussians: N(0, 1), N(10, 1), and N(3, 0.1), with mixing coefficients 0.3, 0.5, and 0.2, respectively. That is, each random number is drawn from the first Gaussian with probability 0.3, from the second with probability 0.5, and from the third with probability 0.2.
(i) Draw 1,000 samples of size 100 from the Gaussian mixture distribution and repeat part (e).
(ii) Choose one of the samples you generated and bootstrap it 1,000 times, as in part (c) (no need to calculate CIs).
(iii) Compare the "true" and bootstrap distributions of the sample mean.
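A minimal sketch of part (g) follows. It assumes the second parameter of each N(·, ·) is a standard deviation; if it denotes a variance instead, the third component would need sigma = sqrt(0.1). The seed and the names `COMPONENTS`, `WEIGHTS`, and `draw_mixture` are illustrative assumptions.

```python
import random
import statistics

# (mean, standard deviation) per component; sigma interpretation is assumed.
COMPONENTS = [(0.0, 1.0), (10.0, 1.0), (3.0, 0.1)]
WEIGHTS = [0.3, 0.5, 0.2]   # mixing coefficients from the question

def draw_mixture(n, rng):
    """Draw n values: first pick a component according to WEIGHTS,
    then draw from that component's Gaussian."""
    picks = rng.choices(COMPONENTS, weights=WEIGHTS, k=n)
    return [rng.gauss(mu, sigma) for mu, sigma in picks]

rng = random.Random(1)   # assumed seed, only for reproducibility

# (i) "True" sampling distribution: averages of 1,000 genuine samples of size 100.
true_means = [statistics.fmean(draw_mixture(100, rng)) for _ in range(1000)]

# (ii) Bootstrap distribution: resample one genuine sample 1,000 times.
one_sample = draw_mixture(100, rng)
boot_means = [statistics.fmean(rng.choices(one_sample, k=100)) for _ in range(1000)]

# (iii) Compare centre and spread of the two distributions of the sample mean.
print(f"true:      mean={statistics.fmean(true_means):.3f}  sd={statistics.stdev(true_means):.3f}")
print(f"bootstrap: mean={statistics.fmean(boot_means):.3f}  sd={statistics.stdev(boot_means):.3f}")
```

The mixture's theoretical mean is 0.3·0 + 0.5·10 + 0.2·3 = 5.6, so the genuine-sample averages should centre near 5.6, while the bootstrap averages centre on the average of the one sample that was resampled.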