Question
2 Explore Central Limit Theorem (50pt)

In this section you will see how the Central Limit Theorem (CLT) works. CLT states two things:

a) The means of a sample of random numbers tend to be normally distributed as the sample gets large.
b) The variance of the mean tends to be $\frac{1}{S} \mathrm{Var}\, X$, where S is the sample size and X is the random variable we are analyzing. (This is actually a property of expectation and independence, not really CLT. But CLT is closely related to this result.)

CLT, and how the variance and mean value change when the sample size increases, plays a very important role in computing confidence intervals later.

The task is structured in a way that you may want to create a function that takes in sample size S and outputs all needed results, including the histogram. There will be quite a bit of repetitive coding otherwise.

We start with a distribution that does not look at all normal. We create a random variable (RV) X that takes the value 1 with probability 0.5 and -1 with probability 0.5. (You can imagine we flip a fair coin and label heads as 1 and tails as -1.) One way to sample from such an RV is something like this:

    import numpy as np
    np.random.randint(0, 2, size=10)*2 - 1
    ## array([ 1,  1, -1, -1,  1, -1, -1, -1, -1, -1])

Detailed tasks:

1. (7pt) Calculate the expected value and variance of this random variable.
   Note: these are theoretical values and not related to any samples. If you use functions like mean or var here then you have misunderstood the concepts!
   Hint: read lecture notes 1.3.4 (Expected Value and Variance), and OpenIntro Statistics 3.4 (Random variables), in particular 3.4.2 (Variability). I recommend using the shortcut formula $\mathrm{Var}\, X = \mathrm{E}\, X^2 - (\mathrm{E}\, X)^2$.
2. (1pt) Choose your number of repetitions R. 1000 is a good number but you can also take 10,000 or 100,000 to get smoother histograms.
   Note: the number of repetitions R is not the same as the sample size S here. Below you will create a sample of size S, R times over. For instance, you will create a sample of size S = 5 a total of R = 1000 times. Please understand the difference, it is a frequent source of confusion!
3. (5pt) Create a vector of R random realizations of X. Make a histogram of those. Comment on the shape of the histogram.
   Note: in this case we have R = 1000 repetitions and samples are of size S = 1, as we look at individual realizations.
   Hint: it takes some tweaking to get nice histograms of discrete distributions. The simplest way is just to make many bars (most of which will be 0) by adding the argument bins=100 to plt.hist.
4. (2pt) Compute and report the mean and variance of the sample you created (just use np.mean and np.var).
   NB! Here we talk about sample mean and sample variance. Compare these numbers with the theoretical values computed in question 1 above.
   Hint: they should be fairly close.
5. (5pt) Now create R pairs of random realizations of X (i.e. sample size S = 2). For each pair, compute its mean. You should have R mean values. Make the histogram. What does it look like?
6. (6pt) Compute and report the mean of the R pair means, and the variance of the means.
   NB! We talk about sample mean and sample variance again, where the sample is your sample of R pair means.
7. (4pt) Compute the expected value and variance of the pair means, i.e. the theoretical concepts. This mirrors what you did in question 1. Compare the theoretical values with the sample values above. Are those fairly similar? Note that according to CLT, the variance of a pair mean should be just 1/2 of what you got in question 1, since for pairs S = 2.
8. (4pt) Now instead of pairs of random numbers, repeat this with 5-tuples of random numbers (i.e. S = 5 random numbers per repetition, and still the same R = 1000, or whatever you chose, repetitions in total). Compare the theoretical and sample versions of the mean and variance of the 5-tuple means. Are they similar?
   Do you spot any noticeable differences in the histogram compared to your previous histogram?
9. (3pt) Repeat with 25-tuples... (Also compute the expectation and theoretical variance, and compare those with the sample mean and sample variance.)
10. (3pt) ... and with 1000-tuples. Do not forget to compare with the theoretical results.
11. (2pt) Comment on the tuple size, and how the shape of the histogram changes when the tuple size increases.
12. (6pt) Explain why the histograms resemble a normal distribution as S grows. In particular, explain what happens when we move from single values S = 1 to pairs S = 2. Why did two equal peaks turn into a "III"-shaped histogram?
13. (2pt) Explain the difference between R and S. How does changing these values affect the histograms?
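
The task statement suggests wrapping the repeated work in a function that takes the sample size S. A minimal sketch of such a helper might look like the following. The names `sample_means`, `summarize`, `plot_means` and the `seed` parameter are hypothetical and not part of the assignment. The sketch uses `np.random.default_rng` instead of the `np.random.randint` call shown above, which is a substitution made here for reproducibility. The theoretical values assume X takes the values 1 and -1 with probability 0.5 each, so E X = 0 and Var X = 1.

```python
import numpy as np


def sample_means(S, R=1000, seed=None):
    """Draw R samples of size S from X in {-1, +1}; return the R sample means."""
    rng = np.random.default_rng(seed)
    # integers(0, 2) yields 0 or 1 with equal probability; map to -1 / +1
    x = rng.integers(0, 2, size=(R, S)) * 2 - 1
    # one mean per repetition (row)
    return x.mean(axis=1)


def summarize(S, R=1000, seed=None):
    """Compare sample mean/variance of the R means with the theoretical values."""
    means = sample_means(S, R, seed)
    return {
        "S": S,
        "R": R,
        "sample_mean": float(np.mean(means)),
        "sample_var": float(np.var(means)),  # np.var uses ddof=0 by default
        "theoretical_mean": 0.0,             # E X = 0.5*1 + 0.5*(-1)
        "theoretical_var": 1.0 / S,          # Var X / S, with Var X = 1
    }


def plot_means(S, R=1000, seed=None, bins=100):
    """Histogram of the R sample means (needs matplotlib)."""
    import matplotlib.pyplot as plt  # imported lazily so the rest runs without it

    means = sample_means(S, R, seed)
    plt.hist(means, bins=bins)
    plt.title(f"Means of {R} samples of size S = {S}")
    plt.xlabel("sample mean")
    plt.ylabel("count")
    plt.show()


if __name__ == "__main__":
    # the tuple sizes used in tasks 3 to 10
    for S in (1, 2, 5, 25, 1000):
        print(summarize(S, R=1000, seed=0))
```

Calling `plot_means(S)` for each tuple size gives the histograms the tasks ask for. Keeping the sampling in its own function means the same R-by-S draw logic serves both the summary numbers and the plots.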