Question
2. Make a simulation in R that shows the distribution of the t-test statistic when the null hypothesis is true. To start, use a for loop that repeatedly performs t-tests comparing the sample means of data drawn from distributions with the same population mean and standard deviation. Use rnorm() to take samples, t.test() to perform the t-tests, and "$statistic" to extract the test statistic from the t.test() output (e.g. t.test(x, y)$statistic). Make a histogram of the test statistics. If you need help, look back at the notes on for loops.
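One way the loop described in Question 2 might look is sketched below. The number of simulations (n_sims), the per-group sample size (n), the seed, and the mean and standard deviation passed to rnorm() are all assumptions chosen for illustration; the question does not specify them. The dashed curve is a Student t density with 2n - 2 degrees of freedom, added only as a rough visual reference.

```r
set.seed(1)                     # arbitrary seed, only for reproducibility
n_sims  <- 5000                 # number of simulated t-tests (assumed)
n       <- 20                   # size of each sample (assumed)
t_stats <- numeric(n_sims)      # preallocate storage for the statistics

for (i in seq_len(n_sims)) {
  # Both samples come from the same normal population, so the null is true
  x <- rnorm(n, mean = 0, sd = 1)
  y <- rnorm(n, mean = 0, sd = 1)
  t_stats[i] <- t.test(x, y)$statistic
}

# Density-scaled histogram of the simulated test statistics
hist(t_stats, breaks = 50, freq = FALSE,
     main = "t statistics when the null is true", xlab = "t statistic")

# Reference curve: t density with 2n - 2 degrees of freedom
curve(dt(x, df = 2 * n - 2), add = TRUE, lty = 2)
```

Preallocating t_stats with numeric(n_sims) avoids growing the vector inside the loop, which keeps the simulation fast as n_sims increases.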
3. One assumption of the t-test is that the populations you sample from have the same standard deviation. Violating this assumption can affect the distribution of the t-test statistic, especially when the sample sizes are unequal.
a. Re-do the simulation from Question 2, but this time sample from normal distributions with the same mean, where one has a standard deviation of 1 and a sample size of 20, and the other has a standard deviation of 5 and a sample size of 100. Plot a histogram of the test statistics. How does this differ from the histogram in Question 2?
b. Perform the procedure in part a. above, but this time use the "pooled variance" t-test. To do this, add "var.equal = TRUE" as an argument to the t.test() function. Plot a histogram of the test statistics. How does this differ from the histogram in part a. above?
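The unequal-variance comparison could be sketched as follows. This is a minimal illustration rather than a definitive solution: the seed and n_sims are assumed values, and running both the default t.test() (which uses the Welch correction) and the pooled version on the same pair of samples is a design choice made here so that the two histograms differ only in the test used.

```r
set.seed(2)                      # arbitrary seed, only for reproducibility
n_sims   <- 5000                 # number of simulated t-tests (assumed)
t_welch  <- numeric(n_sims)      # statistics from the default (Welch) test
t_pooled <- numeric(n_sims)      # statistics from the pooled-variance test

for (i in seq_len(n_sims)) {
  # Same mean, but different standard deviations and sample sizes
  x <- rnorm(20,  mean = 0, sd = 1)
  y <- rnorm(100, mean = 0, sd = 5)
  t_welch[i]  <- t.test(x, y)$statistic
  t_pooled[i] <- t.test(x, y, var.equal = TRUE)$statistic
}

# Side-by-side histograms on a common x-axis for easier comparison
x_range <- range(c(t_welch, t_pooled))
par(mfrow = c(1, 2))
hist(t_welch,  breaks = 50, xlim = x_range,
     main = "Default t-test", xlab = "t statistic")
hist(t_pooled, breaks = 50, xlim = x_range,
     main = "Pooled variance (var.equal = TRUE)", xlab = "t statistic")
par(mfrow = c(1, 1))             # restore the single-panel layout
```

Comparing sd(t_welch) with sd(t_pooled) gives a quick numeric summary of how the spread of the two sets of statistics differs.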
4. In Question 3, you ran a simulation to investigate how violating the assumption of equal variances can affect the properties of a t-test. Now run a simulation to investigate how violating the assumption of normally distributed data can affect the properties of a t-test. The gamma distribution is skewed to the right and has a parameter called "shape". The R function for generating data from a gamma distribution is rgamma(); you can read the details in R help.
a. Make three histograms, each of a sample of size n = 10,000 drawn from a gamma distribution, with shape = 1, shape = 0.5, and shape = 0.1. Use "breaks = 100" to force each histogram to have lots of bars. Describe what you see happening as the shape parameter gets smaller.
b. Make a simulation that repeatedly draws two samples from a gamma distribution with shape = 1, then compares their means using a t-test. For this simulation, use n = 30 for the size of each sample. Write code that saves both the t-test statistic and the p-value each time. Then make a histogram of the test statistics, and report the proportion of p-values less than 0.05. Note that if the assumptions of the t-test hold and the null hypothesis is true, the p-value should be less than 0.05 about 5% of the time.
c. Do the same thing as in part b. two more times, using shape = 0.5 and shape = 0.1. Does this seem to have any effect on the distribution of the test statistics, or on the proportion of p-values less than 0.05?
d. Run the simulation three more times (once for each value of shape), using samples of size n = 10 rather than n = 30. Show the three histograms and the three proportions of p-values less than 0.05. Did this have any noticeable effect on the results?
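The gamma histograms and the repeated t-test simulations could be organized as in the sketch below. The helper sim_gamma_ttest() is a hypothetical name introduced here for illustration, and the seed and n_sims are assumed values; only rgamma(), t.test(), and hist() come from the question itself. The rate parameter of rgamma() is left at its default of 1.

```r
set.seed(3)                        # arbitrary seed, only for reproducibility
shapes <- c(1, 0.5, 0.1)

# Histograms of 10,000 gamma draws for each shape value
par(mfrow = c(1, 3))
for (s in shapes) {
  hist(rgamma(10000, shape = s), breaks = 100,
       main = paste("gamma, shape =", s), xlab = "x")
}

# Hypothetical helper: repeat a two-sample t-test on gamma data and
# save the test statistic and p-value from every repetition
sim_gamma_ttest <- function(shape, n, n_sims = 5000) {
  t_stats <- numeric(n_sims)
  p_vals  <- numeric(n_sims)
  for (i in seq_len(n_sims)) {
    res <- t.test(rgamma(n, shape = shape), rgamma(n, shape = shape))
    t_stats[i] <- res$statistic
    p_vals[i]  <- res$p.value
  }
  list(t_stats = t_stats, p_vals = p_vals, prop_sig = mean(p_vals < 0.05))
}

# Run every combination of sample size (30, then 10) and shape
results <- list()
for (n in c(30, 10)) {
  for (s in shapes) {
    key <- paste0("n", n, "_shape", s)
    results[[key]] <- sim_gamma_ttest(shape = s, n = n)
    hist(results[[key]]$t_stats, breaks = 50,
         main = paste0("n = ", n, ", shape = ", s), xlab = "t statistic")
    cat(key, ": proportion of p-values < 0.05 =",
        results[[key]]$prop_sig, "\n")
  }
}
par(mfrow = c(1, 1))               # restore the single-panel layout
```

Wrapping the loop in a function keeps the six runs identical apart from shape and n, so any difference in the histograms or proportions can be attributed to those two settings.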
Only Question 4 is needed. No dataset is required. Please use RStudio.