Question

Posted on Nov 07, 2024

2. Make a simulation in R that shows the distribution of the t-test statistic when the null hypothesis is true. To start, use a for loop that repeatedly performs t-tests comparing sample means of data that come from distributions with the same population mean and standard deviation. Use rnorm() to take samples, t.test() to perform the t-tests, and "$statistic" to extract the t-test statistic from the t.test() result (e.g. t.test(x,y)$statistic). Make a histogram of the test statistics. If you need help, look back at the notes on for loops.
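A minimal sketch of the Question 2 simulation. The number of repetitions (10,000), the sample size (30), the standard normal populations, and the seed are all assumptions; the question does not specify them.

```r
set.seed(1)                 # assumption: fixed seed so the run is reproducible
n_sims  <- 10000            # assumption: number of repetitions
n       <- 30               # assumption: size of each sample
t_stats <- numeric(n_sims)  # preallocate storage for the test statistics

for (i in seq_len(n_sims)) {
  # Both samples come from the same population, so the null hypothesis is true
  x <- rnorm(n, mean = 0, sd = 1)
  y <- rnorm(n, mean = 0, sd = 1)
  t_stats[i] <- t.test(x, y)$statistic
}

hist(t_stats, breaks = 50,
     main = "t statistics under a true null", xlab = "t statistic")
```

The histogram should be roughly symmetric and centered near zero.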

3. One assumption of the t-test is that the populations you sample from have the same standard deviation. Violating this assumption can affect the distribution of the t-test statistic. This is especially the case when sample sizes are unequal.
a. Re-do the simulation from Question 2, but this time sample from normal distributions with the same mean but where one has a standard deviation of 1 and a sample size of 20, and the other has a standard deviation of 5 and a sample size of 100. Plot a histogram of the test statistics. How does this differ from the histogram in Question 2?
b. Perform the procedure in part a. above, but this time use the "pooled variance" t-test. To do this, add "var.equal=TRUE" as an argument in the t.test function. Plot a histogram of the test statistics. How does this differ from the histogram in part a. above?
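A hedged sketch of both unequal-variance simulations. It assumes 10,000 repetitions and a common mean of 0, and it reuses the same pair of samples for both tests within each repetition, which is a convenience rather than something the question requires. Note that t.test() uses the Welch (unequal variance) test by default, so the first call does not pool the variances.

```r
set.seed(2)                  # assumption: fixed seed for reproducibility
n_sims   <- 10000            # assumption: number of repetitions
t_welch  <- numeric(n_sims)  # part a: default t.test()
t_pooled <- numeric(n_sims)  # part b: var.equal = TRUE

for (i in seq_len(n_sims)) {
  x <- rnorm(20,  mean = 0, sd = 1)   # small sample, small SD
  y <- rnorm(100, mean = 0, sd = 5)   # large sample, large SD
  t_welch[i]  <- t.test(x, y)$statistic
  t_pooled[i] <- t.test(x, y, var.equal = TRUE)$statistic
}

par(mfrow = c(1, 2))         # show the two histograms side by side
hist(t_welch,  breaks = 50, main = "Default (Welch) t-test", xlab = "t statistic")
hist(t_pooled, breaks = 50, main = "Pooled variance t-test", xlab = "t statistic")
par(mfrow = c(1, 1))
```

Comparing sd(t_welch) with sd(t_pooled) gives a quick numeric summary of how the spread of the two histograms differs.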
4. In Question 3, you ran a simulation to investigate how violating the assumption of equal variances can affect the properties of a t-test. In this case, run a simulation to investigate how violating the assumption of normally distributed data can affect the properties of a t-test.
The gamma distribution is skewed to the right. It contains a parameter called "shape". The R function for generating data from a gamma distribution is rgamma(); you can read the details in R help.
a. Make three histograms, each of a sample of size n = 10,000 drawn from a gamma distribution, with shape = 1, shape = 0.5, and shape = 0.1. Use "breaks = 100" to force each histogram to have lots of bars. Describe what you see happening as the shape parameter gets smaller.
b. Make a simulation that repeatedly draws two samples from a gamma distribution with shape = 1, then compares their means using a t-test. For this simulation, use n = 30 for the size of each sample. Write code that will save both the t-test statistic and p-value each time. Then make a histogram of the test statistics, and report the proportion of p-values less than 0.05. Note that, if the assumptions of the t-test are not violated, the p-value should be less than 0.05 about 5% of the time.
c. Do the same thing as in part b. two more times, using shape = 0.5 and shape = 0.1. Does this seem to have any effect on the distribution of the test statistics, or the proportion of p-values less than 0.05?
d. Run the simulation three more times (once for each value of shape), using samples of size n = 10 rather than n = 30. Show the three histograms and three proportions of p-values less than 0.05. Did this have any noticeable effect on the results?
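The three gamma histograms requested at the start of Question 4 could be drawn along these lines. This is a sketch that leaves the rate parameter of rgamma() at its default of 1, since the question only specifies shape; the seed and plot labels are likewise assumptions.

```r
set.seed(3)                      # assumption: fixed seed for reproducibility
shapes <- c(1, 0.5, 0.1)

par(mfrow = c(1, 3))             # three panels in one row
for (s in shapes) {
  g <- rgamma(10000, shape = s)  # rate is left at its default of 1
  hist(g, breaks = 100, main = paste("shape =", s), xlab = "value")
}
par(mfrow = c(1, 1))
```

As the shape parameter decreases, more of the sample piles up near zero and the right tail looks longer relative to the bulk of the data, so the skew becomes more pronounced.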

Only Question 4 is needed. No database is required; be sure to use RStudio.
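For the Question 4 simulations with n = 30 and n = 10, one possible approach is to wrap the loop in a helper so it can be rerun for each combination of shape and sample size. The function name sim_gamma_t, the 5,000 repetitions, and the seed are hypothetical choices for illustration, not part of the assignment.

```r
# Hypothetical helper: runs the two-sample gamma simulation for one setting,
# draws a histogram of the t statistics, and returns the proportion of
# p-values below 0.05.
sim_gamma_t <- function(shape, n, n_sims = 5000) {
  t_stats <- numeric(n_sims)
  p_vals  <- numeric(n_sims)
  for (i in seq_len(n_sims)) {
    x   <- rgamma(n, shape = shape)   # both samples share one distribution,
    y   <- rgamma(n, shape = shape)   # so the null hypothesis is true
    res <- t.test(x, y)
    t_stats[i] <- res$statistic
    p_vals[i]  <- res$p.value
  }
  hist(t_stats, breaks = 50,
       main = paste0("shape = ", shape, ", n = ", n), xlab = "t statistic")
  mean(p_vals < 0.05)
}

set.seed(4)                           # assumption: fixed seed for reproducibility
par(mfrow = c(2, 3))                  # top row: n = 30, bottom row: n = 10
results <- expand.grid(shape = c(1, 0.5, 0.1), n = c(30, 10))
results$prop_sig <- mapply(sim_gamma_t, results$shape, results$n)
par(mfrow = c(1, 1))
print(results)
```

Each value in prop_sig can be compared with the nominal 0.05 to judge how much the skew and the smaller sample size affect the test.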