Question

1 Approved Answer

Posted on Oct 14, 2024

STAT22000 Autumn 2017 Homework 7

All page, section, and exercise numbers below refer to the course text (OpenIntro Statistics, 3rd edition, by Diez, Barr, and Cetinkaya-Rundel).

Reading: Sections 5.1, 5.2, 5.3 (skip 5.4 and 5.5)

Problems for Self-Study (Do Not Turn In): Exercises 5.1, 5.3, 5.7, 5.11, 5.17, 5.19, 5.21, 5.25, 5.35 on p. 260-267. Answers can be found at the end of the book.

Problems to Turn In: due midnight of Friday, May 19, on Canvas.

1. A study compared different psychological therapies for teenage girls suffering from anorexia, an eating disorder that causes them to become dangerously underweight. Each girl's weight was measured before and after a period of therapy. The variable of interest was the weight change, defined as weight at the end of the study minus weight at the beginning of the study. In this study, 29 girls received cognitive behavioral therapy. This form of psychotherapy stresses identifying the thinking that causes the undesirable behavior and replacing it with thoughts designed to help improve this behavior. Their changes in weight (lb) during the study were

    1.7, 11.7, -1.4, 0.7, 6.1, -0.8, -0.1, 1.1, 2.4, -0.7, -4.0, 12.6, -3.5, 20.9, 1.9, 14.9, -9.3, 3.9, 3.5, 2.1, 0.1, 17.1, 1.4, 15.4, -7.6, -0.3, -0.7, 1.6, -3.7

The weight change was positive if the girl gained weight and negative if she lost weight.

(a) Using a calculator or R, find the sample mean and the standard deviation of the weight changes.

(b) To know whether the therapy is effective, conduct a hypothesis test for the hypothesis $H_0: \mu = 0$ against $H_a: \mu \neq 0$. Report the t-statistic with degrees of freedom, and the P-value.

(c) Find the 95% confidence interval for $\mu$, where $\mu$ is the population mean change in weight during the study. Based on the constructed interval, explain why it suggests that the true mean change in weight is positive, but possibly quite small.

(d) Verify your computation in parts (b) and (c) with the R command t.test.
    wtgain = c(1.7,11.7,-1.4,0.7,6.1,-0.8,-0.1,1.1,2.4,-0.7,-4.0,12.6,-3.5,20.9,
               1.9,14.9,-9.3,3.9,3.5,2.1,0.1,17.1,1.4,15.4,-7.6,-0.3,-0.7,1.6,-3.7)
    t.test(wtgain)

(e) Plot the data with a histogram. Comment on the shape of the histogram. Do you have any concern about whether the conclusions from the test and confidence interval in parts (b) and (c) are appropriate?

2. Are any physiological indicators associated with schizophrenia? Early studies, based largely on postmortem analysis, suggest that the sizes of certain areas of the brain may be different in persons afflicted with schizophrenia than in others. Confounding variables in these studies, however, clouded the issue considerably. In a 1990 article, researchers reported the results of a study that controlled for genetic and socioeconomic differences by examining 15 pairs of identical twins, where one of the twins was schizophrenic and the other was not. The twins were located through an intensive search throughout Canada and the United States [1]. The researchers used magnetic resonance imaging (MRI) to measure the volumes (in cm^3) of several regions and subregions inside the twins' brains. The table presents data based on the reported summary statistics from one subregion, the left hippocampus.

    Pair   Unaffected   Affected    diff
     1        1.94        1.27      0.67
     2        1.44        1.63     -0.19
     3        1.56        1.47      0.09
     4        1.58        1.39      0.19
     5        2.06        1.93      0.13
     6        1.66        1.26      0.40
     7        1.75        1.71      0.04
     8        1.77        1.67      0.10
     9        1.78        1.28      0.50
    10        1.92        1.85      0.07
    11        1.25        1.02      0.23
    12        1.93        1.34      0.59
    13        2.04        2.02      0.02
    14        1.62        1.59      0.03
    15        2.08        1.97      0.11

[1] Data from R. L. Suddath et al., 'Anatomical Abnormalities in the Brains of Monozygotic Twins Discordant for Schizophrenia,' New England Journal of Medicine 322(12) (1990): 789-93.

(a) Test the null hypothesis that there is no difference in volumes of the left hippocampus between the unaffected and the affected individuals. Be sure to specify the null and alternative hypotheses, the test statistic with degrees of freedom, and the P-value.
What do you conclude using the 0.05 significance level?

(b) Construct a 95% confidence interval for the mean difference in volumes of the left hippocampus between the unaffected and the affected individuals.

(c) Check your computation in (a) and (b) with the R commands below.

    unaffected = c(1.94,1.44,1.56,1.58,2.06,1.66,1.75,1.77,1.78,1.92,1.25,1.93,2.04,1.62,2.08)
    affected = c(1.27,1.63,1.47,1.39,1.93,1.26,1.71,1.67,1.28,1.85,1.02,1.34,2.02,1.59,1.97)
    t.test(unaffected, affected, paired=TRUE)

3. Traditional brand research argues that successful logos are ones that are highly relevant to the product they represent. However, a market research firm recently reported that nearly 20% of all table wine brands introduced in the last three years feature an animal on the label. Since animals have little to do with the product, why are marketers using this tactic? Some researchers have proposed that consumers who are "primed" (in other words, they've thought about the image earlier in an unrelated context) process visual information more easily. To demonstrate this, the researchers randomly assigned participants to either a primed or non-primed group. Each participant was asked to indicate their attitude toward a product on a seven-point scale (from 1 = dislike very much to 7 = like very much). A bottle of MagicCoat pet shampoo, with a picture of a collie on the label, was the product. Prior to giving this score, however, participants were asked to do a word find where four of the words were common across groups (pet, grooming, bottle, label) and four were either related to the image (dog, collie, puppy, woof) or image conflicting (cat, feline, kitten, meow). The following table contains the responses listed from smallest to largest (one digit per participant).

    Group        Brand Attitude
    Primed       2233344444444445555555
    Non-Primed   11223333333333334445

(a) Make a histogram for the scores of each group. Is it appropriate to use the two-sample t-tests and t-intervals? Explain your answer.
(b) Test whether these two groups show the same preference for this product. Use a two-sided alternative hypothesis and a significance level of 5% (without assuming equal population SDs).

(c) Construct a 95% confidence interval for the difference in average preference.

(d) Write a short summary of your conclusions.

4. An experiment was carried out in southern Florida between 1968 and 1972 to test a hypothesis that massive injection of silver iodide into cumulus clouds can lead to increased rainfall [2]. On each of 52 days that were deemed suitable for cloud seeding, a random mechanism was used to decide whether to seed the target cloud on that day or to leave it unseeded as a control. An airplane flew through the cloud in both cases, since the experimenters and the pilot were themselves unaware of whether on any particular day the seeding mechanism in the plane was loaded or not (that is, they were blind to the treatment). Precipitation was measured as the total rain volume in acre-feet falling from the cloud base following the airplane seeding run, as measured by radar. The data file is rainfalls.txt.

[2] Data from J. Simpson, A. Olsen, and J. Eden, "A Bayesian Analysis of a Multiplicative Treatment Effect in Weather Modification," Technometrics 17 (1975): 161-66.

(a) Load the data set to R and make a side-by-side boxplot of the rain volumes. Compare the center and spread and comment on the skewness of the two boxplots. Is it appropriate to compare the rain volumes of the two groups using a two-sample t test?

    rain = read.table("rainfalls.txt", header=TRUE)
    library(mosaic)
    bwplot(Rainfall~Treatment, data=rain)

(b) Make a side-by-side boxplot of the log of the rain volumes with the command

    bwplot(log(Rainfall)~Treatment, data=rain)

Compare the center and spread and comment on the skewness of the two boxplots. Is it appropriate to compare the log rain volumes of the two groups using a two-sample t test?
(c) Conduct a two-sample t test on the mean log rain volumes of the two groups, assuming the two population SDs are equal. Report the t-statistic with degrees of freedom, and give a range for the two-sided P-value. The summary statistics of the data can be obtained with the following command

    favstats(log(Rainfall)~Treatment, data=rain)

(d) Construct a 99% confidence interval for the mean change in log rain volume between when a cloud is seeded and when it is left unseeded.

Remark: A problem after log-transformation is how to interpret the mean change in log rain volume, that is, how to describe the effect of cloud seeding on rain volume rather than on log rain volume. Recall that $\log(a) - \log(b) = \log(a/b)$. Thus, if the confidence interval in (d) is $(L, U)$, we can exponentiate the two endpoints, $(e^L, e^U)$, to obtain a confidence interval for the ratio between the two means. We can describe the effect of cloud seeding as follows: with 99% confidence, the volume of rainfall on days when clouds were seeded was $e^L$ to $e^U$ times as large as when not seeded.
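The pooled test in part (c), the interval in part (d), and the back-transformation described in the remark can be sketched in R as below. This is a minimal sketch, not the course's official solution: the values are typed in from the rainfalls.txt listing in this post, and the variable names (`log_s`, `s_pooled`, `ci_ratio`, and so on) are illustrative assumptions.

```r
# Rain volumes (acre-feet) copied from the rainfalls.txt listing in this post
unseeded <- c(1202.6, 830.1, 372.4, 345.5, 321.2, 244.3, 163.0, 147.8, 95.0,
              87.0, 81.2, 68.5, 47.3, 41.1, 36.6, 29.0, 28.6, 26.3, 26.1,
              24.4, 21.7, 17.3, 11.5, 4.9, 4.9, 1.0)
seeded   <- c(2745.6, 1697.8, 1656.0, 978.0, 703.4, 489.1, 430.0, 334.1,
              302.8, 274.7, 274.7, 255.0, 242.5, 200.7, 198.6, 129.6, 119.0,
              118.3, 115.3, 92.4, 40.6, 32.7, 31.4, 17.5, 7.7, 4.1)

# Work on the natural-log scale, as in parts (b) to (d)
log_u <- log(unseeded)
log_s <- log(seeded)
n_u <- length(log_u)
n_s <- length(log_s)

# Pooled SD (equal population SDs assumed) and standard error of the difference
df       <- n_s + n_u - 2
s_pooled <- sqrt(((n_s - 1) * var(log_s) + (n_u - 1) * var(log_u)) / df)
se       <- s_pooled * sqrt(1 / n_s + 1 / n_u)

# Two-sample t statistic and two-sided P-value (seeded minus unseeded)
diff_log <- mean(log_s) - mean(log_u)
t_stat   <- diff_log / se
p_two    <- 2 * pt(-abs(t_stat), df)

# 99% confidence interval on the log scale, then exponentiated to a ratio
ci_log   <- diff_log + c(-1, 1) * qt(0.995, df) * se
ci_ratio <- exp(ci_log)

print(c(t = t_stat, df = df, p = p_two))
print(ci_log)
print(ci_ratio)
```

The same numbers should come out of `t.test(log_s, log_u, var.equal = TRUE, conf.level = 0.99)`. Strictly speaking, exponentiating a difference of mean logs gives a ratio of geometric means (or of medians, when the log data are roughly symmetric), which the remark describes more loosely as a ratio of means.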
    Treatment   Rainfall
    Unseeded    1202.6
    Unseeded     830.1
    Unseeded     372.4
    Unseeded     345.5
    Unseeded     321.2
    Unseeded     244.3
    Unseeded     163.0
    Unseeded     147.8
    Unseeded      95.0
    Unseeded      87.0
    Unseeded      81.2
    Unseeded      68.5
    Unseeded      47.3
    Unseeded      41.1
    Unseeded      36.6
    Unseeded      29.0
    Unseeded      28.6
    Unseeded      26.3
    Unseeded      26.1
    Unseeded      24.4
    Unseeded      21.7
    Unseeded      17.3
    Unseeded      11.5
    Unseeded       4.9
    Unseeded       4.9
    Unseeded       1.0
    Seeded      2745.6
    Seeded      1697.8
    Seeded      1656.0
    Seeded       978.0
    Seeded       703.4
    Seeded       489.1
    Seeded       430.0
    Seeded       334.1
    Seeded       302.8
    Seeded       274.7
    Seeded       274.7
    Seeded       255.0
    Seeded       242.5
    Seeded       200.7
    Seeded       198.6
    Seeded       129.6
    Seeded       119.0
    Seeded       118.3
    Seeded       115.3
    Seeded        92.4
    Seeded        40.6
    Seeded        32.7
    Seeded        31.4
    Seeded        17.5
    Seeded         7.7
    Seeded         4.1

Question 1

a) Mean = 3, standard deviation = 7.32042153.

b) $H_0: \mu = 0$, $H_1: \mu \neq 0$.

T statistic = (3 - 0)/sqrt(7.3204^2/29) = 2.206961795, with df = 29 - 1 = 28.

T critical: for a two-sided test at $\alpha = 0.05$ with df = 28, the critical value is $t_{0.025,28} = 2.048$ (1.701 is the one-sided value). Since 2.207 > 2.048, the two-sided P-value lies between 0.02 and 0.05. We reject the null and conclude that the mean weight change is not equal to zero.

c) CI = mean ± $t_{\alpha/2}$ × SE = (0.2154606, 5.7845394). The interval lies entirely above zero, so at the 95% confidence level the true mean change is suggested to be positive. The lower endpoint is only about 0.2 lb, however, so the true mean gain may be quite small.
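The computations in parts (a) to (c) of Question 1 can be reproduced with a short R sketch like the one below. It is a sketch under the assumption that the 29 weight changes are exactly those listed in the problem statement; the variable names are illustrative.

```r
# Weight changes (lb) for the 29 girls, copied from the problem statement
wtgain <- c(1.7, 11.7, -1.4, 0.7, 6.1, -0.8, -0.1, 1.1, 2.4, -0.7, -4.0, 12.6,
            -3.5, 20.9, 1.9, 14.9, -9.3, 3.9, 3.5, 2.1, 0.1, 17.1, 1.4, 15.4,
            -7.6, -0.3, -0.7, 1.6, -3.7)

# (a) sample size, mean, standard deviation, and standard error
n    <- length(wtgain)
xbar <- mean(wtgain)
s    <- sd(wtgain)
se   <- s / sqrt(n)

# (b) one-sample t statistic for H0: mu = 0, with a two-sided P-value
t_stat <- (xbar - 0) / se
df     <- n - 1
p_two  <- 2 * pt(-abs(t_stat), df)

# (c) 95% confidence interval using the two-sided critical value
t_crit <- qt(0.975, df)
ci     <- xbar + c(-1, 1) * t_crit * se

print(c(mean = xbar, sd = s, t = t_stat, df = df, p = p_two))
print(ci)

# (d) cross-check against the built-in one-sample t test (mu = 0 by default)
check <- t.test(wtgain)
print(check)
```

Using `qt(0.975, df)` rather than `qt(0.95, df)` is what makes the critical value match a two-sided test at the 0.05 level.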
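d) The R command t.test(wtgain) uses the same formulas, so it should reproduce the t-statistic, degrees of freedom, and interval from parts (b) and (c).

Question 2

a) $H_0: \mu_1 - \mu_2 = 0$, $H_1: \mu_1 - \mu_2 \neq 0$.

T statistic: $t = \dfrac{\sum d / N}{\sqrt{\left(\sum d^2 - (\sum d)^2 / N\right) / (N(N-1))}}$

    Pair      d       d^2
     1      0.67    0.4489
     2     -0.19    0.0361
     3      0.09    0.0081
     4      0.19    0.0361
     5      0.13    0.0169
     6      0.40    0.1600
     7      0.04    0.0016
     8      0.10    0.0100
     9      0.50    0.2500
    10      0.07    0.0049
    11      0.23    0.0529
    12      0.59    0.3481
    13      0.02    0.0004
    14      0.03    0.0009
    15      0.11    0.0121
    Sum     2.98    1.387

t = (2.98/15)/sqrt((1.387 - (2.98)^2/15)/(15(14))) = 3.2289, with df = 14.

T critical: for a two-sided test at the 0.05 level with df = 14, the critical value is 2.145 (1.761 is the one-sided value). Since 3.2289 > 2.145, we reject the null hypothesis and conclude there is a statistically significant difference in volumes of the left hippocampus between the unaffected and the affected individuals.

b) CI for the mean difference = mean diff ± critical value × standard error = 0.1987 ± 2.145 × 0.0615 = (0.0667, 0.3306), which agrees with the R output in part (c).

c)

    Paired t-test
    data: unaffected and affected
    t = 3.2289, df = 14, p-value = 0.006062
    alternative hypothesis: true difference in means is not equal to 0
    95 percent confidence interval:
     0.0667041 0.3306292
    sample estimates:
    mean of the differences
                  0.1986667

Question 3

a) The two groups are independent because participants were randomly assigned to the primed or non-primed condition, so a two-sample (independent) t test is the right kind of comparison for the difference between the groups. The scores are confined to a 1 to 7 scale, which rules out extreme outliers, so with about 20 participants per group the t procedures are reasonable.

b)

    Two Sample t-test
    data: nonprimed and primed
    t = -3.441, df = 39, p-value = 0.001396
    alternative hypothesis: true difference in means is not equal to 0
    95 percent confidence interval:
     -1.5916038 -0.4131581
    sample estimates:
    mean of x mean of y
     2.950000  3.952381

Note that this output is from the pooled test (df = 39), which assumes equal population SDs. The question asks for the version without that assumption (the Welch test), which t.test runs by default when var.equal is not set.

c) Using R, the 95% confidence interval for the difference in means (non-primed minus primed) is (-1.5916038, -0.4131581).

d) The analysis above gives a p-value of 0.001396, which is less than 0.05, so we reject the null hypothesis: there is a statistically significant difference in brand attitude between the primed and non-primed groups, with the primed group scoring higher on average.

Question 4

a) The boxplots show that the rain volumes in both groups are positively skewed and highly spread out, with outliers in both groups.

b) After the log transformation, the data in each group look approximately normal, and the spread is about the same in the unseeded and seeded groups.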
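For Question 2, the paired t computation can be cross-checked with the R sketch below, which follows the sum-of-differences formula used in the answer. It assumes the twin volumes are exactly those in the problem's table; the variable names are illustrative.

```r
# Left hippocampus volumes (cm^3) for the 15 twin pairs, from the problem table
unaffected <- c(1.94, 1.44, 1.56, 1.58, 2.06, 1.66, 1.75, 1.77, 1.78, 1.92,
                1.25, 1.93, 2.04, 1.62, 2.08)
affected   <- c(1.27, 1.63, 1.47, 1.39, 1.93, 1.26, 1.71, 1.67, 1.28, 1.85,
                1.02, 1.34, 2.02, 1.59, 1.97)

# Within-pair differences (unaffected minus affected)
d <- unaffected - affected
n <- length(d)

# Mean difference and its standard error via the sums of d and d^2
d_bar <- sum(d) / n
se    <- sqrt((sum(d^2) - sum(d)^2 / n) / (n * (n - 1)))

# Paired t statistic, degrees of freedom, and two-sided P-value
t_stat <- d_bar / se
df     <- n - 1
p_two  <- 2 * pt(-abs(t_stat), df)

# 95% confidence interval for the mean difference
ci <- d_bar + c(-1, 1) * qt(0.975, df) * se

print(c(sum_d = sum(d), sum_d2 = sum(d^2), t = t_stat, df = df, p = p_two))
print(ci)
```

The standard error computed this way is algebraically the same as `sd(d) / sqrt(n)`, so the result should match `t.test(unaffected, affected, paired = TRUE)`.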