2. [environmental studies (30 points)] A plant ecologist performs an exhaustive search of an approximately 40-square-mile area, chosen to be representative of the growing region of a particular rare species of tree, and finds 101 trees of this species in her search. She records for each tree whether or not it's rooted in serpentine soils (serpentine is a magnesium-rich silicate mineral) and whether its leaves are pubescent (hairy) or smooth, with the following results: of the 35 trees in serpentine soils, 12 had pubescent leaves, and of the 55 trees in non-serpentine soils, 15 had pubescent leaves.

(a) Does the difference in percentage of pubescent leaves between the trees in serpentine and non-serpentine soils seem large to you in practical terms? Explain briefly. [5 points]

(b) Set up a statistical model for this situation, being explicit about the population, sample and imaginary data sets, and use your model (including the usual inferential summary) to build a 95% confidence interval for the difference $(p_1 - p_2)$ in population percentages of pubescent leaves (what is the population here, precisely?). Is this difference large in statistical terms? Explain briefly. [15 points]

(c) Your answers in (a) and (b) should have indicated to you that the plant ecologist found a difference that was practically but not statistically significant, which (as usual) means that she didn't get enough data. Suppose that she decides to regard the investigation summarised above as part one of a two-part study: based on what she's found in part one, she'll work out how much more data she really needs and then get the rest of the required data in part two. Here is how she might reason, using the confidence interval approach to sample size determination. In the overall study (parts one and two combined) she's planning to build a 95% confidence interval for $(p_1 - p_2)$, where (say) 1 stands for serpentine soil and 2 for non-serpentine.
Using the ideas we developed in class, this confidence interval will be of the form

$$(\hat{p}_1 - \hat{p}_2) \pm 2 \sqrt{\frac{\hat{p}_1 (1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2 (1 - \hat{p}_2)}{n_2}},$$

where $\hat{p}_1$ and $\hat{p}_2$ are the sample proportions of pubescent leaves from the serpentine-soil and non-serpentine-soil trees, respectively, and $n_1$ and $n_2$ are the numbers of such trees in the overall study. If this interval is to be narrow enough that 0 is just barely not inside it (which is what she would need to demonstrate statistical significance), then it would have to look like the sketch at the top of the next page.
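As a rough numerical check of the interval described above, here is a minimal Python sketch. It assumes the part-one counts exactly as printed in the problem (12 pubescent out of 35 serpentine-soil trees, 15 out of 55 non-serpentine-soil trees) and a multiplier of 2 standard errors for roughly 95% confidence. The function name `diff_interval` and its parameters are illustrative only and are not part of the course material.

```python
from math import sqrt


def diff_interval(x1, n1, x2, n2, multiplier=2.0):
    """Approximate interval for p1 - p2 from two independent samples.

    x1, x2: numbers of trees with pubescent leaves in each group.
    n1, n2: numbers of trees examined in each group.
    multiplier: number of standard errors on each side (assumed 2 for ~95%).
    """
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    diff = p1_hat - p2_hat
    # Standard error of the difference of two independent sample proportions.
    se = sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    return diff, se, (diff - multiplier * se, diff + multiplier * se)


# Part-one data, taken as printed in the problem statement.
diff, se, (lower, upper) = diff_interval(12, 35, 15, 55)

# Statistical significance at this level would require 0 to fall outside.
contains_zero = lower <= 0.0 <= upper

print(f"difference in sample proportions: {diff:.3f}")
print(f"standard error: {se:.3f}")
print(f"interval: ({lower:.3f}, {upper:.3f})")
print(f"0 inside interval: {contains_zero}")
```

Because the standard error shrinks as $n_1$ and $n_2$ grow, calling the same function with larger hypothetical sample sizes (and the same sample proportions) shows how the interval narrows toward the point where 0 is just excluded.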