Posted on Jun 27, 2024

Confidence Intervals and Hypothesis Tests Comparing TWO Proportions, Differences or Means AND ANOVA (Analysis of Variance)

This first part involves calculating:

Confidence Interval for two population Proportions
Hypothesis Test for two population Proportions
Hypothesis Test for two population Differences
Confidence Interval for two population differences
Hypothesis Test comparing two population Means

1) (made-up data) Confidence Interval Problem for Proportions: Calculate a 95% confidence interval for these proportions. SHOW YOUR "E" SETUP & CALCULATIONS. The same assumptions must be met to ensure the statistical analysis is valid, and make sure to calculate the proportions first.

High school students take the AP tests in different subject areas. This year 150 students took the biology exam, of which 85 were female, and 220 students took the calculus exam, of which 100 were female. Calculate the difference in the PROPORTION of female students taking the biology exam versus female students taking the calculus exam using a 95% confidence level.

E = z-critical * sqrt[ (p-hat1 * q-hat1)/n1 + (p-hat2 * q-hat2)/n2 ]
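As a check on the hand calculation, here is a minimal Python sketch of the "E" formula applied to the counts given in the problem. It assumes the difference is taken as biology minus calculus and uses only the standard library; the variable names are illustrative, not part of the assignment.

```python
from math import sqrt
from statistics import NormalDist

# Counts from the problem statement
x1, n1 = 85, 150    # females among biology exam takers
x2, n2 = 100, 220   # females among calculus exam takers

# Calculate the proportions first
p1, p2 = x1 / n1, x2 / n2
q1, q2 = 1 - p1, 1 - p2

# Two-sided z-critical for 95% confidence (about 1.96)
z_crit = NormalDist().inv_cdf(1 - 0.05 / 2)

# E = z-critical * sqrt[(p-hat1 * q-hat1)/n1 + (p-hat2 * q-hat2)/n2]
margin = z_crit * sqrt(p1 * q1 / n1 + p2 * q2 / n2)

# Assumption: difference taken as biology minus calculus
diff = p1 - p2
ci_low, ci_high = diff - margin, diff + margin

print(f"p-hat1 = {p1:.4f}, p-hat2 = {p2:.4f}")
print(f"E = {margin:.4f}")
print(f"95% CI for p1 - p2: ({ci_low:.4f}, {ci_high:.4f})")
```

With these counts E comes out near 0.103 and the interval runs from roughly 0.009 to 0.215.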

2) (made-up data) Hypothesis Test problem for Proportions: Assume a 1% significance level. The same assumptions must be met to ensure the statistical analysis is valid, and calculate the proportions first.

SHOW the z-Test SETUP AND the z-Critical value the z-Test was compared to. Also, use software to calculate the p-value. Do both support the same conclusion? (They MUST.) State your real-world conclusion.

Are more children diagnosed with Autism Spectrum Disorder in States that have urban areas than in States that are mostly rural? In an urban State, there were 300 3rd-grade students diagnosed with ASD out of 18,500 3rd graders tested. In a rural State, there were 55 3rd-grade students diagnosed with ASD out of 2,300 3rd graders tested. Is there enough evidence to show that the proportion of children diagnosed with ASD in an urban State is more than the proportion in a rural State? Test at the α = 1% level.
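The z-Test setup for this problem can be sketched in Python as follows. This is only one reasonable setup, not the required one: it assumes the usual pooled-proportion standard error under Ho: p1 = p2 and a right-tailed alternative Ha: p_urban > p_rural, and the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Counts from the problem statement
x1, n1 = 300, 18500   # urban State: diagnosed, tested
x2, n2 = 55, 2300     # rural State: diagnosed, tested
alpha = 0.01

# Calculate the proportions first
p1, p2 = x1 / n1, x2 / n2

# Assumption: pooled proportion under Ho: p1 = p2
p_pool = (x1 + x2) / (n1 + n2)
se = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))

z_test = (p1 - p2) / se

# Right-tailed test, Ha: p1 > p2
z_crit = NormalDist().inv_cdf(1 - alpha)   # about 2.33
p_value = 1 - NormalDist().cdf(z_test)     # area to the right of z_test
reject = z_test > z_crit

print(f"p-hat1 = {p1:.4f}, p-hat2 = {p2:.4f}")
print(f"z-Test = {z_test:.2f}, z-Critical = {z_crit:.2f}")
print(f"p-value = {p_value:.4f}, reject Ho: {reject}")
```

With these counts the urban sample proportion (about 1.6%) is actually below the rural one (about 2.4%), so the z-Test is negative (about -2.69) and a right-tailed test cannot reject Ho. The p-value and the z-Critical comparison agree, as they must.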

3) Coin-Toss Data- We had ____ sets of 30-flipped coin sets. Some used ONE coin and some 30 different coins. Conduct a hypothesis test on the proportion of HEADS using ONE coin versus the proportion of HEADS using 30 coins. Is there a significant difference (5% level)?

t-Test = [(x-bar1 - x-bar2) - (μ1 - μ2)] / sqrt( s₁²/n1 + s₂²/n2 )

To find the t-Critical we need the degrees of freedom (df), and the formula to calculate df is "hairy". As presented earlier, some statisticians use a simpler formula: df = (n1 - 1) + (n2 - 1)

4) (made-up data) Hypothesis Test problem for TWO population DIFFERENCES (not proportions or means) comparison hypothesis test (NOT a proportion problem - so we use t-values, not z-values). SHOW t-TEST SETUP and calculation, AND for the t-Critical, what SIGNIFICANCE LEVEL (the column) and what df (which row) were used and what the t-Critical value is.

Do people avoid driving on Friday the 13th? Here are data on 20 Fridays: 10 on the 13th and the other 10 on the Friday before. For the DIFFERENCES, calculate the MEAN and SD between traffic on these two Fridays at the 95% level (alpha = 5%).

Traffic Count

Dates	6th (x1)	13th (x2)	d = (x1 - x2)
1990, July	139246	138548
1990, July	138012	132908
1991, September	137055	136018
1991, September	140675	131843
1991, December	123552	121641
1991, December	121139	120723
1992, March	128293	125532
1992, March	124631	120249
1992, November	124609	125770
1992, November	119584	116263

For the differences (all d's) the MEAN = _______ and the SD = __________

TEST STATISTIC: t-Test = (d-bar - μ_d) / (s_d / sqrt(n)) = ____; the t-Critical with (n - 1) df for α = 5% is ______

What is the software-generated p-value for the t-Test? ______

What is your math conclusion and your real-world conclusion?
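To check the fill-in values for the differences, a short Python sketch of the paired t-Test on the traffic counts is given below. It assumes the hypothesized mean difference μ_d is 0 and that the test is right-tailed (less traffic on the 13th means d = x1 - x2 is positive). The t-Critical is hard-coded from a standard t-table because the standard library has no t-distribution; the p-value is left to your software.

```python
from math import sqrt
from statistics import mean, stdev

# Traffic counts from the table: the 6th (x1) and the 13th (x2)
sixth = [139246, 138012, 137055, 140675, 123552,
         121139, 128293, 124631, 124609, 119584]
thirteenth = [138548, 132908, 136018, 131843, 121641,
              120723, 125532, 120249, 125770, 116263]

# d = x1 - x2 for each pair of Fridays
d = [a - b for a, b in zip(sixth, thirteenth)]
n = len(d)

d_bar = mean(d)    # MEAN of the differences
s_d = stdev(d)     # sample SD of the differences (n - 1 in the denominator)

# Assumption: hypothesized mean difference mu_d = 0
mu_d = 0
t_test = (d_bar - mu_d) / (s_d / sqrt(n))

# From a standard t-table: df = n - 1 = 9, one-tail area 0.05
T_CRIT_ONE_TAIL = 1.833

print(f"differences: {d}")
print(f"d-bar = {d_bar:.1f}, s_d = {s_d:.1f}")
print(f"t-Test = {t_test:.2f}, t-Critical = {T_CRIT_ONE_TAIL}")
print(f"reject Ho: {t_test > T_CRIT_ONE_TAIL}")
```

With this data d-bar is 2730.1, s_d is about 2869.6, and the t-Test is about 3.01, which exceeds the one-tail t-Critical for 9 df.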

5) Confidence Interval problem for two population DIFFERENCES (not means or proportions) that have the SAME NUMBER OF DATA VALUES (n1 = n2).

Using the same traffic count data and statistics as above, the formula for the margin of error, E, is E = t-critical * s_d/sqrt(n)

We will assume a significance level of α = 5% (95% confidence level) with n = 10 data points, hence df = (10 - 1) = 9, so t-critical = ______ ?

What are the MEAN ______ and SD _____ for the "differences"?

E = _______ * _______ / ______ = __________ (fill in the different values and the calculated result) .

Our Confidence Interval is (d-bar - E) < μ_d < (d-bar + E)

Real-world Conclusion: Which Friday has the larger mean traffic count, and by how much?

(IF this or any CI includes zero there is NO difference)
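The margin of error and interval for the differences can be sketched the same way. This is a minimal example, assuming a two-sided 95% interval; the t-critical for 9 df is hard-coded from a standard t-table since the standard library has no t-distribution.

```python
from math import sqrt
from statistics import mean, stdev

# Same traffic counts as the table above: the 6th (x1) and the 13th (x2)
sixth = [139246, 138012, 137055, 140675, 123552,
         121139, 128293, 124631, 124609, 119584]
thirteenth = [138548, 132908, 136018, 131843, 121641,
              120723, 125532, 120249, 125770, 116263]

d = [a - b for a, b in zip(sixth, thirteenth)]
n = len(d)
d_bar = mean(d)
s_d = stdev(d)

# From a standard t-table: df = 9, two-tail area 0.05
T_CRIT_TWO_TAIL = 2.262

# E = t-critical * s_d / sqrt(n)
margin = T_CRIT_TWO_TAIL * s_d / sqrt(n)
ci_low, ci_high = d_bar - margin, d_bar + margin

print(f"E = {T_CRIT_TWO_TAIL} * {s_d:.1f} / {sqrt(n):.3f} = {margin:.1f}")
print(f"95% CI for the mean difference: ({ci_low:.1f}, {ci_high:.1f})")
print(f"interval includes zero: {ci_low <= 0 <= ci_high}")
```

With this data E is about 2052.6 and the interval runs from roughly 677 to 4783, so it does not include zero.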

(5) (made-up data) Hypothesis Test problem comparing two population MEANS. Make sure to show your setup for the t-Test, df, and how you found the t-Critical. Also, use software to find the p-value (probability): is it larger or smaller than α? Does this conclusion match the one based on the t-Test versus t-Critical? (It must.) For the p-value, make sure you know which area (right or left) you need to use (greater than or less than). (If NO Confidence Level or α is provided, assume 95% and 5%, respectively.)

Median Income for Males

$42,951	$52,379	$42,544	$37,488	$49,281	$50,987	$60,705
$50,411	$66,760	$40,951	$43,902	$45,494	$41,528	$50,746
$45,183	$43,624	$43,993	$41,612	$46,313	$43,944	$56,708
$60,264	$50,053	$50,580	$40,202	$43,146	$41,635	$42,182
$41,803	$53,033	$60,568	$41,037	$50,388	$41,950	$44,660

Median Income for Females

$31,862	$40,550	$36,048	$30,752	$41,817	$40,236	$47,476
$60,332	$33,823	$35,438	$37,242	$31,238	$39,150	$34,023
$33,269	$32,684	$31,844	$34,599	$48,748	$46,185	$36,931
$29,548	$33,865	$31,067	$33,424	$35,484	$41,021	$47,155
$42,113	$33,459	$32,462	$35,746	$31,274	$36,027	$37,089

t-Test = [(x-bar1 - x-bar2) - (μ1 - μ2)] / sqrt( s₁²/n1 + s₂²/n2 )
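The t-Test formula above can be sketched in Python for the two income tables. The helper name `two_sample_t` is hypothetical, and the hypothesized difference (μ1 - μ2) is assumed to be 0. Besides the simple df, it also computes the Welch-Satterthwaite df, which is presumably the "hairy" formula mentioned earlier; that identification is an assumption. The t-Critical and p-value are left to a t-table or software.

```python
from math import sqrt
from statistics import mean, variance

# Median incomes from the two tables, read row by row
males = [
    42951, 52379, 42544, 37488, 49281, 50987, 60705,
    50411, 66760, 40951, 43902, 45494, 41528, 50746,
    45183, 43624, 43993, 41612, 46313, 43944, 56708,
    60264, 50053, 50580, 40202, 43146, 41635, 42182,
    41803, 53033, 60568, 41037, 50388, 41950, 44660,
]
females = [
    31862, 40550, 36048, 30752, 41817, 40236, 47476,
    60332, 33823, 35438, 37242, 31238, 39150, 34023,
    33269, 32684, 31844, 34599, 48748, 46185, 36931,
    29548, 33865, 31067, 33424, 35484, 41021, 47155,
    42113, 33459, 32462, 35746, 31274, 36027, 37089,
]


def two_sample_t(x, y):
    """Return (t, simple df, Welch df) for Ho: mu1 - mu2 = 0."""
    n1, n2 = len(x), len(y)
    v1 = variance(x) / n1   # s1^2 / n1
    v2 = variance(y) / n2   # s2^2 / n2
    t = (mean(x) - mean(y)) / sqrt(v1 + v2)
    df_simple = (n1 - 1) + (n2 - 1)
    # Welch-Satterthwaite approximation (assumed to be the "hairy" formula)
    df_welch = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df_simple, df_welch


t_test, df_simple, df_welch = two_sample_t(males, females)
print(f"x-bar1 = {mean(males):.1f}, x-bar2 = {mean(females):.1f}")
print(f"t-Test = {t_test:.2f}")
print(f"simple df = {df_simple}, Welch df = {df_welch:.1f}")
```

The Welch df always falls between the smaller of (n1 - 1) and (n2 - 1) and the simple df, so the simple formula gives the larger df and a slightly smaller t-Critical.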

(If you want to copy the above Tables into Excel, highlight the entire Table, right-click and "COPY"; then, before you paste, highlight the same number of columns (7 here) across Excel and hit "paste". If you only highlight one column, it may not do anything or it may put all the data vertically in that column.)

6) Another Hypothesis Test problem comparing two population MEANS for the two-medication "time to take effect" example. Work it just as you did the prior problem (#5).

t-Test = [(x-bar1 - x-bar2) - (μ1 - μ2)] / sqrt( s₁²/n1 + s₂²/n2 )

To find the t-Critical we need the degrees of freedom (df): df = (n1 - 1) + (n2 - 1)

Also, find the p-value

Bottom line: Is there a significant difference between the mean "time to take effect" for these two medications and, if so, which medication is significantly faster?

MOVING ON TO ANOVA (Analysis of Variance)

We have just finished comparing the means of TWO data sets to determine if there is a significant difference between them. We have used the z-Test (proportions) and the t-Test to do this. We FINALLY find a use for the VARIANCE!

We haven't used the variance very much in this course, just its square root, the standard deviation.

There is a LOT of squaring (and maybe swearing) in ANOVA calculations and if you recall from your Week 1 exercise in calculating variance, we squared distances to make any negative ones, positive. This is what we are essentially doing here but in multiple dimensions.

We are basically comparing the Means and Variances of multiple populations (three or more) to see if ANY are significantly different from the others. (see the Additional Guidance attachment for assistance.) It will NOT tell us which population is different, however.

7) (made-up data) WORK THIS PROBLEM:

Plant 1	Plant 2	Plant 3	Plant 4	Plant 5
1.2	16.4	12.1	11.5	24
10.1	-6	9.7	10.2	-3.7
-2	-11.6	7.4	3.8	8.2
1.5	-1.3	-2.1	8.3	9.2
-3	4	10.1	6.6	-9.3
-0.7	17	4.7	10.2	8
3.2	3.8	4.6	8.8	15.8
2.7	4.3	3.9	2.7	22.3
-3.2	10.4	3.6	5.1	3.1
-1.7	4.2	9.6	11.2	16.8
2.4	8.5	9.8	5.9	11.3
0.3	6.3	6.5	13	12.3
3.5	9	5.7	6.8	16.9
-0.8	7.1	5.1	14.5
19.4	4.3	3.4	5.2
2.8	19.7	-0.8	7.3
13	3	-3.9	7.1
42.7	7.6	0.9	3.4
1.4	70.2	1.5	0.7
3	8.5
2.4	6
1.3	2.9

What is the NULL Hypothesis, Ho:

What is the Alternate Hypothesis, Ha:

In the formula: "k" is simply the number of groups and capital "N" is the TOTAL of ALL the data values involved (all groups together) and small "n" is the AVERAGE number of data points per group = (N/k)

The 5 groups involved have DIFFERENT numbers of samples (n), but they are relatively close, so I used the AVERAGE "n" and got an F-Test close to the text value. (This is NOT technically the official way to calculate F, but the official way is quite complex for an introductory overview.)

	Plant 1	Plant 2	Plant 3	Plant 4	Plant 5
n
Mean
Variance

a) Fill in the Table above ^
b) Determine k = _____
c) Calculate N = ______
d) Calculate the average n = ____
e) Calculate the VARIANCE of the 5 sample means = ______ * average-n = _______ = _________
f) Calculate the AVERAGE VARIANCE of the 5 groups' variances = _____________

F-Test = (e) / (f) = _________ (This is by using the average n)

(The Text software gets an F-Test of 1.16)
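The average-n shortcut described above can be sketched in Python next to the standard one-way ANOVA calculation for unequal group sizes. It assumes that the short rows of the table fill from the left (so Plant 1 and Plant 2 have 22 values, Plant 3 and Plant 4 have 19, and Plant 5 has 13), and all variable names are illustrative.

```python
from statistics import mean, variance

# Assumption: rows with fewer values fill the leftmost columns
plants = {
    "Plant 1": [1.2, 10.1, -2, 1.5, -3, -0.7, 3.2, 2.7, -3.2, -1.7, 2.4,
                0.3, 3.5, -0.8, 19.4, 2.8, 13, 42.7, 1.4, 3, 2.4, 1.3],
    "Plant 2": [16.4, -6, -11.6, -1.3, 4, 17, 3.8, 4.3, 10.4, 4.2, 8.5,
                6.3, 9, 7.1, 4.3, 19.7, 3, 7.6, 70.2, 8.5, 6, 2.9],
    "Plant 3": [12.1, 9.7, 7.4, -2.1, 10.1, 4.7, 4.6, 3.9, 3.6, 9.6,
                9.8, 6.5, 5.7, 5.1, 3.4, -0.8, -3.9, 0.9, 1.5],
    "Plant 4": [11.5, 10.2, 3.8, 8.3, 6.6, 10.2, 8.8, 2.7, 5.1, 11.2,
                5.9, 13, 6.8, 14.5, 5.2, 7.3, 7.1, 3.4, 0.7],
    "Plant 5": [24, -3.7, 8.2, 9.2, -9.3, 8, 15.8, 22.3, 3.1, 16.8,
                11.3, 12.3, 16.9],
}

groups = list(plants.values())
k = len(groups)                    # number of groups
N = sum(len(g) for g in groups)    # total number of data values
n_avg = N / k                      # average n per group

group_means = [mean(g) for g in groups]
group_vars = [variance(g) for g in groups]   # sample variances

# Shortcut: (variance of the sample means * average n) / average variance
f_avg = n_avg * variance(group_means) / mean(group_vars)

# Standard one-way ANOVA for unequal group sizes
grand_mean = sum(sum(g) for g in groups) / N
ss_between = sum(len(g) * (m - grand_mean) ** 2
                 for g, m in zip(groups, group_means))
ss_within = sum((len(g) - 1) * v for g, v in zip(groups, group_vars))
df1, df2 = k - 1, N - k
f_exact = (ss_between / df1) / (ss_within / df2)

for name, g, m, v in zip(plants, groups, group_means, group_vars):
    print(f"{name}: n = {len(g)}, mean = {m:.3f}, variance = {v:.3f}")
print(f"k = {k}, N = {N}, average n = {n_avg}")
print(f"F (average-n shortcut) = {f_avg:.2f}")
print(f"F (standard) = {f_exact:.2f}, df1 = {df1}, df2 = {df2}")
```

Under the left-fill assumption the standard calculation gives about 1.16, in line with the text software's value, while the average-n shortcut gives about 1.32.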

To find the F-Critical value:

Calculate the first degrees of freedom: df1 = k - 1 = ____
Calculate the second degrees of freedom: df2 = N - k = ____
Find the F-Critical from the F-Table below for these degrees of freedom (df) and α = 1%: ___________
Compare your F-Test to the F-Critical: do you reject or fail to reject Ho? ____________

You can use this website: https://stattrek.com/online-calculator/f-distribution.aspx to find the p-value (probability) for a given F-Test statistic. What is the p-value for the F-Test you calculated? ___________

Comparing this to α = 0.01 or 1%, would you draw the same Reject/Fail to reject Ho conclusion?

8) COIN TOSS RESULTS: We had ____ sets of 30-flipped coin sets. Some used ONE coin and some 30 different coins. Is there a significant difference in those results, i.e., are any numbers of HEADS significantly different in the one-coin group or in the 30 coin group? Keep in mind that even IF there is a significant difference, we won't be able to tell which data set(s) is or are different. Here is a Table of those "HEADS" results (number out of 30 tosses) and the sampling protocol used (one coin or 30 coins)

Now, IF you want a video on how to REALLY work ANOVA for different sample sizes, check one of these (or both) out:

https://www.youtube.com/watch?v=0XsovsSnRuw

https://www.youtube.com/watch?v=q48uKU_KWas

F critical values (columns: degrees of freedom in the numerator; rows: degrees of freedom in the denominator; p = probability to the right of the critical value)

Denominator df	p	1	2	3	4	5	6	7	8	9
18	.100	3.01	2.62	2.42	2.29	2.20	2.13	2.08	2.04	2.00
	.050	4.41	3.55	3.16	2.93	2.77	2.66	2.58	2.51	2.46
	.025	5.98	4.56	3.95	3.61	3.38	3.22	3.10	3.01	2.93
	.010	8.29	6.01	5.09	4.58	4.25	4.01	3.84	3.71	3.60
	.001	15.38	10.39	8.49	7.46	6.81	6.35	6.02	5.76	5.56
19	.100	2.99	2.61	2.40	2.27	2.18	2.11	2.06	2.02	1.98
	.050	4.38	3.52	3.13	2.90	2.74	2.63	2.54	2.48	2.42
	.025	5.92	4.51	3.90	3.56	3.33	3.17	3.05	2.96	2.88
	.010	8.18	5.93	5.01	4.50	4.17	3.94	3.77	3.63	3.52
	.001	15.08	10.16	8.28	7.27	6.62	6.18	5.85	5.59	5.39
20	.100	2.97	2.59	2.38	2.25	2.16	2.09	2.04	2.00	1.96
	.050	4.35	3.49	3.10	2.87	2.71	2.60	2.51	2.45	2.39
	.025	5.87	4.46	3.86	3.51	3.29	3.13	3.01	2.91	2.84
	.010	8.10	5.85	4.94	4.43	4.10	3.87	3.70	3.56	3.46
	.001	14.82	9.95	8.10	7.10	6.46	6.02	5.69	5.44	5.24
21	.100	2.96	2.57	2.36	2.23	2.14	2.08	2.02	1.98	1.95
	.050	4.32	3.47	3.07	2.84	2.68	2.57	2.49	2.42	2.37
	.025	5.83	4.42	3.82	3.48	3.25	3.09	2.97	2.87	2.80
	.010	8.02	5.78	4.87	4.37	4.04	3.81	3.64	3.51	3.40
	.001	14.59	9.77	7.94	6.95	6.32	5.88	5.56	5.31	5.11
22	.100	2.95	2.56	2.35	2.22	2.13	2.06	2.01	1.97	1.93
	.050	4.30	3.44	3.05	2.82	2.66	2.55	2.46	2.40	2.34
	.025	5.79	4.38	3.78	3.44	3.22	3.05	2.93	2.84	2.76
	.010	7.95	5.72	4.82	4.31	3.99	3.76	3.59	3.45	3.35
	.001	14.38	9.61	7.80	6.81	6.19	5.76	5.44	5.19	4.99
23	.100	2.94	2.55	2.34	2.21	2.11	2.05	1.99	1.95	1.92
	.050	4.28	3.42	3.03	2.80	2.64	2.53	2.44	2.37	2.32
	.025	5.75	4.35	3.75	3.41	3.18	3.02	2.90	2.81	2.73
	.010	7.88	5.66	4.76	4.26	3.94	3.71	3.54	3.41	3.30
	.001	14.20	9.47	7.67	6.70	6.08	5.65	5.33	5.09	4.89
24	.100	2.93	2.54	2.33	2.19	2.10	2.04	1.98	1.94	1.91
	.050	4.26	3.40	3.01	2.78	2.62	2.51	2.42	2.36	2.30
	.025	5.72	4.32	3.72	3.38	3.15	2.99	2.87	2.78	2.70
	.010	7.82	5.61	4.72	4.22	3.90	3.67	3.50	3.36	3.26
	.001	14.03	9.34	7.55	6.59	5.98	5.55	5.23	4.99	4.80
25	.100	2.92	2.53	2.32	2.18	2.09	2.02	1.97	1.93	1.89
	.050	4.24	3.39	2.99	2.76	2.60	2.49	2.40	2.34	2.28
	.025	5.69	4.29	3.69	3.35	3.13	2.97	2.85	2.75	2.68
	.010	7.77	5.57	4.68	4.18	3.85	3.63	3.46	3.32	3.22
	.001	13.88	9.22	7.45	6.49	5.89	5.46	5.15	4.91	4.71
26	.100	2.91	2.52	2.31	2.17	2.08	2.01	1.96	1.92	1.88