In this lab, we will implement some of the statistical methods discussed in Chapter 15. In particular, we will carry out the chi-square test of association between two categorical variables. Download the worksheets for this lab, ability.mtw and MTH.mtw, from LON-CAPA to your computer.

A. Samples of 200 13-year-old girls and 200 13-year-old boys were asked how they perceive their own math ability; these responses were categorized into five categories: hopeless, below average, average, above average, or superior. Note that these are self-ratings of the students' perceptions, and these ratings do not necessarily reflect the students' true mathematics ability.

Use File > Open Worksheet to load the data from ability.mtw. Here there are r = 2 populations (rows) and c = 5 categories of response (columns). We will use these data to test the hypotheses

$H_0$: girls and boys have the same distribution of their perceived math ability versus $H_a$: not so.

Place the observed cell frequencies given in the worksheet in the table below. Calculate the row and column totals and fill them in. Calculate the expected cell frequencies using the rule

$E = \dfrac{(\text{Row total}) \times (\text{Column total})}{\text{Grand total}}$

and fill them in by placing the values in parentheses in each cell.

                          Perceived Math Ability
        | Hopeless | Below average | Average | Above average | Superior | Total
Girls   |          |               |         |               |          |
Boys    |          |               |         |               |          |
Total   |          |               |         |               |          |

1. For the cell (Girls, Hopeless), what is the expected count? ____
2. For the cell (Girls, Hopeless), what is the observed count? ____
3. For the cell (Girls, Hopeless), what is the contribution from this cell to the $\chi^2$ test statistic?
   $\text{Contribution} = \dfrac{(\text{Observed} - \text{Expected})^2}{\text{Expected}} =$ ____

We will now use Minitab to perform the hypothesis test. Use Statistics > Tables > Cross Tabulation and Chi-Square. In the drop-down menu, choose the option Summarized data in a two-way table. Select columns Hopeless-Superior (c2-c6) as the columns containing the table.
Click on the Display tab, and check the following options: "Chi-square test for association" and "Expected cell counts." The results are:

4. Test statistic, Pearson $\chi^2$ = ____
5. Degrees of freedom = ____
6. P-value = ____
7. What decision is reached at level 0.05? ____
8. Interpret your decision: Do boys and girls have the same distribution of their perceived math ability? ____

Check: Minitab's output for the hypothesis test includes the observed and expected counts for each cell, as well as the contribution from each cell to the test statistic $\chi^2$. Do your answers to questions 1-3 agree with Minitab? ____

B. We will now look at some hypothetical (not real) data on the performance of males and females on a mathematics test. Use File > Open Worksheet to load the data from MTH.mtw. Unlike part A above, you are given raw data (not a table summary). In the worksheet, the student's sex is labeled F for female and M for male. The mathematics test scores were categorized as mathematics levels 1-6. The levels are ordered from 1 being the lowest (introductory) to 6 being the highest (advanced). As an example, think of the MSU mathematics placement test, where scores in certain ranges correspond to students placing into courses of various levels: algebra, precalculus, calculus, etc. For example, the first row of data corresponds to a female who had a math test score of level 1.

To compare the performance of males and females on this mathematics test, consider the hypotheses

$H_0$: males and females have the same distribution of their math score level versus $H_a$: not so.

To carry out the test of hypotheses, use Statistics > Tables > Cross Tabulation and Chi-Square. In the drop-down menu, use "Raw data" (categorical variables). Select "Gender" for rows and "MTH_level" for columns. Then click on the Display tab and check the following: "Chi-square test for association," "Expected cell counts," and "Each cell's contribution to chi-square." Use Minitab's output to answer the questions:

9. The value of the test statistic (Pearson chi-square) is ____
10. Degrees of freedom is ____
11. The p-value is ____
12. What decision is reached at the 0.05 level of significance? ____
13. Interpret your decision. Is there a difference between males and females in the distribution of their math score level? ____

When the chi-square test detects a difference in performance, the test does not formally point out where the difference comes from: for example, from the lowest, middle, or highest levels of performance. An indication of the sources of the difference can be found by examining the contributions from each cell to the chi-square test statistic given in Minitab's output. Examine the last row of each cell in the output to see that the largest values of $\dfrac{(\text{Observed} - \text{Expected})^2}{\text{Expected}}$ correspond to mathematics level 1.

We can do an additional test to determine if the proportions of males and females who scored at level 1 differ. Column c10 is labeled "MTH_modified". It has value 1 if a person scored at level 1 on the mathematics test, and value 0 otherwise. Repeat the chi-square analysis for this modified column by using Stat > Tables > Cross Tabulation and Chi-Square. Select "Gender" for rows and "MTH_modified" for columns. Click on the Display tab and check the appropriate boxes to display the chi-square test for association, the expected cell counts, and each cell's contribution to the chi-square statistic. Use Minitab's output to answer the questions:

14. The value of the test statistic (Pearson chi-square) is ____
15. Degrees of freedom is ____
16. P-value is ____

When each categorical variable has 2 levels, so that the resulting contingency table is 2 by 2, the comparison of the proportions of males and females who score at level 1 can also be done using a 2-sample z-test for proportions (Section 12.3 of the text).
In this case the hypotheses are

$H_0: p_1 - p_2 = 0$ versus $H_a: p_1 - p_2 \neq 0$.

The z test statistic squared equals the chi-square test statistic, and the p-values found using the two approaches are the same, because the chi-square distribution with one degree of freedom (see question #15) is the distribution of the square of a standard normal variable Z. To see this, use the data in the last Minitab output to get:

The sample proportion of females who scored at level 1: $\hat{p}_1 = 63/415 = 0.1518$

17. The sample proportion of males who scored at level 1: $\hat{p}_2 =$ ____ (Please keep 4 decimals.)
18. The pooled estimate of the common proportion of all students who scored at level 1 (see p. 487 of the text):
    $\hat{p} = \dfrac{n_1 \hat{p}_1 + n_2 \hat{p}_2}{n_1 + n_2} = \dfrac{X_1 + X_2}{n_1 + n_2} =$ ____
19. The z test statistic:
    $z = \dfrac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1 - \hat{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} =$ ____ (Please keep at least 2 decimals.)
20. Use Statistics > Probability Distributions > Cumulative Distribution Function with the "Normal" distribution selected. Enter the negative value of the test statistic calculated above (the point symmetric to it about zero) as the "Value", and click OK. The output will be the area in the left tail. Since the alternative is two-tailed, the p-value is ____

Compare your results: is the squared answer to #19 the same as the answer to #14 (up to rounding error)? Are the answers to #16 and #20 the same (up to rounding error)?
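
The comparison in questions #14-#20 can also be checked outside Minitab. The following is a minimal Python sketch, not part of the lab procedure, that computes the pooled two-sample z statistic and the Pearson chi-square statistic for the same 2 by 2 table and confirms that they agree. The female counts (63 out of 415) come from this handout; the male counts are made-up placeholders and must be replaced with the values from your own Minitab output. The function names are illustrative only.

```python
import math


def two_proportion_z(x1, n1, x2, n2):
    """Pooled two-sample z-test for H0: p1 - p2 = 0 (two-tailed)."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)  # pooled estimate of the common proportion
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1_hat - p2_hat) / se
    # Left-tail area below -|z| under the standard normal curve.
    left_tail = 0.5 * math.erfc(abs(z) / math.sqrt(2))
    return z, 2 * left_tail  # two-tailed p-value


def pearson_chi_square_2x2(x1, n1, x2, n2):
    """Pearson chi-square test of association for the same 2 by 2 table."""
    observed = [[x1, n1 - x1], [x2, n2 - x2]]
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            # E = (Row total)(Column total) / (Grand total)
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (observed[i][j] - expected) ** 2 / expected
    # Upper-tail area of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Females: 63 of 415 at level 1 (from the handout).
# Males: 40 of 400 is a HYPOTHETICAL placeholder, not the lab data.
z, p_z = two_proportion_z(63, 415, 40, 400)
chi2, p_chi2 = pearson_chi_square_2x2(63, 415, 40, 400)

print(f"z = {z:.4f}, z squared = {z * z:.4f}, chi-square = {chi2:.4f}")
print(f"p-value (z-test) = {p_z:.6f}, p-value (chi-square) = {p_chi2:.6f}")
```

With any valid counts, the printed z squared and chi-square values should match up to floating-point error, and so should the two p-values, which mirrors the comparison asked for at the end of part B.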