Problem Set 4: Using a chi-square test to identify differences in survival rates between different populations

Early on the morning of 15 April 1912, the ocean liner RMS Titanic sank in the North Atlantic on the ship's first voyage. Tragically, approximately 68% of the ship's passengers and crew perished. The table below describes the frequencies of those who survived the catastrophe and those who perished, separated according to passenger class (1st, 2nd, and 3rd class, plus crew).

Observed counts    Perished    Survived    Row totals
1st class               122         203           325
2nd class               167         118           285
3rd class               528         178           706
Crew                    673         212           885
Column totals          1490         711          2201

The overall survival proportion was approximately 32% (711/2201), but some groups fared better than others: 62% of the 1st-class passengers and 41% of the 2nd-class passengers survived, in contrast to 25% of the 3rd-class passengers and 24% of the ship's crew.

Given this pattern in the sample, it is reasonable to wonder if different groups had unequal access to the means of survival. Conversely, we might wonder if every group was exposed to the same probability of survival, in which case the pattern of variability we see in the sample would be the result of random sampling error alone.

We can treat the probability of survival as the focus of a null hypothesis. If survival is independent of passenger class, then we can set our null value to the overall survival proportion (approximately 0.32) and state our hypothesis, H0: every group has the same probability of survival, which we can test with a chi-square test for independence.

The first step is to calculate the expected counts of those who would have perished and those who would have survived for each group if the null hypothesis were true. The following table gives expected frequencies for each combination of levels of the passenger class variable and the survival outcome variable. These were calculated by multiplying the corresponding row total and column total in the above table, then dividing by the total sample size, 2201.
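The expected-count rule described above (row total times column total, divided by the total sample size) can be checked with a short script. This is only a minimal sketch in Python: the dictionary `observed` and the helper `expected_counts` are hypothetical names introduced for illustration, not part of the problem set, and the counts are copied from the observed table.

```python
# Observed counts from the table above, as (perished, survived) per group.
# The names `observed` and `expected_counts` are illustrative assumptions.
observed = {
    "1st class": (122, 203),
    "2nd class": (167, 118),
    "3rd class": (528, 178),
    "Crew": (673, 212),
}


def expected_counts(table):
    """Expected counts under independence: row total * column total / n."""
    row_totals = {group: sum(counts) for group, counts in table.items()}
    n_cols = len(next(iter(table.values())))
    col_totals = [sum(counts[j] for counts in table.values()) for j in range(n_cols)]
    n = sum(col_totals)  # total sample size (2201 for this table)
    return {
        group: tuple(row_totals[group] * col_total / n for col_total in col_totals)
        for group in table
    }


expected = expected_counts(observed)

# Print expected perished and survived counts for each group.
for group, (perished, survived) in expected.items():
    print(f"{group:<10} {perished:8.2f} {survived:8.2f}")
```

Because each expected row is the row total split in the overall 1490 : 711 proportion, the expected counts in every row add back up to that row's observed total, which is a quick way to catch arithmetic slips.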