Question

Posted on May 17, 2024

In this assignment, you will work with 4 variables, each found in the file gss2.rdata, which can be downloaded from the course Brightspace page. They are:

lfstatus, coded
   1. employed full time
   2. employed part time
   3. temporarily not working
   4. laid off or unemployed
   5. retired
   6. at school
   7. keeping house

marital, coded
   1. married
   2. widowed
   3. divorced
   4. separated
   5. never married

classicl, the respondent's rating of classical music, coded
   1. like very much
   2. like
   3. mixed feelings
   4. dislike
   5. dislike very much

country, the respondent's rating of country and western, coded
   1. like very much
   2. like
   3. mixed feelings
   4. dislike
   5. dislike very much

Computing Section (26 points)

Using commands illustrated in warm2.r or used earlier in assignment 1, you are asked:

1. To obtain graphs (histograms, barplots, boxplots, as appropriate) for the four variables identified above.

2. To obtain appropriate measures of central tendency and dispersion for each.

3. To obtain a crosstabulation showing labour force status (lfstatus) by marital status (marital). As shown in the warmup, you can get it through the CrossTable command, which is contained in the library gmodels.

4. To get the two types of association plot shown in the warmup for the same variables (lfstatus and marital).

5. To obtain a crosstabulation between liking for classical music (classicl) and liking for country music (country).

6. To get the two types of association plot for these two variables.

Written Section (89 points)

5. With the code you have, you will not get lambda for the table showing labour force status by marital status, so you are asked to calculate it, showing your calculations, and then to explain what it means. Examples are in the text. (8 points)

6. Explain the output from the chi-square test below the crosstabulation of lfstatus by marital. (Just saying that the results are 'statistically significant' or not will be worth no points, since you could reach this conclusion by a lucky guess.) Explain how the value of chi-square was arrived at, what 'df' means, and what the p-value tells us. (9 points)

7. For these variables, explain what the two association plots tell us. (8 points)

8. For 'classical' and 'country', consider the association plot displaying the standardized (Pearson) residuals, and explain what the residuals tell us. In your answer, refer to any patterns you see, and to the meaning of the residuals. (8 points)

9. You will not get a value for gamma or d for the table showing 'country' by 'classical' from the code you have in the example. So that you won't need to calculate them for so large a table, or use other code, here they are:

   gamma = -.142, s.e. = .030, t = -4.712, p = .000
   d = -.110, s.e. = .022, t = -4.712, p = .000

   In your writeup, you should explain why gamma and d are applicable to this table, whereas lambda was used for the first one. Next, you should explain what gamma tells us about the link between the two variables. Is there a case for using Somers' d instead? Why do you say this? For this table, what does chi-square tell us? (18 points)

10. Is gamma statistically significant? How do you know? What does the phrase 'statistically significant' mean? The confidence interval for gamma in these data goes approximately from -.201 to -.083. What does this tell us? (10 points)

11. Make and hand in a table, in presentation form, as shown in the text (pages 174-77), based on the crosstabulation of classicl and country. (8 points)
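
The hand calculations asked for in the written section can be checked with a few lines of arithmetic. The course's own work is done in R; what follows is only a minimal sketch in Python, not the method from warm2.r or the text. The function names goodman_kruskal_lambda and normal_ci are made up for illustration, the 2x2 counts are hypothetical (they are not from gss2.rdata), and the interval assumes the usual normal approximation with a multiplier of 1.96. Only the gamma estimate and its standard error are taken from the assignment.

```python
def goodman_kruskal_lambda(table):
    """Lambda for predicting the row variable from the column variable.

    `table` is a list of rows of counts: rows are categories of the
    dependent variable, columns are categories of the independent one.
    """
    n = sum(sum(row) for row in table)
    # E1: prediction errors when everyone is assigned the modal row category.
    e1 = n - max(sum(row) for row in table)
    # E2: prediction errors when the modal row category is used within each column.
    e2 = n - sum(max(col) for col in zip(*table))
    # Proportional reduction in error.
    return (e1 - e2) / e1


def normal_ci(estimate, se, z=1.96):
    """Approximate 95% confidence interval: estimate +/- z * standard error."""
    return (estimate - z * se, estimate + z * se)


# HYPOTHETICAL counts for illustration only (not the lfstatus by marital table).
toy_table = [
    [30, 10],
    [10, 50],
]
lam = goodman_kruskal_lambda(toy_table)
print(f"lambda for the toy table: {lam:.3f}")

# Values reported in the assignment for gamma.
gamma, gamma_se = -0.142, 0.030
lower, upper = normal_ci(gamma, gamma_se)
print(f"approximate 95% CI for gamma: {lower:.3f} to {upper:.3f}")

# Ratio of estimate to standard error, computed from the rounded figures.
ratio = gamma / gamma_se
print(f"gamma / s.e. = {ratio:.3f}")
```

For the toy table, guessing the modal row for everyone gives 40 errors and guessing within each column gives 20, so lambda is (40 - 20) / 40 = 0.5. For gamma, -.142 plus or minus 1.96 x .030 reproduces the interval quoted in the assignment, roughly -.201 to -.083. The ratio computed from the rounded figures is about -4.73, close to the reported t of -4.712; the small gap is presumably due to rounding in the published standard error.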