Question

1 Approved Answer

Posted on Oct 13, 2024

Print by: Mary Stela Gallegos
HS312: Epidemiology and Biostatistics II 1504C / HS312 Unit 6 Assignment

Drawing Conclusions

Suppose that study A and study B both involve comparing two groups. Study A is an observational study, and the p-value turns out to be [not shown]. Study B is a randomized experiment, and the p-value turns out to be [not shown].

a. Which study provides stronger evidence that the observed results are unlikely to have occurred by chance alone if there really were no difference between the groups?
Study B because it is a randomized experiment.
Study B because it has the smaller p-value.
Study A because it has the larger p-value.
Study A because it is an observational study.

b. Which study provides stronger evidence of a cause-and-effect relationship between the variables?
Study B because it is a randomized experiment.
Study B because it has a smaller p-value.
Study A because it has a larger p-value.
Study A because it is an observational study.

Editorial Styles

USA Today is known as the "nation's newspaper." It strives to reach a very broad readership and is, therefore, reputed to be written at a fairly low readability level. On the other hand, the Washington Post generally has a reputation as a more serious newspaper aiming for a more intellectual readership. To assess whether quantitative data can reveal any evidence to support this reputation, consider the following data on sentence lengths (measured by number of words in a sentence) for the lead editorials from January 22, 2007, in USA Today and in the Washington Post: [table not shown]

a. Using the five-number summaries provided and the table of sentence lengths, comment on key features of the distributions. Consider two numbers to be roughly equal if the difference is less than two.
The lower quartile of USA Today is ____ the Washington Post's lower quartile.
The median of USA Today is ____ the Washington Post's median.
The upper quartile of USA Today is ____ the Washington Post's upper quartile.
The mean of USA Today is ____ the Washington Post's mean.
USA Today has ____ the Washington Post.

b. Are the technical conditions for a two-sample t-test satisfied?
Both conditions are satisfied.
Neither condition is satisfied.
The sample is not random.
The sample is not large enough.

c. Conduct a two-sample t-test to assess whether the sample data support the contention that the sentences in USA Today are shorter than those in the Washington Post. Report all aspects of the test and summarize your conclusion. Round your test statistic to two decimal places. Use the t-distribution table to find a range for the p-value. If there is no lower limit on the p-value, enter [not shown] for the lower limit. If there is no upper limit on the p-value, enter [not shown] for the upper limit. Note that group [not shown] refers to USA Today and group [not shown] refers to Washington Post.

The test statistic is *1. The p-value is between *2 and *3. Based on the p-value, we ____.

Answer *1: the tolerance is +/-2%
Answer *2: exact number, no tolerance
Answer *3: exact number, no tolerance

d. Select all of the quantitative variables that could have been measured to compare the editorial styles of the two newspapers.
Font color
Number of pronouns per passage
Average number of syllables per word
Subject of article
Percentage of distinct words
Location of publisher

Mice Cooling

Medical examiners can use the temperature of a dead body at a murder scene to estimate the time of death. But can a clever murderer disguise the time of death by reheating the victim's body? A scientist actually investigated this issue on mice. Hart (1951) used mice as the experimental units. He sacrificed each mouse and then measured the cooling constant of its body. Then he reheated the mouse's body and measured its cooling constant in that reheated state. The results are shown in the following table: [table not shown]

a.
Do these data call for a matched-pairs analysis?

b. Produce numerical summaries for investigating the question of whether cooling constants for reheated mice are similar to those of freshly killed mice. Round your answers to one decimal place.
*1 *2

c. Conduct the appropriate test of whether the data suggest a significant difference in average cooling constants between freshly killed and reheated mice. Include a check of technical conditions, and summarize your conclusion. Round your test statistic to two decimal places. Use the t-distribution table to find a range for the p-value. If there is no lower limit on the p-value, enter [not shown] for the lower limit. If there is no upper limit on the p-value, enter [not shown] for the upper limit. Note that group [not shown] refers to freshly killed mice and group [not shown] refers to reheated mice.

Technical Conditions: ____
The test statistic is *3. The p-value is between *4 and *5.
We ____ at the [not shown] significance level. We have ____ that the cooling constants of freshly killed mice and reheated mice differ.

d. Construct a [not shown] confidence interval for estimating the population mean difference in cooling constants. Round your answers to two decimal places.
The confidence interval is (*6, *7).

Answer *1: the absolute tolerance is +/-0.01
Answer *2: the absolute tolerance is +/-0.01
Answer *3: the absolute tolerance is +/-0.01
Answer *4: exact number, no tolerance
Answer *5: exact number, no tolerance
Answer *6: the absolute tolerance is +/-0.05
Answer *7: the absolute tolerance is +/-0.05

HS312 Unit 8 Assignments

Life Expectancy

Think about a country's life expectancy as a response variable, and consider three explanatory variables: fertility rate (number of children per woman), gross domestic product (GDP, an indication of how vibrant a country's economy is), and number of Internet users per [not shown] people.

a.
For each of these three explanatory variables, indicate whether you expect the association with life expectancy to be positive, negative, or virtually no association.
Life expectancy vs. fertility rate: ____ association
Life expectancy vs. GDP: ____ association
Life expectancy vs. Internet users per [not shown] people: ____ association

b. Make a conjecture about which of these three explanatory variables will have the strongest association with life expectancy.
____ will have the strongest association with life expectancy.

Muscle Fatigue

Do women or men experience fatigue more quickly while exercising? To investigate this issue, researchers recruited healthy young adults to participate in a study (Hunter et al., 2004). The researchers first measured the strength of each person by the torque exerted at the wrist during a maximal voluntary contraction. Then they matched up each of the men in the study with a woman of comparable strength (within [not shown] of the maximal torque exerted). Next, they asked each subject to complete a prescribed series of exercises with their elbow flexor muscles and elbow extensor muscles until they lacked the strength to continue. Researchers recorded how long it took before failure at this exercise task for each subject.

a. Calculate the correlation coefficient between time until muscle fatigue for men and time until muscle fatigue for women. Round your answer to three decimal places.
*1

b. Comment on what this correlation coefficient suggests about whether men and women of similar strength tend to have similar times until muscle fatigue.
The correlation coefficient suggests there is ____ between fatigue times.
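The correlation coefficient asked for in part a of Muscle Fatigue can be computed directly from the paired times. The following is a minimal sketch, not the assignment's own method: the function name `pearson_r` and the two lists of fatigue times are hypothetical, because the study's data table is not reproduced here.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient for two equal-length samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two samples of equal length with at least 2 pairs")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares of deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# HYPOTHETICAL fatigue times in seconds, one entry per matched pair.
# These are made-up values for illustration, not the Hunter et al. data.
men_times = [310.0, 420.0, 515.0, 600.0, 745.0]
women_times = [980.0, 640.0, 1210.0, 705.0, 1130.0]

r = pearson_r(men_times, women_times)
print(round(r, 3))  # rounded to three decimal places, as the question asks
```

A value of r near zero would suggest little linear association between the matched fatigue times, while a value near 1 or -1 would suggest a strong one.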
Answer *1: the absolute tolerance is +/-0.001

Kentucky Derby

The Kentucky Derby is the most famous horse race in the world, held annually on the first Saturday in May at Churchill Downs race track in Louisville, Kentucky. This race has been called "The Most Exciting Two Minutes in Sports" because that's about how long it takes for a horse to run its [not shown]-mile track. Consider the winning time (in seconds) for every year since 1896.

a. For predicting winning time from year, the least squares line has intercept coefficient [not shown] seconds and slope coefficient [not shown] seconds per year. Use this information to report the equation of the least squares line, using good statistical notation. Use "year" for the name of the predictor variable.

b. Based on the following residual plot, comment on whether a straight line appears to be a reasonable model for the relationship between winning time and year. [residual plot not shown]
A straight line ____ the best model for this relationship.

Draft Lottery

The United States Selective Service conducted a lottery to decide which young men would be drafted into the armed forces (Fienberg, 1971). Each of the birthdays of the year was assigned a draft number. Young men born on days assigned low draft numbers were drafted. The correlation coefficient between draft number and sequential birthday for the 1970 lottery was [not shown].

a. Determine the probability that a fair random lottery would produce a correlation coefficient so far from zero, simply by random chance. In other words, determine the p-value for testing whether the population correlation coefficient equals zero, against a two-sided alternative. Report the test statistic as well as the p-value. [Hint: There were [not shown] numbers in this draft.] Round the test statistic to two decimal places. Use the t-distribution table to find a range for the p-value. If there is no lower limit on the p-value, enter [not shown] for the lower limit. If there is no upper limit on the p-value, enter [not shown] for the upper limit.
The test statistic is *1. The p-value is between *2 and *3.

The correlation coefficient between draft number and sequential birthday for the 1971 lottery was [not shown].

b. Repeat part a for the 1971 lottery. Round your answer for the test statistic to two decimal places. Use the t-distribution table to find a range for the p-value. If there is no lower limit on the p-value, enter [not shown] for the lower limit. If there is no upper limit on the p-value, enter [not shown] for the upper limit.
The test statistic is *4. The p-value is between *5 and *6.

c. Summarize what these p-values reveal about the fairness of these two lotteries.
We ____ for the 1970 draft process. The 1970 draft process was ____.
We ____ for the 1971 draft process. The 1971 draft process was ____.

Answer *1: the absolute tolerance is +/-0.01
Answer *2: exact number, no tolerance
Answer *3: exact number, no tolerance
Answer *4: the absolute tolerance is +/-0.01
Answer *5: exact number, no tolerance
Answer *6: exact number, no tolerance

HS312 Unit 10 Assignment

Suicides

Many songs associate Mondays with having the blues, but are people really more likely to commit suicide on Mondays? A recent study (Kposowa et al., 2009) analyzed data on suicides in the United States between the years 2000 and 2004. They classified each suicide according to the day of the week on which it occurred. Their data produced the following sample percentages: [table not shown]

A news article that described this study did not report the sample size. Suppose for now that the sample size was [not shown]. The computer output below presents results of a chi-square goodness-of-fit test of equal proportions: [output not shown]

a. Explain how the expected counts were calculated.
The sample size is divided by percentage of suicides for each day.
Seven is divided by the sample size.
The percentage of suicides for each day is divided by the sample size.
The sample size is divided by seven.
The percentage of suicides for each day is multiplied by the sample size.

b. Choose how the chi-square contribution for Monday was calculated. [options not shown]

c. Interpret what the p-value means in this context.
The p-value is the probability of obtaining sample counts that result in a small chi-square test statistic, assuming that the probability of a suicide on each of the seven weekdays is not the same.
The p-value is the probability of obtaining sample counts that result in a small chi-square test statistic, assuming that the probability of a suicide on each of the seven weekdays is the same.
The p-value is the probability of obtaining sample counts that result in a large chi-square test statistic, assuming that the probability of a suicide on each of the seven weekdays is not the same.
The p-value is the probability of obtaining sample counts that result in a large chi-square test statistic, assuming that the probability of a suicide on each of the seven weekdays is the same.

d. Summarize the conclusion that you would draw from this p-value.
You have no statistical evidence that the probability of a suicide occurring on any of the seven weekdays is not the same.
You have weak statistical evidence that the probability of a suicide occurring on any of the seven weekdays is not the same.
You have strong statistical evidence that the probability of a suicide occurring on any of the seven weekdays is not the same.

e. Which day appears to be especially likely for suicides?
Monday
Tuesday
Wednesday
Thursday
Friday
Saturday
Sunday

f. If the sample size were larger, and the sample percentages stayed the same, how would the p-value change?
The p-value would increase.
The p-value would decrease.
The p-value would stay the same.
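The mechanics behind a chi-square goodness-of-fit test of equal proportions can be sketched in a few lines. This is an illustration under stated assumptions: the day-of-week percentages and both sample sizes below are hypothetical (the study's table and the assumed sample size are not reproduced here), and the function name `chi_square_equal_proportions` is made up for this sketch.

```python
def chi_square_equal_proportions(observed):
    """Goodness-of-fit test statistic against equal category proportions.

    Returns the common expected count, the per-category contributions
    (observed - expected)^2 / expected, and their sum.
    """
    total = sum(observed)
    expected = total / len(observed)  # same expected count in every category
    contributions = [(o - expected) ** 2 / expected for o in observed]
    return expected, contributions, sum(contributions)


# HYPOTHETICAL sample percentages for Monday through Sunday (sum to 100).
percentages = [16, 15, 14, 14, 14, 13, 14]

for n in (1000, 10000):  # HYPOTHETICAL sample sizes
    counts = [p * n // 100 for p in percentages]  # observed counts per day
    expected, contributions, statistic = chi_square_equal_proportions(counts)
    print(n, round(expected, 2), round(contributions[0], 3), round(statistic, 3))
```

With the percentages held fixed, multiplying the sample size by ten multiplies every contribution, and therefore the test statistic, by ten, which is the effect part f asks about.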
Preventing Breast Cancer

The Study of Tamoxifen and Raloxifene (STAR) enrolled more than [not shown] postmenopausal women who were at increased risk for breast cancer. Women were randomly assigned to receive one of the drugs (tamoxifen or raloxifene) daily for five years. During the course of the study, researchers kept track of which women developed invasive breast cancer and which women did not. The initial results released in April 2006 revealed that [not shown] of the women in the tamoxifen group had developed invasive breast cancer, compared to [not shown] of the women in the raloxifene group.

a. Organize this information into a table (answer blanks *1 to *9).

                    Tamoxifen   Raloxifene   Total
Breast Cancer       ____        ____         ____
No Breast Cancer    ____        ____         ____
Total               ____        ____         ____

b. Calculate the expected counts, under the null hypothesis that the population proportions of women who would develop breast cancer are the same with both treatments. Round your answers to three decimal places.

                    Tamoxifen   Raloxifene
Breast Cancer       *10         *12
No Breast Cancer    *11         *13

c. Conduct a chi-square test on these data. Report the hypotheses, test statistic, and p-value. Summarize your conclusion. Round the test statistic to three decimal places. Use the chi-square distribution table to find a range for the p-value. If there is no lower limit on the p-value, enter [not shown] for the lower limit. If there is no upper limit on the p-value, enter [not shown] for the upper limit.
The test statistic is *14. The p-value is between *15 and *16.
We ____.

d. Suppose someone else decided to perform a two-sample z-test and found that z = [not shown] and p-value [not shown]. Comment on how this compares to the test statistic and p-value you just calculated.
The test statistic of the chi-square test is roughly ____ the test statistic of the two-sample z-test.
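Parts b through d of Preventing Breast Cancer rest on two calculations: expected counts of the form (row total x column total) / grand total, and the chi-square statistic built from them. The sketch below illustrates both and compares the result with a pooled two-sample z statistic for proportions. The counts are hypothetical (the STAR counts are not reproduced here), the function names are made up, and it is an assumption that the two-sample test mentioned in part d is the usual pooled z-test for two proportions.

```python
from math import sqrt


def expected_counts(table):
    """Expected counts under equal proportions: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def chi_square_statistic(table):
    """Sum of (observed - expected)^2 / expected over all cells (no continuity correction)."""
    expected = expected_counts(table)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )


def two_proportion_z(x1, n1, x2, n2):
    """Pooled two-sample z statistic for comparing two proportions."""
    pooled = (x1 + x2) / (n1 + n2)
    standard_error = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (x1 / n1 - x2 / n2) / standard_error


# HYPOTHETICAL counts, not the STAR results.
# Rows: breast cancer, no breast cancer. Columns: tamoxifen, raloxifene.
table = [[30, 45],
         [970, 955]]

chi_sq = chi_square_statistic(table)
z = two_proportion_z(30, 1000, 45, 1000)
print(expected_counts(table))
print(round(chi_sq, 3), round(z ** 2, 3))
```

For a 2 x 2 table without a continuity correction, the chi-square statistic equals the square of the pooled z statistic, so the two printed values agree and the two tests give the same two-sided p-value.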
Answer *1: exact number, no tolerance
Answer *2: exact number, no tolerance
Answer *3: exact number, no tolerance
Answer *4: exact number, no tolerance
Answer *5: exact number, no tolerance
Answer *6: exact number, no tolerance
Answer *7: exact number, no tolerance
Answer *8: exact number, no tolerance
Answer *9: exact number, no tolerance
Answer *10: the absolute tolerance is +/-0.001
Answer *11: the absolute tolerance is +/-0.001
Answer *12: the absolute tolerance is +/-0.001
Answer *13: the absolute tolerance is +/-0.001
Answer *14: the absolute tolerance is +/-0.001
Answer *15: exact number, no tolerance
Answer *16: exact number, no tolerance