Question

Q SCI 381: Introduction to Probability and Statistics

Winter 2022

Laboratory #8 (70 points)

In today's lab, we will be using the file mortality.csv, which is available in Canvas (located in Files\Lab Datasets). The data in this file come from a study of factors associated with mortality rates in 60 urban areas in the United States; we will be using a subset of the original dataset.

The contents of mortality.csv are shown below:

Elderly Poverty NO SO2 Mort.Rate
8.1 11.7 15 59 921.87
11.1 14.4 10 39 997.875
10.4 12.4 6 33 962.354
6.5 20.6 8 24 982.291
7.6 14.3 38 206 1071.289
7.7 25.5 32 72 1030.38
10.9 11.3 32 62 934.7
9.3 10.5 4 4 899.529
9 12.6 12 37 1001.902
9.5 13.2 7 20 912.347
7.7 24.2 8 27 1017.613
8.6 10.7 63 278 1024.885
9.2 15.1 26 146 970.467
8.8 11.4 21 64 985.95
8 13.9 9 15 958.839
7.1 16.1 1 1 860.101
7.5 12 4 16 936.234
8.2 12.7 8 28 871.766
7.2 13.6 35 124 959.221
6.5 12.4 4 11 941.181
7.3 18.5 1 1 891.708
9 12.3 3 10 871.338
6.1 19.5 3 5 971.122
9 9.5 3 10 887.466
5.6 17.9 5 1 952.529
8.7 13.2 7 33 968.665
9.2 13.9 4 4 919.729
10.1 12 7 32 844.053
9.2 12.3 319 130 861.833
8.3 17.7 37 193 989.265
7.3 26.4 10 34 1006.49
10 22.4 1 1 861.439
8.8 9.4 23 125 929.15
9.2 9.8 11 26 857.622
8.3 24.1 14 78 961.009
10.2 12.2 3 8 923.234
7.4 24.2 17 1 1113.156
9.7 12.4 26 108 994.648
9.1 13.2 32 161 1015.023
9.5 13.8 59 263 991.29
11.3 13.5 21 44 893.991
10.7 15.7 4 18 938.5
11.2 14.1 11 89 946.185
8.2 17.5 9 48 1025.502
10.9 10.8 4 18 874.281
9.3 15.3 15 68 953.56
7.3 14 66 20 839.709
9.2 12 171 86 911.701
7 9.7 32 3 790.733
9.6 10.1 7 20 899.264
10.6 12.3 4 20 904.155
9.8 11.1 5 25 950.672
9.3 13.6 7 25 972.464
11.3 13.5 2 11 912.202
6.2 10.3 28 102 967.803
7 13.2 2 1 823.764
7.7 10.9 11 42 1003.502
11.8 14 3 8 895.696
9.7 14.5 8 49 911.817
8.9 13 13 39 954.442

The file contains five columns of data collected from each of the 60 urban areas. The columns include:

Elderly: % population aged 65 or older

Poverty: % of families with an income below the poverty level

NO: a measure of the levels of nitric oxides

SO2: a measure of the levels of sulphur dioxide

Mort.Rate: the mortality rate per 100,000 residents

We will be using multiple regression to measure the effect of the predictor variables Elderly, Poverty, NO, and SO2 on the response variable, Mort.Rate.

(1) Before conducting any regression analyses, let's explore the dataset by plotting a pair-wise scatterplot using the plot() command (recall your code from Lab 7). Add a color of your choice and paste your plot below. (4 points)
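As a hedged illustration of the command this step refers to, the sketch below calls plot() on a data frame, which in R draws a pair-wise scatterplot matrix. To stay self-contained it embeds only the first ten rows of the table above; the read.csv() call in the comment, the color, and the title are assumptions, not part of the handout.

```r
# Excerpt: first ten rows of the table above (illustration only).
# With the full file, something like read.csv("mortality.csv") would
# presumably replace this block (file location is an assumption).
mortality <- read.table(header = TRUE, text = "
Elderly Poverty NO SO2 Mort.Rate
8.1 11.7 15 59 921.87
11.1 14.4 10 39 997.875
10.4 12.4 6 33 962.354
6.5 20.6 8 24 982.291
7.6 14.3 38 206 1071.289
7.7 25.5 32 72 1030.38
10.9 11.3 32 62 934.7
9.3 10.5 4 4 899.529
9 12.6 12 37 1001.902
9.5 13.2 7 20 912.347
")

# plot() on a data frame produces one scatterplot per pair of columns.
plot(mortality, col = "steelblue",
     main = "Pair-wise scatterplots (ten-row excerpt)")

# Optional numeric companion: correlation of each column with Mort.Rate.
cor_with_mort <- round(cor(mortality)[, "Mort.Rate"], 2)
print(cor_with_mort)
```

The bottom row (or last column) of the scatterplot matrix is the one that pairs each predictor with Mort.Rate.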

(2) In your pair-wise scatterplot, which, if any, predictor variables appear to be correlated with Mort.Rate? Which, if any, predictor variables do not appear to be correlated with Mort.Rate? (4 points)

(3) Multiple linear regression assumes that the response variable is normally distributed. Plot a histogram of the response variable, Mort.Rate. Include a title and a color of your choice, and paste your histogram below. (8 points)

(4) What are the mean and median of Mort.Rate? Based on a visual assessment of your histogram in (3), and your estimates of the mean and median, do you conclude that Mort.Rate is normally distributed? Why or why not? (6 points)
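The histogram and summary statistics in (3) and (4) can be sketched as follows. This is a minimal example on a ten-row excerpt of the table above, so its numbers will differ from those for the full file; the title, axis label, and color are arbitrary choices.

```r
# Mort.Rate values from the first ten rows of the table (illustration only).
mort_rate <- c(921.87, 997.875, 962.354, 982.291, 1071.289,
               1030.38, 934.7, 899.529, 1001.902, 912.347)

# Histogram with a title and a color.
hist(mort_rate,
     main = "Mortality rate per 100,000 residents (excerpt)",
     xlab = "Mort.Rate",
     col  = "darkorange")

# For a roughly symmetric distribution the mean and median are close;
# a large gap between them suggests skewness.
mean_rate   <- mean(mort_rate)
median_rate <- median(mort_rate)
cat("mean:", mean_rate, " median:", median_rate, "\n")
```

Comparing the two statistics alongside the shape of the histogram is only an informal check, which is what step (4) asks for.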

(5) Regardless of your conclusion in (4), let's assume Mort.Rate is normally distributed, and let's use multiple regression to determine which, if any, of the predictor variables can be used to statistically predict Mort.Rate. First, run a multiple regression to determine if Elderly and Poverty predict Mort.Rate. Paste your code and output below. (4 points)

(6) Test the following null hypotheses using alpha=0.05, and indicate your statistical conclusion and interpretation for each. (8 points)

Ho: slope for Elderly = 0

Ho: slope for Poverty = 0

(7) Based on your conclusions in (6), run another multiple regression analysis by including any significant predictors from (6) and the predictor variable NO. Paste your code and output below. (4 points)
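A minimal sketch of the workflow in (5) through (7): fit a model with lm(), read the slope p-values from the coefficient table, compare them with alpha, and build the next model from whichever predictors are retained. It runs on a ten-row excerpt of the table above, so which predictors it retains says nothing about the result for the full dataset. The variable names (fit1, kept, fit2) are hypothetical.

```r
# Excerpt: first ten rows of the table above (illustration only).
mortality <- read.table(header = TRUE, text = "
Elderly Poverty NO SO2 Mort.Rate
8.1 11.7 15 59 921.87
11.1 14.4 10 39 997.875
10.4 12.4 6 33 962.354
6.5 20.6 8 24 982.291
7.6 14.3 38 206 1071.289
7.7 25.5 32 72 1030.38
10.9 11.3 32 62 934.7
9.3 10.5 4 4 899.529
9 12.6 12 37 1001.902
9.5 13.2 7 20 912.347
")

alpha <- 0.05

# Step (5): regress Mort.Rate on Elderly and Poverty.
fit1  <- lm(Mort.Rate ~ Elderly + Poverty, data = mortality)
coefs <- summary(fit1)$coefficients
print(coefs)

# Step (6): each slope's t-test p-value tests Ho: slope = 0.
p_values <- coefs[c("Elderly", "Poverty"), "Pr(>|t|)"]
kept     <- names(p_values)[p_values < alpha]  # predictors where Ho is rejected

# Step (7): refit with the retained predictors plus NO.
# reformulate() builds a formula from a character vector of term names.
form2 <- reformulate(c(kept, "NO"), response = "Mort.Rate")
fit2  <- lm(form2, data = mortality)
print(summary(fit2)$coefficients)
```

The same pattern, with SO2 in place of NO, would apply to the later step that adds SO2.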

(8) Based on your output in (7), which, if any, predictor variables significantly predict Mort.Rate when using alpha=0.05? (That is, test the null hypotheses Ho: slope for predictor variable = 0.) (8 points)

(9) Based on your conclusions in (8), run another multiple regression analysis by including any significant predictors from (8) and the predictor variable SO2. Paste your code and output below. (4 points)

(10) Based on your output from (9), which, if any, predictor variables significantly predict Mort.Rate when using alpha=0.05? (That is, test the null hypotheses Ho: slope for predictor variable = 0.) (8 points)

(11) Examine the output from (9) and your conclusion from (10). What is the final regression equation? (8 points)

(12) Using your output from your final multiple regression model in (9), how much of the variation in Mort.Rate is explained by the significant predictor variables in this final regression model? (4 points)
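For (11) and (12), the regression equation can be assembled from coef() and the explained variation read from the model summary. In the sketch below the predictors Poverty and SO2 are placeholders chosen only so the code runs, not a conclusion about which predictors are significant. The data are a ten-row excerpt of the table above.

```r
# Excerpt: first ten rows of the table above (illustration only).
mortality <- read.table(header = TRUE, text = "
Elderly Poverty NO SO2 Mort.Rate
8.1 11.7 15 59 921.87
11.1 14.4 10 39 997.875
10.4 12.4 6 33 962.354
6.5 20.6 8 24 982.291
7.6 14.3 38 206 1071.289
7.7 25.5 32 72 1030.38
10.9 11.3 32 62 934.7
9.3 10.5 4 4 899.529
9 12.6 12 37 1001.902
9.5 13.2 7 20 912.347
")

# Placeholder model: substitute the predictors your own tests retain.
fit <- lm(Mort.Rate ~ Poverty + SO2, data = mortality)

# Regression equation: intercept plus one signed term per slope.
b        <- coef(fit)
terms    <- sprintf("%+.3f * %s", b[-1], names(b)[-1])
equation <- paste(sprintf("Mort.Rate = %.3f", b[1]),
                  paste(terms, collapse = " "))
cat(equation, "\n")

# R-squared: proportion of the variation in Mort.Rate explained by the model.
s      <- summary(fit)
r2     <- s$r.squared
adj_r2 <- s$adj.r.squared
cat("R-squared:", round(r2, 3), " adjusted:", round(adj_r2, 3), "\n")
```

The "Multiple R-squared" line printed by summary(fit) reports the same quantity as r2.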
