Answered step by step
Verified Expert Solution
Link Copied!

Question

1 Approved Answer

We considered experiment data to evaluate the effect of various factors on individual's perceived health. We fit different regression models, using the following variables: Health

We considered experiment data to evaluate the effect of various factors on individual's perceived health. We fit different regression models, using the following variables:

Health status (1,2,3,4,5), age, exercise (if they exercise any=1, otherwise=0), smoke100 (did they smoke more than 100=1, otherwise=0) and gender (m=male, f=female),

> fit<- glm(Healthstatus ~ smoke100 + gender + age , family=poisson (link=log), data=cdc)

> summary(fit)

Call:

glm(formula = Healthstatus~smoke100 + gender + age, family=poisson(link=log), data = cdc)

Deviance Residuals:

Min1QMedian3QMax

-1.86501-0.371910.060750.403441.13194

Coefficients:

Estimate Std. Error z value Pr(>|z|)

(Intercept)1.47110430.0112173 131.146< 2e-16 ***

smoke100-0.06191990.0075249-8.229< 2e-16 ***

genderm0.01997520.00745752.6790.00739 **

age-0.00356090.0002203 -16.162< 2e-16 ***

---

Signif. codes:0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for poisson family taken to be 1)

Null deviance: 6648.1on 19999degrees of freedom

Residual deviance: 6273.7on 19996degrees of freedom

AIC: 68870

Using above R-output, answer the following questions:

a. In one sentence explain why we choose to use Poisson regression instead of logistic regression. [2Marks]

b. State the prediction equation and interpret the coefficient for smoke100). [2Marks]

c. Find 99% Wald confidence interval for (gender). [2 Marks]

d. Using this model with three main effects, predict the health status for a female, at age of 52 who smoked more than 100 cigarettes and interpret. [3 Marks]

e. Your project manager does not wish to use Wald tests to decide which variable to remove, and prefers to use difference in deviance (likelihood ratio tests). She prepares the following summary after fitting all three two-variable models in addition to your three-variable model:

Model

Deviance

difference compared to full model

P-value

Full model

6273.7

smok100 + age

6380.9

17.2

0.102

smoke100 + gender

6537.6

63.9

0.012

age + gender

6341.6

8.9

0.260

Simply by considering the description of the 3 variables, explain if you suspect multicollinearity to be an issue for this problem? [2Marks]

By examining the difference in deviance between each of the subset models and the full model, explain which two-variable model(s) you will choose for consideration and which two-variable model(s) you will discard. [2Marks]

Step by Step Solution

There are 3 Steps involved in it

Step: 1

blur-text-image

Get Instant Access to Expert-Tailored Solutions

See step-by-step solutions with expert insights and AI powered tools for academic success

Step: 2

blur-text-image

Step: 3

blur-text-image

Ace Your Homework with AI

Get the answers you need in no time with our AI-driven, step-by-step assistance

Get Started

Recommended Textbook for

Precalculus Enhanced With Graphing Utilities (Subscription)

Authors: Michael, Michael Sullivan III, Michael III Sullivan, III Sullivan

6th Edition

0321849108, 9780321849106

More Books

Students also viewed these Mathematics questions

Question

7. Define cultural space.

Answered: 1 week ago

Question

8. Describe how cultural spaces are formed.

Answered: 1 week ago