Question

The data set is also available in R through the wooldridge package:

library(wooldridge)

data("k401ksubs")

Consider the data set "k401ksubs.csv" posted with this assignment. It includes information on 9275 individuals with the following covariates, where the dependent variable is pira, equal to 1 if the subject has an IRA.

e401k: =1 if eligible for 401(k)

inc: annual income, $1000s

marr: =1 if married

male: =1 if male respondent

age: in years

fsize: family size

nettfa: net total financial assets, $1000s

p401k: =1 if participate in 401(k)

pira: =1 if have IRA (Individual Retirement Account)

incsq: income squared

agesq: age squared

Question 3

Create 2 Logistic Regression Models

(1) using all variables (Model 1),

(2) using variables you deem important (Model 2)
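A minimal sketch of how the two models might be fitted in R, assuming the `wooldridge` package is installed and that "all variables" means every column of `k401ksubs` other than `pira`. The helper names `fit_two_models` and `odds_ratios` are hypothetical, and backward elimination by AIC is only one possible way to arrive at Model 2; the assignment leaves that choice open. In a logit model the log odds are linear in the covariates, so `exp(coefficient)` is the factor by which the odds are multiplied for a one-unit increase in that variable, with the other variables held fixed.

```r
# Hypothetical helper: fit the full logit, then prune it by backward AIC selection.
fit_two_models <- function(formula, data) {
  # Model 1: the full specification passed in by the caller.
  model1 <- glm(formula, data = data, family = binomial(link = "logit"))
  # Model 2: backward elimination by AIC. This selection rule is an
  # assumption made for illustration, not something the assignment prescribes.
  model2 <- step(model1, direction = "backward", trace = 0)
  list(model1 = model1, model2 = model2)
}

# Odds ratios: exp(coefficient) is the multiplicative change in the odds
# for a one-unit increase in that variable, other variables held fixed.
odds_ratios <- function(model) exp(coef(model))

# Runs only if the wooldridge package is available.
if (requireNamespace("wooldridge", quietly = TRUE)) {
  data("k401ksubs", package = "wooldridge")
  fits <- fit_two_models(pira ~ ., data = k401ksubs)
  # Odds ratios for the three variables named in part a.
  print(odds_ratios(fits$model1)[c("e401k", "nettfa", "marr")])
  # Variables retained in Model 2.
  print(formula(fits$model2))
}
```

For the 0/1 variables `e401k` and `marr`, the odds ratio compares the two groups; for `nettfa` it is the change per additional $1000 of net financial assets. Comparing `AIC(fits$model1)` with `AIC(fits$model2)` is one common way to judge which model fits better after accounting for model size.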

a. In Model 1 interpret the impact of e401k, nettfa and marr on the odds of participation even if they are not statistically significant.

b. Explain how you reached Model 2.

c. Discuss which model is a better model in explaining the variability in the probability of participation.

d. Predict the probability of participation for the first 10 observations in the data set.

e. Compare the predictive accuracy of the two models you built in Q3 using 10-fold cross-validation. Discuss which model you would pick as a predictive model. Clearly state which measures you are using to pick your model and why.
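Parts d and e can be sketched in base R as follows. The helpers `predict_first_n` and `cv_logit` are hypothetical names, the 0.5 classification threshold is an assumption, and the Model 2 formula in the usage section is a placeholder for illustration only. The sketch reports two out-of-sample measures: the misclassification rate, which is easy to interpret, and the log loss, which also penalises poorly calibrated probabilities.

```r
# Hypothetical helper: fitted probabilities for the first n rows.
predict_first_n <- function(model, data, n = 10) {
  predict(model, newdata = head(data, n), type = "response")
}

# Hypothetical helper: k-fold cross-validated error rate and log loss
# for a logistic regression with a 0/1 response.
cv_logit <- function(formula, data, k = 10, seed = 1) {
  set.seed(seed)  # same seed gives the same folds for every model compared
  folds <- sample(rep(seq_len(k), length.out = nrow(data)))
  y <- data[[all.vars(formula)[1]]]  # response is the first variable in the formula
  p <- numeric(nrow(data))
  for (i in seq_len(k)) {
    held_out <- folds == i
    fit <- glm(formula, data = data[!held_out, , drop = FALSE], family = binomial)
    p[held_out] <- predict(fit, newdata = data[held_out, , drop = FALSE],
                           type = "response")
  }
  p <- pmin(pmax(p, 1e-12), 1 - 1e-12)  # keep log() finite
  c(error_rate = mean((p > 0.5) != (y == 1)),
    log_loss = -mean(y * log(p) + (1 - y) * log(1 - p)))
}

# Runs only if the wooldridge package is available.
if (requireNamespace("wooldridge", quietly = TRUE)) {
  data("k401ksubs", package = "wooldridge")
  model1 <- glm(pira ~ ., data = k401ksubs, family = binomial)
  print(predict_first_n(model1, k401ksubs))  # part d, Model 1
  print(cv_logit(pira ~ ., k401ksubs))       # part e, Model 1
  # Placeholder Model 2 formula, for illustration only.
  print(cv_logit(pira ~ e401k + inc + incsq + age + agesq + nettfa, k401ksubs))
}
```

Because both calls to `cv_logit` use the same seed, the two models are evaluated on identical folds, so differences in the reported measures reflect the models rather than the split.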
