Question
We have discussed various approaches to estimating treatment effects under the Conditional Independence Assumption using observational data. In this section you will design and execute a simulation that will allow you to compare the mean and variance of five potential estimators. You should be able to borrow from Seminar 1's STATA do-file (or R-script). However, the setup has been adapted in a number of ways. Please follow the steps carefully and report back any ambiguities.

- Set the number of observations to 1000, as observational datasets tend to be larger.
- Generate the data according to the data generating process,
  $$Y_i(0) = \beta_0 + \beta_1 \, age_i + \beta_2 \, female_i + \beta_3 \, age_i \cdot female_i + \varepsilon_i$$
  where $[\beta_0, \beta_1, \beta_2, \beta_3] = [1.2, 0.015, 0.02, 0.01]$ and:
  - Generate $\varepsilon_i \sim N(0, 0.55^2)$.
  - Generate $age_i$ as a random integer that is uniformly distributed between [20, 65].
  - [...] declines rapidly during years of childbirth, $female_i \sim B(1, \pi(age_i))$ where $\pi(age_i) = 0.5 - 0.25 \, \frac{\ln(age_i - 19)}{\ln(46)}$.
- Allow for heterogeneous treatment effects by age, $\tau_i(age_i) \sim N(\mu(age_i), 0.01)$ where $\mu(age_i) = 0.02 + 0.06 \cdot 1\{age_i > 43\}$. Since age is uniformly distributed, $ATE \approx 0.05$.
- Simulate unconfoundedness on age, alone. Assign treatment status in such a way that the average level of treatment increases with age, but is independent of $Y_i(0)$ conditional on age. Use the binomial distribution where the probability of success depends on age in the following way, $W_i \sim B(1, p(age_i))$ where $p(age_i) = 0.25 + 0.5 \, \frac{\ln(age_i - 19)}{\ln(46)}$. The probability of treatment should be between [0.25, 0.75].
- Estimate the following five models 1,000 times. Report the mean and standard deviation of the simulated samples of $\hat{\beta}_1$. Provide a plot of the kernel density distribution of each estimator.
  - Model 1: Estimate a linear regression model without covariates,
    $$Y_i^{obs} = \beta_0^1 + \beta_1^1 D_i + v_i$$
  - Model 2: Estimate a linear regression model that matches the CEF of $Y_i^{obs}$,
    $$Y_i^{obs} = \beta_0^2 + \beta_1^2 D_i + \gamma_1^2 \, age_i + \gamma_2^2 \, female_i + \gamma_3^2 \, age_i \cdot female_i + \varepsilon_i$$
  - Model 3: Estimate a saturated linear regression model,
    $$Y_i^{obs} = \beta_1^3 D_i + \sum_{j=20}^{65} \gamma_j^3 \, 1\{age_i = j\} + \varepsilon_i$$
  - Model 4: Estimate Model 1, applying inverse probability weights,
    $$Y_i^{obs} = \beta_0^4 + \beta_1^4 D_i + v_i$$
    where the estimated weights are based on the estimated propensity scores,
    $$\hat{\omega}_i = \frac{1}{\hat{e}(X_i)^{D_i} \, (1 - \hat{e}(X_i))^{1 - D_i}}$$
    derived from a logit model,
    $$e(X_i) = \Pr(D_i = 1 \mid age_i) = \Lambda(\delta_0^4 + \delta_1^4 \, age_i)$$
  - Model 5: Repeat Model 4, but use a saturated logit model,
    $$e(X_i) = \Pr(D_i = 1 \mid age_i) = \Lambda\Big(\sum_{j=20}^{65} \delta_j^5 \, 1\{age_i = j\}\Big)$$

DISCUSSION:
1. In addition to reporting on the distributions of these five estimators, provide a discussion of the simulation results. In particular, discuss how important model specification appears to be relative to omitted variable bias (or selection on unobservables).
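As a starting point, the data generating process and two of the five estimators (Models 1 and 3) might be sketched as follows. This is a minimal illustration in plain Python rather than the Stata or R files the question mentions, and it rests on several assumptions that are not confirmed by the question: the function names are hypothetical, the coefficients are taken with the signs as printed, the 0.01 in the treatment-effect distribution is read as a variance (standard deviation 0.1), the observed outcome is taken to be Y(0) plus the individual effect for treated units, and the treatment indicator drawn in the assignment step is the same variable that enters the models. Model 3 is computed by demeaning within age cells, which by the Frisch-Waugh-Lovell theorem gives the same coefficient on treatment as OLS with a full set of age dummies.

```python
import math
import random
import statistics

# Coefficients of Y(0), with signs as printed in the question (assumption).
BETA = (1.2, 0.015, 0.02, 0.01)
TAU_SD = 0.1  # assumption: 0.01 is a variance, so the standard deviation is 0.1


def simulate_sample(n=1000, rng=random):
    """Draw one sample of (age, female, d, y_obs) tuples from the DGP."""
    rows = []
    for _ in range(n):
        age = rng.randint(20, 65)                    # uniform integer, both ends included
        scale = math.log(age - 19) / math.log(46)    # 0 at age 20, 1 at age 65
        female = int(rng.random() < 0.5 - 0.25 * scale)
        d = int(rng.random() < 0.25 + 0.5 * scale)   # treatment depends on age only
        eps = rng.gauss(0.0, 0.55)
        y0 = (BETA[0] + BETA[1] * age + BETA[2] * female
              + BETA[3] * age * female + eps)
        tau = rng.gauss(0.02 + 0.06 * (age > 43), TAU_SD)
        rows.append((age, female, d, y0 + d * tau))
    return rows


def model1(rows):
    """OLS of y on d without covariates, i.e. the raw difference in means."""
    treated = [y for _, _, d, y in rows if d == 1]
    control = [y for _, _, d, y in rows if d == 0]
    return statistics.fmean(treated) - statistics.fmean(control)


def model3(rows):
    """Coefficient on d from a regression saturated in age (within-age demeaning)."""
    by_age = {}
    for age, _, d, y in rows:
        by_age.setdefault(age, []).append((d, y))
    num = den = 0.0
    for cell in by_age.values():
        d_bar = statistics.fmean(d for d, _ in cell)
        y_bar = statistics.fmean(y for _, y in cell)
        num += sum((d - d_bar) * (y - y_bar) for d, y in cell)
        den += sum((d - d_bar) ** 2 for d, _ in cell)
    return num / den


def monte_carlo(reps=1000, n=1000, seed=0):
    """Return {model: (mean, standard deviation)} of the estimates across replications."""
    rng = random.Random(seed)
    draws = {"model1": [], "model3": []}
    for _ in range(reps):
        rows = simulate_sample(n, rng)
        draws["model1"].append(model1(rows))
        draws["model3"].append(model3(rows))
    return {name: (statistics.fmean(v), statistics.stdev(v))
            for name, v in draws.items()}
```

Calling `monte_carlo()` runs the 1,000 replications and returns the mean and standard deviation of each estimator; the remaining models and the kernel density plots would be added along the same lines. Because treatment probability and Y(0) both rise with age in this setup, the Model 1 estimates should sit above the Model 3 estimates on average.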