
Question



R script for data and plots!

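# Create the data: 10,000 observations
n <- 10000
X <- rnorm(n, mean = 0, sd = 1)  # confounding variable
D <- sample(c(1, 0), size = n, replace = TRUE, prob = c(0.5, 0.5))  # treatment indicator
U <- rnorm(n, mean = 0, sd = 1)  # error term, one draw per observation
Y <- X + D + U
data <- data.frame(X, D, Y)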


# Split the data into the treatment group and the control group
library(dplyr)  # provides filter() and %>%
data_treatetd <- data %>% filter(D > 0)
data_control <- data %>% filter(D == 0)

# Regress Y on X separately within each group
lm_treated <- lm(Y ~ X, data = data_treatetd)
lm_controled <- lm(Y ~ X, data = data_control)
summary(lm_treated)
summary(lm_controled)


# Treated plot: fitted regression line only
library(ggplot2)
ggplot(data_treatetd, aes(x = X, y = Y)) + ggtitle("Treatment group") +
  xlab("X") + ylab("Y") + geom_smooth(method = "lm", se = FALSE)


# Control plot: fitted regression line only
ggplot(data_control, aes(x = X, y = Y)) + ggtitle("Control group") +
  xlab("X") + ylab("Y") + geom_smooth(method = "lm", se = FALSE)


# Scatterplot of the treatment group with fitted line
ggplot(data_treatetd, aes(x = X, y = Y)) + ggtitle("Treatment group") +
  xlab("X") + ylab("Y") + geom_point(color = "blue") + geom_smooth(method = "lm")


# Scatterplot of the control group with fitted line
ggplot(data_control, aes(x = X, y = Y)) + ggtitle("Control group") +
  xlab("X") + ylab("Y") + geom_point(color = "red") + geom_smooth(method = "lm")


# Both groups on one plot
ggplot(NULL, aes(X, Y)) +
  geom_point(data = data_treatetd, aes(color = "Treatment")) + geom_smooth(data = data_treatetd, method = "lm") +
  geom_point(data = data_control, aes(color = "Control")) + geom_smooth(data = data_control, method = "lm")



(This is the question I am looking for an answer to.) You just plotted two graphs. Explain why a simple difference in means between the treatment and control groups may not give the correct answer about the effectiveness of the treatment, while a regression of Yi on Di and Xi does.
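
One way to approach this question is to write out what the difference in means estimates under the linear model. The sketch below is not from the original post: the symbols \beta_0, \beta_D and \beta_X are assumed names for the intercept and the coefficients on Di and Xi (the script uses 0, 1 and 1), and it assumes the error term has mean zero given Di and Xi.

```latex
% Assumed notation (not in the original): \beta_0, \beta_D, \beta_X are the
% intercept and the coefficients on D_i and X_i; the script sets them to 0, 1, 1.
\begin{align*}
Y_i &= \beta_0 + \beta_D D_i + \beta_X X_i + U_i,
  \qquad E[U_i \mid D_i, X_i] = 0 \\
E[Y_i \mid D_i = 1] - E[Y_i \mid D_i = 0]
  &= \beta_D + \beta_X \bigl( E[X_i \mid D_i = 1] - E[X_i \mid D_i = 0] \bigr)
\end{align*}
```

The second term is the bias of the simple difference in means. It vanishes only when Xi has the same mean in both groups. A regression of Yi on Di and Xi compares treated and control observations at the same value of Xi, so its coefficient on Di targets \beta_D directly. In the script above, D is drawn independently of X, so the second term is zero in expectation and any gap in a given sample comes from chance imbalance in X. The regression removes that imbalance and should also give a smaller standard error, because X no longer sits in the residual.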

 

QUESTION 3. Consider a situation in which we know that a linear model is a correct specification for the relationship of interest that includes three variables (Yi, Xi, Di). Let Yi denote some dependent variable for individual i. Let Di denote the treatment variable. About half of the sample is treated. Di takes on the value of 1 if individual i is treated, and 0 otherwise. Let Xi denote a confounding variable for individual i. Xi is a random variable with mean 0 and standard deviation 1. Parenthetically, do not forget to add an error term when you generate Y. The error term also has mean 0 and standard deviation 1. Use R to generate hypothetical data (10,000 observations) from such a model. Do the
