A/B Testing HW

Article Link Here: https://www.dropbox.com/s/jqq1xq1f9nf7354/HW3_recycling.pdf?dl=0

Problem 1 - Interpreting Regression Results: Recycling [26 points]

Entrepreneurs introducing new technologies often have a hard time driving adoption, and they often try to drive adoption by stressing the social benefits of their technologies. Household recycling is a technology that seems amenable to this approach, as its social benefits are especially clear. An organization in Peru conducted an experiment to encourage households to recycle more. The results are described in this article. This problem focuses on interpreting the results reported in the article; it uses only Table 4 of that article and does not require downloading any data.

The paper contains two experiments, a "participation study" and a "participation intensity study." In this problem, we will focus on the latter study, whose results are contained in Table 4 of the paper, on page 20. You may need to read the relevant section of the paper (from the bottom of page 17 of the manuscript to the bottom of page 19) in order to understand the experimental design and variables. (Note that "indicator variable" is a synonym for "dummy variable," in case you haven't seen this language before; it means a variable that always takes a value of either 0 or 1.)

In Table 4, each column gives the results of a separate regression in which the authors analyze the effects of the same treatments on different outcome variables. The header row (the heading of each column) tells you which outcome variable is used in the regression reported in that column. The header column (the column to the left of column 1) lists all the independent variables that the authors included on the right-hand side of the regression equations. Each coefficient estimate consists of two parts in the table: the estimate itself, with stars indicating its statistical significance, and the standard error of the estimate, reported in parentheses directly underneath. Note that they never report the intercept of the regression because they don't think it is meaningful. Instead, they report the "mean of dependent variable," since they think that is more useful to know than what the intercept tells you (the expected value of the outcome when all the covariates are zero). "Dependent variable" is a synonym for the outcome.

a. In Column 3 of Table 4A, what is the estimated ATE of providing a recycling bin on the average weight of recyclables turned in per household per week, during the six-week treatment period? Provide a 95% confidence interval. [5 points]

b. In Column 3 of Table 4A, what is the estimated ATE of sending a text message reminder on the average weight of recyclables turned in per household per week? Provide a 95% confidence interval. (For parts a and b, see the interval arithmetic sketch after part f below.) [5 points]

c. Which outcome measures in Table 4A show statistically significant effects (at least at the 5% level) of providing a recycling bin? [4 points]

d. Which outcome measures in Table 4A show statistically significant effects (at least at the 5% level) of sending text messages? [4 points]

e. Suppose that the covariate "percentage of visits turned in bag, baseline" had been left out of the regression reported in Column 1. [Note: by "baseline," they mean "before the experiment started." So "percentage of visits turned in bag, baseline" just means "percentage of weeks in which the household turned in a bag in the weeks before the treatment began." The outcome variable in this regression, by contrast, is the same measure, but collected after the experiment started.] What would you expect to happen to the results on providing a recycling bin? In particular: Would you expect an increase, decrease, or no change in the estimated ATE? Would you expect an increase or decrease in the standard error? Explain your reasoning. [4 points]

f. In Column 1 of Table 4A, would you say the variable "has cell phone," which was measured before the start of the experiment, is a "bad control" that should not be included in the regression? Explain your reasoning. [4 points]
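For parts a and b, the confidence interval can be computed directly from the coefficient and the standard error printed underneath it in Table 4. A minimal R sketch of the arithmetic is below; the numbers are placeholders, not the actual values from the table.

# 95% CI from a coefficient and its standard error (placeholder values,
# not the real entries from Table 4A, Column 3):
estimate <- 0.20    # hypothetical coefficient on the bin treatment
std_error <- 0.05   # hypothetical standard error reported in parentheses
ci_95 <- estimate + c(-1.96, 1.96) * std_error
ci_95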

Problem 2 - Using Regression to Measure the Impact of a "Work From Home" Policy on Employee Productivity [32 points]

Managers fiercely debate whether to allow employees to work from home. For example, Marissa Mayer of Yahoo! touched off a firestorm when she famously reversed Yahoo!'s work-from-home policy and forced all Yahoo! employees to come into work every day. Employees argued that they would be more productive from home, but data from other workplaces show that workers are sometimes dishonest about how much work they are actually getting done while working from home. Meanwhile, The New York Times editorial board continues to run data-free articles that opine on both sides of the work-from-home debate. In debates like these about firm management practices, rigorously measuring impact can cut through reasonable points made by both sides and begin to uncover the truth.

Stanford economist Nick Bloom and his colleagues worked with a call center firm in China to measure the impact of a new "Work from Home" (WFH) policy the firm implemented. The results are reported in an article that appeared in the Quarterly Journal of Economics. (You do not need to read that article.) Bloom and colleagues ran a 9-month-long randomized controlled trial with a Chinese travel agency called "Ctrip" and evaluated the impact of WFH on employee performance and productivity. In this question, we will analyze the impact of the firm's WFH policy. You can download a version of the data that I've cleaned and over-simplified here. In the data:

wfh = the treatment indicator; 1 = randomly assigned to work from home; 0 = randomly assigned to have to come to the office

perform_during = "the number of phone calls answered and number of orders taken," after the experiment began. This is the outcome. It is observed for both the treatment and the control group; the control group's working conditions did not change, but we track their productivity after the experiment started, too.

perform_before = "the number of phone calls answered and number of orders taken," the same measure as above, but recorded for a fixed time period before the experiment began (whereas perform_during is the same measure, but after the treatment started / the experiment began).1

a. Estimate the ATE of wfh on perform_during using regression, but without using the variable perform_before. Provide a 95% confidence interval. [6 points]

b. Estimate the ATE of wfh on perform_during, now also adding the variable perform_before as a covariate to increase the precision of our ATE estimate. What is the new 95% confidence interval? [6 points]

c. Compute a new variable called before_after_difference defined as perform_during - perform_before. Run a regression to estimate the ATE of wfh on before_after_difference, not using any other variables. What is the new 95% confidence interval? [6 points]

d. Between the regressions you ran in parts a, b, and c, which is the most precise (that is, which has the smallest standard error on the treatment effect estimate)? Why do you think this regression is the most precise? [7 points]

e. Perform a randomization check by estimating the ATE of the WFH policy on the number of phone calls answered and number of orders taken before the experiment started. What do the results indicate to you about the validity of the study's conclusion? (An R sketch of the regressions for parts a-c and e appears below, after the footnote.) [7 points]

1 Both of these variables are standardized. You don't need to know what this means, but here is what it means: the authors have taken the variable, subtracted its mean, and divided it by its standard deviation. For example, suppose a call center worker took 200 orders, but the average is 150 and the standard deviation is 100. This call center worker would get a score of (200 - 150) / 100 = 0.5, reflecting that she is "0.5 standard deviations better than average." This is not important to understand or know in order to answer the question.
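The following is a rough R sketch of the regressions asked for in parts a-c and e. Only the column names wfh, perform_during, and perform_before are given above; the data frame name wfh_data and the file name are assumptions.

# Sketch only: assumes the cleaned data have been loaded into a data frame
# called wfh_data with columns wfh, perform_during, and perform_before.
wfh_data <- read.csv("wfh_data.csv")   # placeholder file name

# (a) ATE of wfh without the baseline covariate
fit_a <- lm(perform_during ~ wfh, data = wfh_data)
confint(fit_a, "wfh", level = 0.95)

# (b) ATE of wfh, adding perform_before to improve precision
fit_b <- lm(perform_during ~ wfh + perform_before, data = wfh_data)
confint(fit_b, "wfh", level = 0.95)

# (c) ATE of wfh on the before/after difference
wfh_data$before_after_difference <- wfh_data$perform_during - wfh_data$perform_before
fit_c <- lm(before_after_difference ~ wfh, data = wfh_data)
confint(fit_c, "wfh", level = 0.95)

# (e) randomization check: "effect" of wfh on the pre-treatment outcome
fit_e <- lm(perform_before ~ wfh, data = wfh_data)
summary(fit_e)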

Problem 3 - Using Regression to Measure the Impact of Social Pressure on Voter Turnout [34 points]

Does shaming people into doing good actually work? Or does it just cause people to react negatively, perhaps even causing them to do the opposite? In the United States, voting is considered a social norm, such that when people learn that others have not voted, they judge them more negatively -- just as people judge others more negatively when they learn they don't recycle or don't pay their taxes. Knowing they could be negatively judged in this way, do many people vote in order to avoid the shame of being a nonvoter? And can social innovators use this social norm to their benefit?

A study by Alan Gerber and Don Green (you do not need to read it) examined this question. They obtained the publicly available voter rolls from the Michigan Secretary of State and randomly assigned voters to one of five conditions ahead of the 2006 primary election. One condition was a control group that received no mailing. The other four conditions got one of the four mailings given here. The last of the four, the "Neighbors" mailing, contained a list of the voter turnout for all residents of one's own household and for all the other households on the street. Please download a cleaned version of the data here.

a. Multiple voters often live within the same household, but it would not be feasible to randomly assign different voters who live in the same household to different mailings; they would show the mailings to each other (creating an issue we will discuss in a future week). Therefore, the authors randomized at the household level, such that all people within a household always get the same treatment. This is an example of clustering. Should you cluster standard errors when estimating the ATE? At which level? [5 points]

b. Estimate the ATE of all four treatments (treatment_civicduty, etc.) on the outcome variable, voted, using regression. Do not use any controls and do not worry about clustering standard errors yet. [5 points]

c. Now let's correctly cluster the standard errors to account for the fact that treatment was randomized at the household level. Take the regression you ran in part b and take the clustering into account. R, delightfully, does not have a built-in function for computing clustered standard errors. You need to insert the code below, and then call the function in that code using cl(output.from.lm.here, cluster.variable.here). Report the standard errors for the estimated effects of the four treatments. (A concrete usage sketch for parts b and c appears after the code below.) [5 points]

cl <- function(fm, cluster){
  require(sandwich, quietly = TRUE)
  require(lmtest, quietly = TRUE)
  M <- length(unique(cluster))
  N <- length(cluster)
  K <- fm$rank
  dfc <- (M/(M-1))*((N-1)/(N-K))
  uj <- apply(estfun(fm), 2, function(x) tapply(x, cluster, sum))
  vcovCL <- dfc*sandwich(fm, meat = crossprod(uj)/N)
  coeftest(fm, vcovCL)
}

# To use cl(), first specify your model, like this:
lm.result <- lm(y ~ x, data)
# then use this to return the clustered standard errors:
cl(lm.result, data$rename.this.to.the.variable.that.identifies.clusters)

# To install the proper packages, you may need to run this line once on your
# computer in the "Console" tab of RStudio:
install.packages(c('sandwich', 'lmtest'))
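Putting parts b and c together, a minimal sketch might look like the following. Only voted and treatment_civicduty are named above; the other three treatment indicator names, the data frame name ggl_data, the file name, and the household identifier column hh_id are assumptions about the cleaned data set.

# Sketch only: variable names other than voted and treatment_civicduty are assumed.
ggl_data <- read.csv("social_pressure.csv")   # placeholder file name

# part (b): regression of turnout on the four treatment indicators,
# with conventional (non-clustered) standard errors
fit_b <- lm(voted ~ treatment_civicduty + treatment_hawthorne +
              treatment_self + treatment_neighbors, data = ggl_data)
summary(fit_b)

# part (c): the same regression, with standard errors clustered by household
cl(fit_b, ggl_data$hh_id)   # hh_id = assumed name of the household identifier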

d. Run the code from part c, but now add controls for whether someone voted in previous elections -- the variables called g2000, g2002, p2000, p2002, p2004 (p is for primary; g is for general; the year is the year of the election) -- and the variables for their age (denoted yob, for year of birth) and gender. Report the standard errors for the estimated effects of the four treatments. [5 points]

e. Why do the standard errors change in part d? [4 points]

f. Given the content of the four mailings and the results of your regressions, what do you conclude about the efficacy of social pressure for increasing voter turnout? [4 points]

g. The voted variable records whether someone voted in the 2006 primary election in August 2006. The mailers were sent in July 2006. Imagine the authors also had data on whether these voters later voted in the 2006 general election that took place three months later, in November 2006. Would you recommend adding November 2006 turnout as an additional covariate to the regression estimating the effect of the mailers on turnout in August 2006 that you ran in part d? Why or why not? [3 points]

h. Imagine you were instead interested in the effect of sending these mailers in July 2006 on voter turnout in November 2006 and so ran a regression with November 2006 turnout as the outcome. Would you add voter turnout in August 2006 to this regression? Why or why not? [3 points]

Problem 4 - Statistical Power, Blocking, Clustering [8 points]

In planning an impact measurement, one important question is whether the measurement will have sufficient statistical power to be informative. As a corollary, as someone trained in how to measure impact, you should be ready to notice aspects of an impact measurement that will decrease statistical power, and be able to spot potential strategies for increasing statistical power. This will allow you to avoid drawing erroneous conclusions and wasting time -- or to learn much more than you originally thought possible. In this question, we will think through the intuition behind statistical power and how sample size, blocking, and clustering relate to power. As a reminder, statistical power refers to the probability of rejecting the null hypothesis in the presence of a true treatment effect.

Part a [4 points]

Suppose you want to evaluate the impact of a new team performance management system at your company, the "OKR" system used by Google, Uber, LinkedIn, and others. To track the impact of this system, you will measure individual-level employee outcomes, such as the number of hours employees spend working, their job satisfaction, and other such metrics. The lead of your data science team has prepared an experiment: managers together with their entire teams of employees will be randomly assigned to either use the OKR system or not over the next quarter. The data science lead has said that, given that you have 1,000 employees, the experiment will be sufficiently well-powered. However, you know that you only have 50 managers/teams at your company, and that entire managers/teams always need to be in the same group (e.g., it is not possible for half of a manager's team to be in the treatment group and for the other half not to be). State whether this is an example of clustering or blocking and explain its effect on the statistical power in plain language.

Part b [4 points]

Suppose we want to evaluate the impact of a new drug your pharmaceutical company has developed on whether someone is cured of a rare disease. Previous data have told you that women are much more likely to recover from this rare disease of their own accord than are men, only a few of whom are able to fight the disease naturally. Could blocking or clustering be used to increase the power of this experiment? If so, how would you implement it? (A small implementation sketch follows below.)
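If your answer to Part b involves blocking, one common way to implement it is to randomize treatment separately within each gender stratum, so the treatment and control arms stay balanced on this strong predictor of recovery. A minimal R sketch using a hypothetical patient list is below.

set.seed(1)
# Hypothetical patient list; only the gender column matters for the illustration.
patients <- data.frame(id = 1:200,
                       gender = rep(c("female", "male"), times = c(100, 100)))
# Block (stratify) on gender: assign half of each gender group to treatment,
# randomizing within each stratum rather than across the whole sample.
patients$treat <- ave(seq_len(nrow(patients)), patients$gender,
                      FUN = function(i) sample(rep(0:1, length.out = length(i))))
table(patients$gender, patients$treat)   # check the within-stratum balance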
