Question

Posted on May 17, 2024

Instructions:

(All code must be written in R.)

Two emails were sent to existing customers and data was gathered for multiple customer factors and marketing metrics. Existing customers were separated into 3 groups: Email A, Email B and control.

The data contains the following variables:

user_id: unique customer identifier
cpgn_id: campaign label (all the same for this question)
group: customer can belong to "email_A", "email_B" or "ctrl"
email: logical variable if the customer received either email
open: binary variable of whether the customer opened email (1=open, 0=not open)
click: binary variable of whether the customer clicked through to website (1=clicked to website, 0=not clicked)
purch: amount customer purchased within 30 days of email campaign
chard: lifetime purchases of chardonnay
sav_blanc: lifetime purchases of sauvignon blanc
syrah: lifetime purchases of syrah
cab: lifetime purchases of cabernet sauvignon
past_purch: lifetime sum of customer's past purchases
days_since: days since last purchase
visits: lifetime visits to site

Note: this material was adapted from the AB testing advanced module found in the Marketing Modules.

data:

data <- read.csv(file = "../resource/asnlib/publicdata/ab_wine_data.csv")

Questions:

a. The first step is to evaluate whether the groups were properly randomized. One way to do this is to group the data by group type (email_A, email_B and ctrl) and compare baseline variables that are not part of the experiment. Group the data by the 'group' variable and compare the means of lifetime purchases of the four types of wine tracked (chard, sav_blanc, syrah, cab). Does it appear that the data was properly randomized?
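One possible way to set up the comparison in part a is sketched below. It is only an illustration: the data frame is a small synthetic stand-in (made-up values, not ab_wine_data.csv), and the name baseline_means is an assumption introduced here. With the real file, the read.csv() result would replace the synthetic data frame.

```r
# Synthetic stand-in for the real data (values are made up for illustration)
set.seed(1)
n_per_group <- 100
data <- data.frame(
  group     = rep(c("email_A", "email_B", "ctrl"), each = n_per_group),
  chard     = rexp(3 * n_per_group, rate = 1 / 70),
  sav_blanc = rexp(3 * n_per_group, rate = 1 / 25),
  syrah     = rexp(3 * n_per_group, rate = 1 / 25),
  cab       = rexp(3 * n_per_group, rate = 1 / 30)
)

# Mean lifetime purchases of each wine type, one row per experimental group
baseline_means <- aggregate(
  cbind(chard, sav_blanc, syrah, cab) ~ group,
  data = data,
  FUN  = mean
)
print(baseline_means)
```

If randomization worked, the rows of this table should look broadly similar, since these baseline variables were recorded before the emails were sent.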

b. We now need to confirm that we have an appropriate sample size. We will confirm that we have enough data to compare proportions of open rate. What is the n for our groups (email_A, email_B)?
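A minimal sketch of counting n per group for part b, assuming the group labels described above. The data frame here is synthetic (the group sizes are made up), and group_n is a name introduced only for this example.

```r
# Synthetic stand-in: only the 'group' column is needed to count rows
data <- data.frame(
  group = rep(c("email_A", "email_B", "ctrl"), times = c(120, 110, 90))
)

# Number of customers in each group
group_n <- table(data$group)

# Keep only the two email groups asked about in the question
print(group_n[c("email_A", "email_B")])
```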

c. We want to confirm our sample size is sufficient using a power calculation for proportions. The function for this is:

power.prop.test(n = NULL, p1 = NULL, p2 = NULL, sig.level = 0.05, power = 0.80)

We want to calculate n, so leave that parameter as NULL. Use the open rates of email_A and email_B as the estimated proportions p1 and p2. Round the proportions to 4 decimal places to match the solution. What is the minimum sample size needed to detect this expected difference in proportions? Is our sample size sufficient to detect the expected difference in open rates of our two emails?
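The power calculation in part c could be sketched as follows. The open counts below are synthetic (made up so the example runs on its own), and the names open_rates, pw and min_n are assumptions for illustration. Note that the n reported by power.prop.test() is the number of observations per group.

```r
# Synthetic stand-in: 1000 customers per email, with made-up open counts
data <- data.frame(
  group = rep(c("email_A", "email_B"), each = 1000),
  open  = c(rep(c(1, 0), times = c(700, 300)),
            rep(c(1, 0), times = c(650, 350)))
)

# Open rate per group, rounded to 4 decimal places as the question asks
open_rates <- round(tapply(data$open, data$group, mean), 4)
p1 <- open_rates[["email_A"]]
p2 <- open_rates[["email_B"]]

# Leave n as NULL so the function solves for the per-group sample size
pw <- power.prop.test(n = NULL, p1 = p1, p2 = p2,
                      sig.level = 0.05, power = 0.80)
min_n <- ceiling(pw$n)  # round up to a whole number of customers
print(pw)

# Compare the required n with the actual size of each email group
print(table(data$group) >= min_n)
```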

d. We now know that the data we have collected is sufficient to answer the question whether there is a difference between the open rate of email A and email B. Use the following function to evaluate whether there is a statistical difference between the open rates of email A and email B:

prop.test()

Hint: If using the method shown in the advanced module, xtabs() objects must be put into a variable before being used in the prop.test function.

Hint: Confirm the proportion estimates obtained in the prop.test() function match the proportions you calculated manually.

If the confidence interval for the difference in proportions does not include 0, then the difference between the open rates of the two emails is statistically significant.

Is there a significant difference between the open rates of the two tested emails?
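A hedged sketch of the test in part d, using synthetic open counts rather than the real file. It is not necessarily the method from the advanced module; the names open_tab, res and significant are introduced here for illustration. When prop.test() is given a two-column table, it treats the first column as successes and the second as failures, so the columns are reordered to put opens (1) first. This is one way to make the reported estimates match manually calculated open rates.

```r
# Synthetic stand-in: 1000 customers per email, with made-up open counts
data <- data.frame(
  group = rep(c("email_A", "email_B"), each = 1000),
  open  = c(rep(c(1, 0), times = c(700, 300)),
            rep(c(1, 0), times = c(650, 350)))
)

# Cross-tabulate group by open and store the result in a variable first
open_tab <- xtabs(~ group + open, data = data)

# Keep the two email groups; put opens ("1") in the first column so that
# prop.test() counts them as successes
open_tab <- open_tab[c("email_A", "email_B"), c("1", "0")]

res <- prop.test(open_tab)
print(res)

# Estimates should match the manual open rates
print(tapply(data$open, data$group, mean))

# Significant at the 5% level if the interval for the difference excludes 0
significant <- res$conf.int[1] > 0 || res$conf.int[2] < 0
print(significant)
```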

e. Repeat the steps in c) and d) for the click-through rate data. Is there sufficient data to evaluate the click-through rates of email A and B?

f. Is there a statistical difference between the click-through rates of email A and B?
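Because parts e and f repeat the same two steps on a different column, one option is to wrap them in a small helper. The function check_metric below is hypothetical (it is not part of the course material), and the click counts are synthetic, so treat this as a sketch under those assumptions.

```r
# Hypothetical helper: power calculation plus prop.test for a 0/1 column
check_metric <- function(df, metric) {
  # Per-group rate of the metric, rounded to 4 decimal places
  rates <- round(tapply(df[[metric]], df$group, mean), 4)
  pw <- power.prop.test(n = NULL,
                        p1 = rates[["email_A"]], p2 = rates[["email_B"]],
                        sig.level = 0.05, power = 0.80)
  # Successes ("1") first, failures ("0") second, email groups only
  tab <- table(df$group, df[[metric]])[c("email_A", "email_B"), c("1", "0")]
  list(required_n = ceiling(pw$n),   # per-group n needed for 80% power
       actual_n   = table(df$group), # observed group sizes
       test       = prop.test(tab))
}

# Synthetic stand-in: 1000 customers per email, with made-up click counts
data <- data.frame(
  group = rep(c("email_A", "email_B"), each = 1000),
  click = c(rep(c(1, 0), times = c(130, 870)),
            rep(c(1, 0), times = c(90, 910)))
)

out <- check_metric(data, "click")
print(out$required_n)
print(out$actual_n)
print(out$test)
```

The data is sufficient when each email group's actual_n is at least required_n, and the click-through rates differ significantly when the confidence interval in out$test excludes 0.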