Question
Now that you've learned about hypothesis testing and p-values, you should also be aware that these methods can be used incorrectly. Or, even worse, maliciously. Usually this involves manipulating the data or the test in such a way as to produce a desired result. There are many methods for this, and they've got some cool names like p-hacking and data dredging. In this problem, we will focus on the idea of using subsets of data to find a desired result.
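To see why hunting through subsets is a problem, here is a minimal simulation sketch in R. It assumes four independent groups with 40 days each, and a true purchase rate exactly equal to a 0.8 baseline, so any "significant" improvement is a false positive. The variable names and the choice of a one-sided, one-sample t-test are illustrative assumptions, not something the exercise prescribes.

```r
set.seed(1)        # fixed seed so the sketch is reproducible
n_sims   <- 2000   # number of simulated experiments
n_groups <- 4      # assumed number of subsets examined per experiment
n_days   <- 40     # assumed number of daily counts per group
p0       <- 0.8    # true rate equals the baseline: no real effect

any_hit <- replicate(n_sims, {
  # One-sided t-test per group; every group is drawn from the null
  p_vals <- replicate(n_groups, {
    rates <- rbinom(n_days, size = 50, prob = p0) / 50
    t.test(rates, mu = p0, alternative = "greater")$p.value
  })
  min(p_vals) < 0.05   # did at least one subset look "significant"?
})

# Share of experiments where some subset falsely "supports" an improvement
false_positive_rate <- mean(any_hit)
false_positive_rate
```

If the four tests were exactly independent and each had a 5% false positive rate, the chance of at least one hit would be 1 - 0.95^4, or roughly 0.19, far above the nominal 0.05.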
Nefarian just landed his first data science position as an intern at a new e-commerce company. His project was to design and test a new website layout that would lead to more purchases. To test his new layout, the company gathered four different groups of 50 customers and recorded how many customers in each group ended up purchasing an item. This test was then repeated on multiple days. The effectiveness of Nefarian's layout is measured by the number of customers that made a purchase. This data is stored in the data frame purchases.
Nefarian wants to land a permanent position at the company after his internship is over, so he really wants to impress his supervisors with his new layout. He knows that the site has an average purchase rate of 0.8 and wants to see if his layout is an improvement.
Part A) Use the entire dataset to determine whether Nefarian's layout is an improvement over the original layout. Use an appropriate hypothesis test and a significance level of α = 0.05. Store the p-value for this test in the variable p3.a.
Part C) Bummer. But Nefarian really wants his design to be an improvement, so what's a little bad science? What if he can find a subset of data that supports his claim? Thinking back, Nefarian remembers that Group C supposedly contained some very impulsive customers. Using the same hypothesis from Part A, determine if Nefarian's layout was a statistically significant improvement at the α = 0.05 significance level, if he only looks at samples from Group C. Save the p-value of this test as p3.c, rounded to three decimal places.
purchases <- structure(list(group = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L), .Label = c("a", "b", "c", "d"), class = "factor"), num_purchases = c(36L, 42L, 41L, 40L, 36L, 42L, 36L, 35L, 39L, 39L, 44L, 42L, 43L, 39L, 41L, 38L, 40L, 38L, 33L, 41L, 38L, 36L, 42L, 39L, 43L, 42L, 41L, 46L, 41L, 37L, 41L, 40L, 39L, 40L, 43L, 37L, 39L, 38L, 43L, 38L, 41L, 37L, 39L, 38L, 40L, 40L, 38L, 45L, 40L, 38L, 39L, 40L, 37L, 41L, 42L, 44L, 44L, 41L, 40L, 39L, 41L, 36L, 42L, 40L, 41L, 39L, 42L, 40L, 38L, 44L, 37L, 41L, 37L, 41L, 41L, 40L, 36L, 37L, 41L, 38L, 45L, 48L, 47L, 48L, 48L, 47L, 49L, 47L, 49L, 49L, 49L, 49L, 50L, 47L, 46L, 46L, 46L, 48L, 48L, 46L, 47L, 47L, 48L, 49L, 43L, 47L, 49L, 49L, 48L, 45L, 47L, 44L, 47L, 48L, 48L, 49L, 50L, 47L, 49L, 48L, 39L, 33L, 40L, 40L, 43L, 38L, 40L, 40L, 42L, 42L, 39L, 40L, 44L, 45L, 39L, 36L, 39L, 40L, 40L, 34L, 40L, 39L, 39L, 42L, 42L, 38L, 40L, 43L, 38L, 43L, 37L, 39L, 40L, 41L, 40L, 40L, 43L, 40L, 44L, 42L )), class = "data.frame", row.names = c(NA, -160L))
p3.a = NA
p3.c = NA
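One possible way to set up both tests is sketched below. The exercise does not name a test, so this assumes a one-sided, one-sample t-test on the daily purchase rates (num_purchases / 50) against the baseline of 0.8. The helper layout_p_value and its arguments are hypothetical names introduced for illustration, and the toy data frame contains made-up values that only mimic the columns of purchases.

```r
# Hypothetical helper (not part of the exercise): one-sided, one-sample
# t-test of the daily purchase rate against a baseline rate.
layout_p_value <- function(df, keep_group = NULL, baseline = 0.8, group_size = 50) {
  # Optionally restrict the data to the chosen group(s)
  if (!is.null(keep_group)) {
    df <- df[df$group %in% keep_group, , drop = FALSE]
  }
  rates <- df$num_purchases / group_size   # convert counts to purchase rates
  # H0: mean rate = baseline, H1: mean rate > baseline
  t.test(rates, mu = baseline, alternative = "greater")$p.value
}

# Toy data with the same columns as `purchases`; the values are made up
toy <- data.frame(
  group = factor(rep(c("a", "c"), each = 5)),
  num_purchases = c(39L, 41L, 40L, 38L, 42L, 47L, 48L, 46L, 49L, 47L)
)

p_all <- layout_p_value(toy)                           # all rows
p_c   <- round(layout_p_value(toy, keep_group = "c"), 3)  # group "c" only
```

Under these assumptions, the exercise's variables could be filled in with p3.a <- layout_p_value(purchases) and p3.c <- round(layout_p_value(purchases, keep_group = "c"), 3). A proportion test on the pooled counts would be another defensible choice, and it may give a different p-value.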