Question

Posted on Jul 15, 2024

For this assignment, you will be performing the type of business analysis typical of many corporate positions today. You will be analyzing two files containing customer data from the fictitious online retail company Wishbone.

The first file, orders.xlsx, lists all past customers and whether each customer ordered from the company in the last month: a value of 1 indicates that the customer placed an order in the last month, and a value of 0 indicates that they did not. The second file, approval.xlsx, shows each customer's rating of how likely they are to recommend the site to others, on a sliding scale from 0 (not likely at all) to 100 (certain to recommend the site); decimal values are possible. In each file, the first tab shows raw population data, while the other tabs show samples taken for data analysis. You may use Minitab, StatCrunch, or your statistical programming language of choice.

Part I Experimental Design

This section is about the design of experiments and statistical sampling. While you won't be required to simulate random sampling using technology, you should understand how it works and be able to design a theoretical experiment with random sampling.

1. Suppose you are tasked with analyzing customer loyalty for Wishbone, an online discount retail company. The company has over 10,000 customers, with over 400,000 orders placed on the site.
   a. Comment on the difficulties involved in collecting information and statistically analyzing the customers' data.
   b. Would a simple random sample be feasible in this situation? Why?

2. One idea suggested by the Wishbone representative is to simply add an optional customer satisfaction survey that customers can choose whether or not to fill out after placing an order.
   a. What sampling method would this be considered?
   b. Would this be an effective way to collect customer data? Explain any challenges that it may create.
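As background for question 1b, the following is a minimal sketch of how a simple random sample could be drawn in Python. The list of customer IDs, the function name, and the sample size are illustrative assumptions, not part of the lab files.

```python
import random


def simple_random_sample(customer_ids, n, seed=None):
    """Draw n distinct customers so that every subset of size n is equally likely."""
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    return rng.sample(customer_ids, n)


# Hypothetical sampling frame: customer IDs 1..10000 (illustrative only).
customer_ids = list(range(1, 10001))
sample = simple_random_sample(customer_ids, 100, seed=42)
print(len(sample), "customers selected without replacement")
```

The sketch assumes a complete list of customers (a sampling frame) is available; whether that is realistic for Wishbone is part of what question 1 asks you to discuss.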

STA 2023 - Kirby Santa Fe College Lab 2 Sampling & Distributions

3. Another suggestion for collecting data is to send an email to each customer who has made an order in the past, asking them to fill out a brief survey.
   a. Would this be an effective form of data collection? Why or why not?
   b. What type of bias would this method produce?

4. If you were in charge of collecting data on Wishbone's customers, what method would you use to produce a randomized sample of a reasonable size for statistical analysis?

Part II Data Processing & Analysis

This section is about processing raw population data and sample data, and taking a first look at finding the probability of producing an observed set of results based on estimated or computed population parameters.

5. The first file, orders.xlsx, contains population data for all 12,000 of Wishbone's previous customers in the first tab. The data shows a value of 1 if the customer placed an order in the last month, and a 0 if the customer did not place an order in the last month.
   a. What type of distribution would describe this data? Be sure to check all the requirements!
   b. Count the number of values of 1 and divide by the total number of values to find the population proportion, p, of customers who placed an order in the last month.
   c. Use the formula for the appropriate distribution to calculate the standard deviation of the population.
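The calculation described in questions 5b and 5c can be sketched in Python as follows. This assumes the 0/1 values are treated as Bernoulli outcomes, so the standard deviation is sqrt(p(1 - p)); the short list below is a hypothetical stand-in for the population tab of orders.xlsx, not real data from the file.

```python
import math


def bernoulli_summary(values):
    """Return (p, sigma) for a list of 0/1 values treated as Bernoulli outcomes."""
    n = len(values)
    p = sum(values) / n             # proportion of 1s
    sigma = math.sqrt(p * (1 - p))  # Bernoulli standard deviation
    return p, sigma


# Hypothetical toy population (illustrative only, not taken from orders.xlsx).
toy_population = [1, 0, 1, 1, 0, 1, 1, 1, 0, 1]
p, sigma = bernoulli_summary(toy_population)
print(f"p = {p:.3f}, sigma = {sigma:.3f}")
```

For 0/1 data this formula gives the same value as the usual population standard deviation (dividing by N), which is a useful check on your software output.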

Tabs 2-6 of the file, Samples 1-5, contain 5 samples, each with a size of n = 100, taken from the population. Sample 1 was selected using systematic sampling of every 120th customer; Sample 2 was taken using stratified sampling of new customers vs. previous customers; Sample 3 was taken using stratified sampling of high-value customers (>$20) vs. low-value customers (<$20); Sample 4 was taken using cluster sampling; and Sample 5 is a simple random sample.

6. For each sample, count the number of customers who placed an order in the past month and divide by the sample size of 100 to find p̂, the sample proportion.
   a. Sample 1:
   b. Sample 2:
   c. Sample 3:
   d. Sample 4:
   e. Sample 5:

7. Now, use the 5 sample proportions you found in the previous problem as your data; you may either enter them in a new dataset in Minitab/StatCrunch, or you may find it easier to simply enter the values in L1 on your calculator and compute 1-Var Stats.
   a. Calculate the mean value of the sample proportions.
   b. Calculate the population standard deviation, σ, for the sample proportions. How does this compare to the standard deviation of the original population?

8. Use the value of p computed in problem 5 as the population parameter p.
   a. What is the mean value of the sampling distribution, μ_p̂?
   b. Calculate the value of the standard error of the sampling distribution with a sample size of n = 100, that is, σ_p̂. How does this compare to the standard deviation of the different sample proportions you calculated in question 7b?

9. Now, examine Sample 1, which was collected using systematic sampling.
   a. Does this sample meet the requirements of the Central Limit Theorem for approximation by the normal distribution? List each requirement, and whether it is met by this sample.
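For questions 8 and 9a, the sampling distribution of a sample proportion can be sketched as below. The value p = 0.8 is a hypothetical placeholder rather than the answer to question 5, and the success/failure threshold of 10 is an assumption: some textbooks use 5, so check which rule your course uses.

```python
import math


def proportion_sampling_distribution(p, n):
    """Mean and standard error of the sample proportion for samples of size n."""
    mean = p                                # the mean of p-hat equals p
    std_error = math.sqrt(p * (1 - p) / n)  # sqrt(p(1 - p)/n)
    return mean, std_error


def normal_approx_ok(p, n, threshold=10):
    """Success/failure check: both n*p and n*(1 - p) must reach the threshold."""
    return n * p >= threshold and n * (1 - p) >= threshold


# Hypothetical population proportion (illustrative only).
mean, std_error = proportion_sampling_distribution(p=0.8, n=100)
print(f"mean = {mean:.2f}, standard error = {std_error:.4f}")
print("normal approximation reasonable:", normal_approx_ok(0.8, 100))
```

The success/failure check is only one of the requirements; the sample must also be random and the observations reasonably independent, which the code cannot verify.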

   b. Now, using the mean and standard error for the sampling distribution you found in question 8, calculate the probability of finding a value as far from the mean as the sample proportion from Sample 1. (Hint: if p̂ < μ_p̂, calculate P(sample proportion < p̂); if p̂ > μ_p̂, calculate P(sample proportion > p̂).)
   c. Does this result make sense, given the sample size and the population parameters you calculated?

10. The second file, approval.xlsx, contains customers' selective approval rating when ordering on Wishbone, rated using a sliding scale from 0-100. The first tab contains population data for all 12,000 previous customers.
   a. Calculate the mean approval rating, μ, for the entire population.
   b. Calculate the standard deviation, σ, for the population.

11. The second tab contains results for a sample of 100 customers, selected at random to complete a survey that is required for customers to complete their checkout.
   a. Calculate the mean rating, x̄, of the sample.
   b. Does this sample meet the requirements of the Central Limit Theorem?

12. Now, using the population mean and standard deviation found in question 10, compute the parameters of the sampling distribution.
   a. What is the mean of the sampling distribution, μ_x̄?
   b. Use the appropriate formula for the sampling distribution of a sample mean to compute the standard error of the sampling distribution, σ_x̄. How does this compare to the population standard deviation you found in question 10b?

13. Using the mean and standard deviation values from question 12 as your population parameters, calculate the probability of finding a more extreme value for the sample mean than the value from 11a, x̄. (Hint: if x̄ > μ_x̄, find P(sample mean > x̄); if x̄ < μ_x̄, find P(sample mean < x̄).)
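The tail probabilities in questions 9b and 13 can be sketched with Python's standard library, assuming the sampling distribution is approximately normal. All numbers below (population mean 75, standard deviation 20, observed sample mean 79) are hypothetical placeholders, not values from approval.xlsx, and the function names are illustrative.

```python
import math
from statistics import NormalDist


def standard_error_of_mean(sigma, n):
    """Standard error of the sample mean: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)


def tail_probability(observed, mean, std_error):
    """One-sided probability of a value at least as far from the mean as observed."""
    dist = NormalDist(mu=mean, sigma=std_error)
    if observed < mean:
        return dist.cdf(observed)   # lower tail
    return 1 - dist.cdf(observed)   # upper tail


# Hypothetical parameters (illustrative only).
se = standard_error_of_mean(sigma=20, n=100)  # 20 / 10 = 2
prob = tail_probability(observed=79, mean=75, std_error=se)
print(f"standard error = {se:.2f}, tail probability = {prob:.4f}")
```

The same `tail_probability` function applies to question 9b if you pass the sample proportion, p, and the standard error of the proportion in place of the mean-based values.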

Part III Analysis and Conclusion

For this section, you will use the statistics computed in Part II to draw conclusions about the data and sampling methods used.

14. For each data set, you computed the probability of observing a sample mean when selecting a random sample from a population with known parameters.
   a. The probability you computed in 9b represents the likelihood of selecting a sample with a similar proportion of repeat customers from the overall population. Assuming a probability of 5% or greater is reasonably likely, are you surprised by the observed results?
   b. The probability you computed in 13 represents the likelihood of selecting a sample with a similar average customer rating from the overall population. Assuming a probability of 5% or greater is reasonably likely, are you surprised by the observed results?

15. Suppose for the sample approval ratings in the approval.xlsx file, the sample was collected only from customers placing an order within the last month, while the population mean was computed using a customer history going back two years. What conclusions can you draw about customer approval ratings in the previous month compared to those in the past two years?

ORDERS AND APPROVALS
	61.4386 88.19554 58.57588 53.83894 64.35469 100 96.80894 70.64281 65.41205 70.13285 65.96469 60.77502 43.14173 52.74737 100 65.41668 84.7776 7.572217 95.2761 66.61246 75.43381 68.72326 65.43109 78.73039 77.09972 83.76358 100 67.37557 100 58.05079 98.44043 47.45421 80.54154 88.7042 70.94033 100 60.216 77.56998 100 89.34736 94.93114 85.03039 31.26567 82.11502 82.27139 76.07233 97.13504 83.58561 81.15017 44.41449 95.33526 87.38172 78.28694 58.95392 94.93441 60.28431 80.65288 59.41638 51.22659 31.26894 90.90619 81.92644 97.80585 81.07529 67.09466 98.63468 89.13236 91.73533 88.85065 76.2537 88.95016 57.97279 100 54.36291 93.62517 91.06965 84.77233 65.99947 48.02788 100 53.16543 97.34672 65.19509 90.02607 100 77.85674 20.37922 73.59013 77.92719 100 87.00306 92.21017 73.49185 84.56567 62.77092 79.99134 69.25987 70.82993 91.3822 74.67498
	ORDERS 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 0 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 0 0 0 1 1 0 1 1 0 1 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1 1 0 0 1 1 1 0 1 0 1 1 1 1 1 1 1 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 0 1 1 1 1 1 0 1 0 1 1 1 1 0 1 1 1 0 1 1 1 1 1 1 0 1 0 0 1 0 1 1 1 0 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 1 1 1 1 0 1 0 1 1 1 0 1 0 0 1 0 1 0 1 1 1 0 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1 1