Question
OPRE 3360: Managerial Methods in Decision Making Under Uncertainty
Graded Group Assignment 2: Hypothesis Testing and Statistical Inference

This group assignment is open book and notes. However, once you start the problem set, you must do your own work within your study group and not give or accept help from anybody other than your group. The problem set covers the topics we learned in Module II (Classes 6-8) and will take a couple of hours to complete. Please spend as long as you require to do a good job, but use that estimated time as a measure of how well you might do on similar questions on the timed (and enforced) Test II. I expect a student who is well prepared will correctly solve about 80% to 90% of the problems.

[Part A: Module II Problem Set]

[Metacritic and Iron Man 3]
Metacritic is a website that aggregates reviews of music, games, and movies. For each product, a numerical score is obtained from each review, and the website posts the average score as well as the individual reviews. The website is somewhat similar to Rotten Tomatoes, but Metacritic uses a different scoring method that converts each review into a score on a 100-point scale. In addition to using the reviewer's quantitative ratings (stars, 10-point scale), Metacritic manually assesses the tone of the review before scoring. Historical data shows that these converted scores are normally distributed.

One of the movies that Metacritic rated was Iron Man 3. The first review came out on April 22 (Robbie Collin from the Telegraph), and other reviews started to trickle in before the movie was released on May 3, 2013. The following data contains a sample of the media outlets, reviewers, scores, and review dates for all reviews that Metacritic collected until May 1, 2013. Use this data set to answer questions 1 to 5.

1. [1 pt] What are the sample mean and sample standard deviation in this sample?
(a) sample mean = 69.17, sample standard deviation = 14.86
(b) sample mean = 69.17, sample standard deviation = 3.034
(c) sample mean = 59.71, sample standard deviation = 14.86
(d) sample mean = 69.17, sample standard deviation = 3.034

2. [1 pt] A 95% confidence interval for the true average score (μ) of Iron Man 3 is:
(a) [38.42, 99.91]
(b) [43.71, 94.63]
(c) [63.97, 74.36]
(d) [62.89, 75.44]

3. [1 pt] A 90% confidence interval for the true average score (μ) of Iron Man 3 is:
(a) [38.42, 99.91]
(b) [43.71, 94.63]
(c) [63.97, 74.36]
(d) [62.89, 75.44]

4. [1 pt] Between the 95% and 90% intervals, which one is wider and why?
(a) The 90% interval because it is a less accurate interval.
(b) The 95% interval because it is a more useful interval.
(c) The 90% interval since, in order to decrease the confidence from 95% to 90%, the margin of error should increase so that the intervals constructed using sampling will capture the true mean more often.
(d) The 95% interval since, in order to increase the confidence from 90% to 95%, the margin of error should increase so that the intervals constructed using sampling will capture the true mean more often.

5. [1 pt] Let p denote the population proportion of reviewers who rate Iron Man 3 with a score of 80 or higher. A 90% confidence interval for p is:
(a) [0.291, 0.626]
(b) [0.374, 0.709]
(c) [0.259, 0.658]
(d) A confidence interval using the Central Limit Theorem should not be used because the sample size to build a confidence interval for the proportion is too small.

[Google's free Internet project]
According to the Pew Internet & American Life Project, many American adults spend a significant amount of time on the Internet every day (Pew Internet Project, 2008). Google has a project that provides free wireless Internet in a medium-sized city. Among several factors that affect Google's decision, one factor is the percentage of adult smartphone users who are actively using social network apps.
Google picked a random sample of 400 residents in one of the candidate cities, Rich-Addison, and collected the following data.

6. [2 pts] Construct a 95% confidence interval for the true proportion of Rich-Addison residents who are actively using Facebook.
95% confidence interval:

7. [2 pts] If everything else remains the same, what is the margin of error in a 99% confidence interval?
Margin of error:

8. [3 pts] Google wants to know what percentage of users is actively using all three media: Facebook, Twitter, and an online chat/video tool (Skype, Facetime, etc.). What is the 95% confidence interval for the true proportion of smartphone users who are actively using all three?
95% confidence interval:

9. [6 pts] Google wants to know if there is strong evidence that more than 40% of smartphone users actively use Twitter. Use the sample data to perform the hypothesis test at the 5% significance level. For full credit, you need to specify the null and alternative hypotheses, the appropriate test statistic value, the p-value, and the conclusion.
Parameter to be tested:
Null hypothesis:
Alternative hypothesis:
Test statistic:
p-value:
Conclusion:

[Should we build a wind farm?]
Wind power has been the most popular alternative energy choice in recent years. One of the key determinants for choosing a location for a wind farm is whether the site has enough wind. To produce enough energy using current technology, the site should have an annual average wind speed exceeding 8 miles per hour, according to the Wind Energy Association. One candidate site in southern California was monitored for a year, with wind speeds recorded every 6 hours. A total sample of 1114 readings of wind speed averaged 8.239 mph with a sample standard deviation of 3.813 mph. The histogram of these 1114 readings is shown below. Your team was asked to perform statistical analysis to help the developer decide whether to place a wind turbine at this site.

Figure 1: Histogram of Wind Speed at a candidate location

10. [2 pts] Describe the parameter you will be testing and state the appropriate null and alternative hypotheses for the purpose of your analysis.
Parameter:
Null hypothesis:
Alternative hypothesis:

11. [4 pts] Determine the value of the appropriate test statistic and the p-value of the test.
Test statistic:
p-value:

12. [4 pts] What are the appropriate conclusion and interpretation based on the outcome of the hypothesis test at the 5% significance level? Should you recommend building a wind farm at this site?
Conclusion of the test:
Interpretation:

13. [4 pts] The board feels that the 5% significance level might be too lenient and wishes to invest only if the claim is supported by stronger evidence. Your manager asked you to change the significance level to either or . Which one would you choose to make the null hypothesis more difficult to reject? With the new significance level, would you recommend building a wind farm at this site?
New significance level:
Your recommendation:

[Part B: Excel Prework for Class 9-10]
The assignment consists of completing a few short data analysis tasks in Excel: drawing a scatter plot, building a correlation table, and running a regression. All of them will be essential steps for data analysis in Module III (regression analysis). Completing all of the tasks will take about 15 minutes. Please follow the instructions given in each task, and submit your work (use the answer sheet Word file, located in eLearning). If you do not see the Excel tools you need (such as Correlation or Regression), install them by clicking the top left Microsoft button in Excel, going into Excel Options, Add-Ins, Manage Excel Add-Ins, Go, and checking the box for Analysis ToolPak. Follow the step-by-step instructions below.

Task 1 (1 pt). Draw scatter plots.
A scatter plot is used to visually inspect the relationship between two variables. Using the Big D house data, draw a scatter plot with the Big D house price ($K) on the y-axis and the sq. ft. of the house on the x-axis so that you can determine whether the two variables are related and, if so, how.

Step 1. Open Big D House Excel Data. Select the columns of data containing "Price ($K)" and "Sq. Ft." Select Insert/Scatter with only markers to draw a scatter plot.
Step 2. Check which variable is on the x-axis and which variable is on the y-axis. You may find that the house price is displayed on the x-axis and the sq. ft. variable is on the y-axis. In general, a dependent variable (the variable we want to explain) should be on the y-axis and an independent (explanatory) variable should be on the x-axis. That is, we want prices on the y-axis and area on the x-axis. If your plot is already in this format, skip to Step 4; otherwise, follow Step 3.
Step 3. Swap the two variables so that the price is on the y-axis and the sq. ft. is on the x-axis. There are two ways to do this.
Step 4. (What to submit) Copy (CTRL+C) and paste (CTRL+V) your final scatter plot into the answer sheet Word file. Write one or two sentences on the relationship you conjecture between house prices and their areas in square feet based on the scatter plot.

Task 2 (1 pt). Find the correlation between price and sq. ft.
A correlation table is used to determine the strength of the linear relationship between any two variables. Build a correlation table (matrix) between price and sq. ft. using Excel's correlation tool.

Step 1. Open Big D House Excel Data. Select Data Tab/Data Analysis/Correlation. Select the "Price" and "Sq. Ft." columns (n = 21).
1. Select the range of the data.
2. Check "Grouped by Column".
3. Check "Label in first row".
4. Select where you want Excel to present the output.
Step 2. (What to submit) Select the correlation table.
Then copy (CTRL+C) and paste (CTRL+V) your correlation table into the answer sheet Word file.

Task 3 (2 pts). Run a regression model using Excel's Regression.
Regression analysis explains the value of one variable (called a dependent variable) on the basis of the value(s) of other variable(s) (called independent variables). We can use the regression output for prediction and for understanding the nature of the relationship. We will learn in class exactly how the tool explains the relationship. The point of this exercise is to walk you through one round of the mechanics of using the tool, so that you are better prepared to keep up with the pace of the class. Create a regression output that explains the house price in Big D (the dependent variable) using the square footage of each house as the independent variable.

Step 1. Open the Excel data file. Select Data/Data Analysis/Regression.
Step 2. Select Y range (including label "Price"): dependent variable. Select X range (including label "Sq. Ft"): independent variable. We want to explain the house price (Y) given sq. ft. (X).
Step 3. Check the Labels box (since the data has labels).
Step 4. Select Output range (either on the same sheet or a different worksheet).
Step 5. Play with output options (check residuals and residual plots; more on this later).

Excel will create the following regression output.

SUMMARY OUTPUT

Regression Statistics
Multiple R           0.949
R Square             0.901
Adjusted R Square    0.896
Standard Error       50.749
Observations         21

ANOVA
              df    SS            MS            F         Significance F
Regression    1     444983.240    444983.240    172.781   0.000
Residual      19    48933.010     2575.422
Total         20    493916.250

              Coefficients   Standard Error   t Stat    P-value   Lower 95%   Upper 95%
Intercept     -16.063        34.369           -0.467    0.646     -87.997     55.872
Sq. Ft        0.180          0.014            13.145    0.000     0.151       0.208

Step 6. (What to submit) Copy and paste the regression result into the Prework Assignment Answer Sheet.
Step 7. Read the Class 9 synopsis to preview how to interpret and analyze the regression output.
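
The summary statistics in the regression output above are linked by standard identities, so they can be cross-checked from the ANOVA sums of squares alone. The short Python sketch below is only an illustration of that arithmetic, not part of the Excel workflow; the variable names are made up for this example, and the numbers are copied from the output as printed (so results match only up to rounding).

```python
import math

# Values copied from the ANOVA table in the regression output.
ss_regression = 444983.240
ss_residual = 48933.010
ss_total = 493916.250
n_obs = 21          # "Observations"
n_predictors = 1    # only Sq. Ft is used as an independent variable

df_regression = n_predictors
df_residual = n_obs - n_predictors - 1   # 19, as in the "df" column

# R Square is the share of total variation explained by the regression.
r_squared = ss_regression / ss_total
# Multiple R is the non-negative square root of R Square.
multiple_r = math.sqrt(r_squared)
# Adjusted R Square penalizes for the number of predictors.
adjusted_r_squared = 1 - (1 - r_squared) * (n_obs - 1) / df_residual

# Mean squares are sums of squares divided by their degrees of freedom.
ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual
# "Standard Error" in Regression Statistics is the square root of MS Residual.
standard_error = math.sqrt(ms_residual)
# F is the ratio of the two mean squares.
f_statistic = ms_regression / ms_residual

# t Stat is the coefficient divided by its standard error (intercept row).
intercept, intercept_se = -16.063, 34.369
t_intercept = intercept / intercept_se

print(f"R Square          {r_squared:.3f}")
print(f"Multiple R        {multiple_r:.3f}")
print(f"Adjusted R Square {adjusted_r_squared:.3f}")
print(f"Standard Error    {standard_error:.3f}")
print(f"F                 {f_statistic:.3f}")
print(f"t Stat, Intercept {t_intercept:.3f}")
```

The Sq. Ft row is left out of the t Stat check on purpose: its coefficient and standard error are shown rounded to three decimals (0.180 and 0.014), so dividing them reproduces the printed 13.145 only roughly.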
Step by Step Solution
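
For the wind-farm questions (10 to 12), the test statistic and p-value can be sketched in a few lines of Python. This is a hedged illustration rather than the official solution: it assumes a one-sided test of the null hypothesis that the mean wind speed is at most 8 mph against the alternative that it exceeds 8 mph, and it uses the normal approximation, which is reasonable for a sample of 1114 readings. The function and variable names are chosen for this example only.

```python
import math

def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Summary statistics given in the problem statement.
n = 1114
sample_mean = 8.239      # mph
sample_sd = 3.813        # mph
threshold = 8.0          # mph, required annual average wind speed

# Assumed hypotheses: H0: mu <= 8  versus  Ha: mu > 8 (upper-tailed test).
standard_error = sample_sd / math.sqrt(n)
z_stat = (sample_mean - threshold) / standard_error

# Upper-tail p-value under the normal approximation.
p_value = 1.0 - normal_cdf(z_stat)

def reject_null(p: float, alpha: float) -> bool:
    """Reject H0 when the p-value falls below the significance level."""
    return p < alpha

print(f"standard error = {standard_error:.4f}")
print(f"z statistic    = {z_stat:.3f}")
print(f"p-value        = {p_value:.4f}")
print(f"reject at 5%?    {reject_null(p_value, 0.05)}")
```

Under these assumptions the statistic comes out at roughly 2.09 with a p-value of roughly 0.018, so the null hypothesis is rejected at the 5% level. Calling `reject_null` with a smaller significance level shows how a stricter threshold makes rejection harder, which is the point of question 13.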