Overview
HW11 Chi Square ANOVA Regression Cleaning Data with Outlier
Sheet 1: HW11
Sheet 2: Chi Square
An analyst at a local bank wonders whether the age distribution of customers coming for service at his branch in town is the same as at a branch located near the mall. He selects 100 transactions at random from each branch and researches the age information for the associated customer. These are the data:
Age
less than 30 | 30-55 | 56 or older | Total
In town | 25 | 37 | 38 | 100
mall | 30 | 48 | 22 | 100
Total | 55 | 85 | 60 | 200
1 | What is the null hypothesis if you want to check if the age patterns of customers are independent of bank location?
2 | What are the expected numbers for each cell in a 2 by 3 table if the null hypothesis is true?
Age
less than 30 | 30-55 | 56 or older | Total
In town | 0
mall | 0
Total | 0 | 0 | 0 | 0
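For question 2, the expected count in each cell under independence is (row total × column total) / grand total. The following is a minimal sketch of that arithmetic in Python, using the observed counts from the table above; the variable names are illustrative and not part of the assignment, which expects the same calculation in Excel.

```python
# Observed counts from the table above (columns: less than 30, 30-55, 56 or older).
observed = {
    "In town": [25, 37, 38],
    "mall": [30, 48, 22],
}

row_totals = {branch: sum(counts) for branch, counts in observed.items()}
col_totals = [sum(col) for col in zip(*observed.values())]
grand_total = sum(row_totals.values())

# Expected count under independence: row total * column total / grand total.
expected = {
    branch: [row_totals[branch] * col / grand_total for col in col_totals]
    for branch in observed
}

for branch, counts in expected.items():
    print(branch, counts)
```

Because both branches contribute 100 transactions, the two expected rows come out identical.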
3 | Use the chi-square test to decide whether to reject the null hypothesis. What is the chi-square test statistic?
4 | What is the chi-square critical value, and how many degrees of freedom does it have? Assume alpha is .05.
5 | What do you conclude?
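Questions 3 to 5 can be checked with a short calculation. The sketch below sums (observed − expected)² / expected over all six cells and compares the result with the critical value. It relies on one assumption worth flagging: the closed form −2·ln(alpha) for the critical value holds only when there are exactly 2 degrees of freedom, as in this 2 by 3 table. In Excel, CHISQ.INV.RT(0.05, 2) should return the same critical value.

```python
import math

# Observed counts (rows: in town, mall; columns: age groups).
observed = [[25, 37, 38], [30, 48, 22]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Chi-square statistic: sum of (observed - expected)^2 / expected over all cells.
chi_square = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        exp = row_totals[i] * col_totals[j] / n
        chi_square += (obs - exp) ** 2 / exp

# Degrees of freedom: (rows - 1) * (columns - 1).
df = (len(observed) - 1) * (len(observed[0]) - 1)

alpha = 0.05
# Valid for df == 2 only: the upper-tail probability is exp(-x / 2),
# so the critical value is -2 * ln(alpha).
critical_value = -2 * math.log(alpha)

reject_null = chi_square > critical_value
print(f"chi-square = {chi_square:.4f}, df = {df}, critical = {critical_value:.4f}")
print("Reject H0" if reject_null else "Fail to reject H0")
```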
Sheet 3: ANOVA
Saeko owns a yarn shop and wants to expand her color selection.
Before she expands her colors, she wants to find out if her customers prefer one brand
over another brand. Specifically, she is interested in three different types of bison yarn.
As an experiment, she randomly selected 21 different days and recorded the sales of each brand.
At the .10 significance level, can she conclude that there is a difference in preference between the brands?
Misa's Bison | Yak-et-ty-Yaks | Buffalo Yarns
799 | 776 | 799
784 | 640 | 931
873 | 822 | 794
702 | 812 | 920
795 | 673 | 731
875 | 893 | 837
Total | 4,828.00 | 4,616.00 | 5,012.00
6) | What is the null hypothesis?
What is the alternative hypothesis?
What is the level of significance?
7) | Use Tools - Data Analysis - ANOVA: Single Factor
to find the F statistic:
8) | From the ANOVA output: What is the F value?
What is the F critical value?
9) | What is your decision?
Explain in statistical terms.
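The F value that Excel's ANOVA: Single Factor tool reports can be reproduced by hand as a cross-check. This is a minimal sketch, assuming the six rows listed per brand are the full data set; it computes the between-group and within-group sums of squares and their ratio of mean squares. The critical value is not computed here because the Python standard library has no F distribution; it can be read from the Excel output or obtained with F.INV.RT(0.10, 2, 15).

```python
# Sales per brand, copied from the table above.
sales = {
    "Misa's Bison": [799, 784, 873, 702, 795, 875],
    "Yak-et-ty-Yaks": [776, 640, 822, 812, 673, 893],
    "Buffalo Yarns": [799, 931, 794, 920, 731, 837],
}

groups = list(sales.values())
k = len(groups)                          # number of brands
n_total = sum(len(g) for g in groups)    # total observations
grand_mean = sum(sum(g) for g in groups) / n_total

# Between-group sum of squares: group size * (group mean - grand mean)^2.
ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)

# Within-group sum of squares: squared deviations from each group's own mean.
ss_within = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)

df_between = k - 1
df_within = n_total - k

f_stat = (ss_between / df_between) / (ss_within / df_within)
print(f"F = {f_stat:.4f} with ({df_between}, {df_within}) degrees of freedom")
```

If the F value is smaller than the critical value, the decision is to fail to reject the null hypothesis of equal mean sales.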
Sheet 4: Regression
Studies have shown that the frequency with which shoppers browse Internet retailers is related to the frequency with which they actually purchase products and/or services online. The following data show each respondent's age and answer to the question "How many minutes do you browse online retailers per year?"
Note that this sheet includes questions 10-16.
Age (X) | Time (Y)
16 | 420
17 | 269
19 | 315
22 | 337
22 | 243
22 | 459
22 | 414
28 | 224
28 | 381
28 | 412
28 | 576
30 | 333
33 | 551
34 | 548
35 | 626
35 | 521
35 | 562
36 | 699
39 | 643
39 | 455
40 | 666
42 | 553
43 | 459
44 | 525
48 | 559
50 | 507
50 | 612
51 | 710
52 | 378
54 | 566
58 | 652
59 | 725
60 | 695
10) | Use Data > Data Analysis > Correlation to compute the correlation, checking the Labels checkbox.
11) | Use the Excel function =CORREL to compute the correlation. If the answers for #10 and #11 do not agree, there is an error.
The strength of the correlation motivates further examination.
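As a third check alongside the two Excel methods, the Pearson correlation can be computed directly from its definition. The sketch below is an illustration only; the function name `pearson_r` is hypothetical, and the data are copied from the Age (X) and Time (Y) columns above.

```python
import math

ages = [16, 17, 19, 22, 22, 22, 22, 28, 28, 28, 28, 30, 33, 34, 35, 35, 35,
        36, 39, 39, 40, 42, 43, 44, 48, 50, 50, 51, 52, 54, 58, 59, 60]
minutes = [420, 269, 315, 337, 243, 459, 414, 224, 381, 412, 576, 333, 551,
           548, 626, 521, 562, 699, 643, 455, 666, 553, 459, 525, 559, 507,
           612, 710, 378, 566, 652, 725, 695]


def pearson_r(xs, ys):
    """Pearson correlation: S_xy / sqrt(S_xx * S_yy)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)


r = pearson_r(ages, minutes)
print(f"r = {r:.4f}")
```

The printed value should agree with both the Data Analysis output and =CORREL.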
12) | a) Insert a Scatter (X, Y) plot linked to the data on this sheet with Age on the horizontal (X) axis.
b) Add to your chart: the chart name, vertical axis label, and horizontal axis label.
c) Complete the chart by adding Trendline and checking boxes
Read directly from the chart:
13) | a) Intercept =
b) Slope =
c) R² =
Perform Data > Data Analysis > Regression.
14) | Highlight the Y-intercept in yellow. Highlight the X variable in blue. Highlight the R Square in orange.
15) | Use Excel to predict the number of minutes spent by a 22-year-old shopper. Enter = followed by the regression formula.
Enter the intercept and slope into the formula by clicking on the cells in the regression output with the results.
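The prediction in question 15 is the fitted line evaluated at Age = 22, that is, intercept + slope × 22. A minimal sketch of the least-squares fit behind the Excel regression output follows, assuming a simple linear model of Time on Age; the variable names are illustrative.

```python
ages = [16, 17, 19, 22, 22, 22, 22, 28, 28, 28, 28, 30, 33, 34, 35, 35, 35,
        36, 39, 39, 40, 42, 43, 44, 48, 50, 50, 51, 52, 54, 58, 59, 60]
minutes = [420, 269, 315, 337, 243, 459, 414, 224, 381, 412, 576, 333, 551,
           548, 626, 521, 562, 699, 643, 455, 666, 553, 459, 525, 559, 507,
           612, 710, 378, 566, 652, 725, 695]

n = len(ages)
mean_x = sum(ages) / n
mean_y = sum(minutes) / n

# Least-squares estimates: slope = S_xy / S_xx, intercept = mean_y - slope * mean_x.
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(ages, minutes))
s_xx = sum((x - mean_x) ** 2 for x in ages)
slope = s_xy / s_xx
intercept = mean_y - slope * mean_x

# Predicted browsing time for a 22-year-old shopper.
predicted_22 = intercept + slope * 22
print(f"Time = {intercept:.2f} + {slope:.2f} * Age")
print(f"Predicted minutes at age 22: {predicted_22:.1f}")
```

The intercept and slope printed here should match the Coefficients column of the regression output and the trendline equation on the chart.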
16) | Is it appropriate to use this data to predict the amount of time that an 85-year-old will spend browsing online retailers?
If yes, what is the predicted amount of time? If no, why not?
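Question 16 turns on whether 85 lies inside the range of ages the model was fitted on; predicting outside that range is extrapolation. The check can be sketched as follows, where `within_observed_range` is a hypothetical helper name used only for illustration.

```python
ages = [16, 17, 19, 22, 22, 22, 22, 28, 28, 28, 28, 30, 33, 34, 35, 35, 35,
        36, 39, 39, 40, 42, 43, 44, 48, 50, 50, 51, 52, 54, 58, 59, 60]


def within_observed_range(x, observed):
    """True if x lies between the smallest and largest observed values."""
    return min(observed) <= x <= max(observed)


for age in (22, 85):
    status = "interpolation" if within_observed_range(age, ages) else "extrapolation"
    print(f"Age {age}: {status} (observed ages run from {min(ages)} to {max(ages)})")
```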
Sheet 5: Cleaning Data with Outlier
17) | On this worksheet, make an XY scatter plot linked to the following data:
X | Y
1.01 | 2.8482
1.48 | 4.2772
1.8 | 4.788
1.81 | 5.3757
1.07 | 2.5252
1.53 | 3.0906
1.46 | 4.3362
1.38 | 3.2016
1.77 | 4.3542
1.88 | 4.8692
1.32 | 3.8676
1.75 | 3.9375
1.94 | 5.7424
1.19 | 2.4752
1.31 | 26.2
1.56 | 4.5708
1.16 | 2.842
1.22 | 2.44
1.72 | 5.1256
1.45 | 4.3355
1.43 | 4.2471
1.19 | 3.5343
2 | 5.46
1.6 | 3.84
1.58 | 3.8552
18) | Add a trendline, the regression equation, and R² to the plot.
Add this title: "Scatterplot of X and Y Data"
19) | The scatterplot reveals a point outside the point pattern. Copy the data to a new location in the worksheet. You now have 2 sets of data.
Data that are more than 1.5 IQR below Q1 or more than 1.5 IQR above Q3 are considered outliers and must be investigated.
It was determined that the outlying point resulted from a data entry error. Remove the outlier in the copy of the data.
Make a new scatterplot linked to the cleaned data without the outlier, and add a title ("Scatterplot without Outlier"), trendline, and regression equation label.
X | Y
1.01 | 2.8482
1.48 | 4.2772
1.8 | 4.788
1.81 | 5.3757
1.07 | 2.5252
1.53 | 3.0906
1.46 | 4.3362
1.38 | 3.2016
1.77 | 4.3542
1.88 | 4.8692
1.32 | 3.8676
1.75 | 3.9375
1.94 | 5.7424
1.19 | 2.4752
1.56 | 4.5708
1.16 | 2.842
1.22 | 2.44
1.72 | 5.1256
1.45 | 4.3355
1.43 | 4.2471
1.19 | 3.5343
2 | 5.46
1.6 | 3.84
1.58 | 3.8552
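The 1.5 IQR rule stated in question 19 can be applied to the original 25 Y values to confirm which point is flagged. This is a sketch under one assumption: quartiles are computed with the inclusive method, which is intended to match Excel's QUARTILE.INC; other quartile conventions give slightly different fences.

```python
import statistics

# Y values from the original (uncleaned) data set.
y_values = [2.8482, 4.2772, 4.788, 5.3757, 2.5252, 3.0906, 4.3362, 3.2016,
            4.3542, 4.8692, 3.8676, 3.9375, 5.7424, 2.4752, 26.2, 4.5708,
            2.842, 2.44, 5.1256, 4.3355, 4.2471, 3.5343, 5.46, 3.84, 3.8552]

# quantiles(..., n=4) returns the three cut points Q1, median, Q3.
q1, _median, q3 = statistics.quantiles(y_values, n=4, method="inclusive")
iqr = q3 - q1

lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

outliers = [y for y in y_values if y < lower_fence or y > upper_fence]
print(f"Q1 = {q1}, Q3 = {q3}, IQR = {iqr:.4f}")
print(f"Fences: [{lower_fence:.4f}, {upper_fence:.4f}]")
print("Outliers:", outliers)
```

Only the value 26.2 falls outside the fences, which is the row missing from the cleaned copy.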
20) | Compare the regression equations of the two plots. How did removal of the outlier affect the slope and R²? Explain why the slope and R Square changed the way they did.
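The comparison can be made concrete by fitting the same least-squares line to both data sets. The sketch below uses a hypothetical helper, `fit_line`, that returns the slope, intercept, and R² for a list of (X, Y) pairs; it illustrates the calculation and does not replace the chart trendlines.

```python
data = [(1.01, 2.8482), (1.48, 4.2772), (1.8, 4.788), (1.81, 5.3757),
        (1.07, 2.5252), (1.53, 3.0906), (1.46, 4.3362), (1.38, 3.2016),
        (1.77, 4.3542), (1.88, 4.8692), (1.32, 3.8676), (1.75, 3.9375),
        (1.94, 5.7424), (1.19, 2.4752), (1.31, 26.2), (1.56, 4.5708),
        (1.16, 2.842), (1.22, 2.44), (1.72, 5.1256), (1.45, 4.3355),
        (1.43, 4.2471), (1.19, 3.5343), (2, 5.46), (1.6, 3.84), (1.58, 3.8552)]

# Cleaned copy: the data-entry error at (1.31, 26.2) is removed.
cleaned = [point for point in data if point != (1.31, 26.2)]


def fit_line(points):
    """Least-squares fit; returns (slope, intercept, r_squared)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    s_xx = sum((x - mean_x) ** 2 for x, _ in points)
    s_yy = sum((y - mean_y) ** 2 for _, y in points)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    r_squared = s_xy ** 2 / (s_xx * s_yy)
    return slope, intercept, r_squared


slope_all, intercept_all, r2_all = fit_line(data)
slope_clean, intercept_clean, r2_clean = fit_line(cleaned)

print(f"With outlier:    y = {slope_all:.3f}x + {intercept_all:.3f}, R² = {r2_all:.4f}")
print(f"Without outlier: y = {slope_clean:.3f}x + {intercept_clean:.3f}, R² = {r2_clean:.4f}")
```

The outlier sits at an X value below the mean of X but has a very large Y, so it drags the fitted line up on the left and flattens the slope. It also adds a large amount of variation in Y that the line cannot explain, which pushes R² toward zero. Removing it restores a steeper slope and a much higher R².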