Question
Questions for Thought

1. A product line is sold in 15 different configurations of packaging. How does the large number of package types influence the value of the chi-squared statistic?
2. Why are chi-squared statistics not directly comparable between tables of different dimensions when the null hypothesis of independence holds?
3. Could Cramer's V (Chapter 5) have been used rather than p-values to standardize the results? Give an advantage and a disadvantage of p-values compared to Cramer's V statistics.
4. Of the 650 products, 69 come in five types of packaging. If packaging type and location are independent, what should be the average value of these 69 chi-squared statistics?
5. Suppose managers evaluate the association between package type and location for 50 products for which these are independent attributes. The data in each table are independent of the data in other tables.
   a. How many of these 50 p-values would be expected to be less than 0.05?
   b. What is the probability that at least one p-value would be less than 0.01?
   c. If the smallest p-value is less than 0.01, should we conclude that package type and location for this product are associated?
6. The data used in the chi-squared analysis have 200 cases for each location. Is it necessary to have the same number of observations from each location for every product?
7. The histogram of p-values (Figure 4) shows that 84 products have p-value less than 0.025. Does this imply that if we were to examine all of the transactions for these products that we would find Location and Package Type associated for all 84 of them?
8. Explain how the analysis of packaging types could be used to manage the mix of colors or sizes of apparel in clothing stores that operate in different parts of the United States.

About the Data

The packaging data are based on an analysis developed by several students participating in Wharton's Executive MBA program.

Data Mining Using Chi-Squared (textbook excerpt, page 495)

If the null hypothesis of independence held for every product, then we'd expect to find a uniform distribution in this histogram. For example, there's a 5% chance of a p-value being less than 0.05 even though H0 holds. The histogram in Figure 4 looks rather flat, except for the prominent bar at the left side.

The tall bar at the left of Figure 4 identifies products with the most statistically significant values of χ². Each of the intervals that define the histogram in Figure 4 has length 0.025, so we'd expect about 0.025 × 650 ≈ 16 in each. The interval from 0 to 0.025 has 84. These products are the most sensitive to location. Managers responsible for inventory control would be well served to start with them. For instance, a manager could begin with the product with the smallest p-value and work in order.

Related Methods

The analysis shown here has two attributes that appear in many other situations. For example, an analysis that

1. Calculates a lot of test statistics
2. Studies those with the most statistically significant outcomes

is a common paradigm in modern quantitative biology. Scientists use this approach to identify genes that may be responsible for an illness. Genetic tests are performed on two samples of individuals, one that is healthy (the control group) and the other that has an illness that is suspected of having a genetic connection (such as various types of cancers). For each subject, a genetic analysis determines the presence or absence of 10,000 or more genes. That's a data table with 10,000 columns for each subject. To reduce the volume of data, scientists use a basic test that compares two groups, like the two-sample t-test, to contrast the amount of each gene present in the control group to the amount present in the studied group. Next, each test is converted into a p-value as we did with χ². The most significant tests indicate genes for further, more extensive study.

Case Summary

The chi-squared test of the null hypothesis of independence compares observed frequencies in a contingency table to those expected if the underlying random variables are independent. As a special case, we can use this test to compare the equality of several proportions, extending the two-sample comparison in Chapter 18. When used to search for patterns in data mining, we need to assign p-values to the test statistic to adjust for the effects of sample sizes and the size of the contingency table.

Key Terms: chi-squared test, data mining
Step by Step Solution
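The arithmetic behind questions 4, 5a, and 5b, and behind the textbook's "about 16 per interval" remark, can be sketched in a few lines of Python. This is an illustrative sketch, not the textbook's or the site's solution. It assumes that p-values are uniformly distributed on [0, 1] when independence holds, that the tables are independent of one another, and that the chi-squared statistic has (rows − 1) × (columns − 1) degrees of freedom, which is also its mean under the null hypothesis. The function names are hypothetical, and the number of locations is not stated in the question, so the value used for it below is a placeholder only.

```python
# Sketch under stated assumptions: uniform p-values under H0, independent tables.

def expected_count_below(n_tests, threshold):
    """Expected number of p-values below `threshold` among `n_tests` true nulls."""
    return n_tests * threshold


def prob_at_least_one_below(n_tests, threshold):
    """P(at least one of `n_tests` independent null p-values is below `threshold`)."""
    return 1.0 - (1.0 - threshold) ** n_tests


def expected_chi_squared(n_rows, n_cols):
    """Mean of the chi-squared statistic under independence: its degrees of freedom."""
    return (n_rows - 1) * (n_cols - 1)


if __name__ == "__main__":
    # Question 5a: 50 independent tables, threshold 0.05.
    print("5a expected count:", expected_count_below(50, 0.05))

    # Question 5b: chance that the smallest of 50 null p-values is below 0.01.
    print("5b probability: %.3f" % prob_at_least_one_below(50, 0.01))

    # Textbook check: 650 products, histogram intervals of width 0.025.
    print("expected per interval:", expected_count_below(650, 0.025))

    # Question 4: five package types; the number of locations is NOT given in
    # the question, so n_locations is a hypothetical placeholder.
    n_locations = 4  # placeholder only, replace with the value from the case
    print("4 expected chi-squared:", expected_chi_squared(5, n_locations))
```

Under these assumptions, roughly two or three of the 50 p-values fall below 0.05 by chance alone, and the chance that at least one falls below 0.01 is close to 40%. That is why a single small p-value among many tests (question 5c) is weak evidence of association on its own.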