Question
Simpson's paradox is a phenomenon that can occur when data are aggregated, or put all together into one group. Aggregate data can show one thing, but the same data broken down into two or more groups based on a third variable (a lurking variable) can actually give the opposite statistical result. In this discussion you'll think through how this is possible and why statistics based on aggregate data can be misleading.

Consider the following example. In major league baseball in 1989, Andy Van Slyke had a batting average of .237, and Dave Justice's batting average was .235. (A higher batting average means that a player had more base hits per time at bat.) In 1990, Van Slyke batted .284 and Justice batted .282. Van Slyke had the higher batting average two years in a row. But when the two years are combined, Van Slyke's batting average is only .261, while Justice's average is .278.

Player       1989    1990
Van Slyke    .237    .284
Justice      .235    .282

Here are some hints to help you understand the situation better:

Here's the formula for batting average:

batting average = (number of hits) / (number of at-bats)

To combine the batting averages over two years, you'd need to divide the total number of hits by the total number of at-bats. So Justice's average over both years is not simply (.235 + .282) / 2, because this does not account for the total number of times he was at bat in both years.

Respond to at least two of the following sets of questions or at least two other students' postings:

1. How is it possible that Van Slyke's combined average for the two seasons is lower than Justice's, when his average in each season is higher?
2. What is the lurking variable in this situation? Think about which batting statistics represent the aggregate data and which statistics represent the grouped data. How are the data grouped to produce the disaggregate statistics (statistics based on the data broken down into groups)?
3. What could be going on to cause this particular grouping to produce these statistics? Your answer here doesn't have to reflect any knowledge about baseball as long as you support your answer with logic based on potential relationships between variables.
4. Who is the better hitter, Dave Justice or Andy Van Slyke? On what statistics do you base your assessment, and why?
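
The hint about dividing total hits by total at-bats can be tried out numerically. The sketch below is only an illustration: the question does not give hit or at-bat counts, so the counts in `seasons` are assumed values chosen to reproduce the quoted averages, and the function names are made up for this example.

```python
# Assumed (hits, at_bats) counts per season. These are NOT given in the
# question; they are illustrative values chosen so that each season's
# average rounds to the figure quoted in the problem.
seasons = {
    "Van Slyke": {1989: (113, 476), 1990: (140, 493)},
    "Justice": {1989: (12, 51), 1990: (124, 439)},
}


def batting_average(hits, at_bats):
    """Batting average = number of hits / number of at-bats."""
    return hits / at_bats


def combined_average(by_year):
    """Pool all seasons: total hits divided by total at-bats."""
    total_hits = sum(hits for hits, _ in by_year.values())
    total_at_bats = sum(at_bats for _, at_bats in by_year.values())
    return batting_average(total_hits, total_at_bats)


def naive_average(by_year):
    """Unweighted mean of the season averages (ignores at-bat counts)."""
    averages = [batting_average(h, ab) for h, ab in by_year.values()]
    return sum(averages) / len(averages)


for player, by_year in seasons.items():
    per_year = {
        year: round(batting_average(hits, at_bats), 3)
        for year, (hits, at_bats) in by_year.items()
    }
    print(
        player,
        per_year,
        "combined:", round(combined_average(by_year), 3),
        "naive mean:", round(naive_average(by_year), 3),
    )
```

Under these assumed counts, the combined average is a weighted average of the season averages, with each season weighted by its share of the player's at-bats. Comparing how the at-bats are split between the two seasons for each player is a useful starting point for questions 1 and 2.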