Question
L3.2 Exercises

Exercise 5 (Datasaurus). Consider the Result 2 CAUTION (about slide 42) regarding the use of the five-statistic summary of a sample bivariate distribution:

If the univariate distributions dist(X: s) and dist(Y: s) are both bell-shaped and the X-Y relation is linear, then the bivariate distribution dist(X, Y: s) can be usefully summarized using the five statistics x̄, s_x, ȳ, s_y, and r (and n). However, if the univariate distributions are not bell-shaped and/or the X-Y relation is not linear, then the five-statistic summary of the bivariate distribution is inappropriate and/or insufficient.

Inappropriate: If the univariate distributions are not bell-shaped, then the two-number summaries (mean, sd) are inappropriate, and if the relationship is non-linear, then the correlation is an inappropriate measure of association.

Insufficient: If the univariate distributions are not bell-shaped and/or the relationship is non-linear, then there are many very different bivariate distributions (e.g., with very different associations) that have exactly the same five-statistic summary values.

Moral of the Story: Always "look" at your data first (e.g., univariate histograms and a bivariate scatterplot) before numerically summarizing it!

To see the caution and moral in action, access the Excel file datasaurus.xlsx. This xlsx file contains 14 worksheets.

(a) Compute the five-statistic summary of the bivariate distribution dist(X, Y: s) for the three data sets stored in the first three worksheets, labeled "noise", "star", and "dino". Report all values to the fourth decimal place. Comment on the comparisons across the three data sets.

(b) For each of the three data sets considered in part (a), graphically display the bivariate distribution dist(X, Y: s) using a scatterplot. Do they all look the same? If not, describe each scatterplot.
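The five statistics requested in part (a) can be computed by hand from paired (x, y) values. The following is a minimal sketch in Python using only the standard library. It assumes the sample standard deviation uses the n - 1 divisor and that r is the Pearson correlation; the exercise does not state either convention explicitly. The function name five_stat_summary and the toy data are hypothetical and are not taken from datasaurus.xlsx.

```python
import math


def five_stat_summary(xs, ys):
    """Return (n, mean_x, sd_x, mean_y, sd_y, r) for paired samples.

    Assumptions (not stated in the exercise): standard deviations use
    the n - 1 divisor, and r is the Pearson correlation coefficient.
    """
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two paired observations")

    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Sums of squared deviations and of cross-products about the means.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

    sd_x = math.sqrt(sxx / (n - 1))
    sd_y = math.sqrt(syy / (n - 1))
    # r is undefined when either variable is constant (sxx or syy is 0);
    # in that case this line raises ZeroDivisionError.
    r = sxy / math.sqrt(sxx * syy)
    return n, mean_x, sd_x, mean_y, sd_y, r


# Hypothetical toy data (perfectly linear), NOT from the worksheets.
toy_x = [1, 2, 3, 4, 5]
toy_y = [2, 4, 6, 8, 10]
summary = five_stat_summary(toy_x, toy_y)

# Report the five statistics to four decimal places, as part (a) asks.
print(", ".join(f"{value:.4f}" for value in summary[1:]))
# prints: 3.0000, 1.5811, 6.0000, 3.1623, 1.0000
```

Applying the same function to the x and y columns of each worksheet would give one row of statistics per data set, which can then be compared side by side before turning to the scatterplots in part (b).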
Data source info: datasaurus.xlsx is a slightly modified variant of the original data described in the paper at https://damassets.autodesk.net/content/dam/autodesk/research/publications-assets/pdf/same-stats-different-graphs.pdf. In the R package "Datasaurus," it is stated that, "The Datasaurus was created by Alberto Cairo. Datasaurus shows us why visualisation is important, not just summary statistics."