Question
Scatter Plots and Linear Correlation

Does ice cream consumption cause crime? Are older people paid more? In cottage country, are the sales of small businesses affected by the amount of precipitation? These questions deal with possible relationships between two variables. Often the answers are not clear-cut, but in Unit 7: Two-Variable Statistics, we will investigate methods for detecting relationships between variables, for developing mathematical models of these relationships, and for making predictions using these models.

A. Scatter Plots

Scatter plot:
Independent Variable:
Dependent Variable:
Linear Correlation:
Line of Best Fit:

Ex. 1: Statisticians in Victorian England were fascinated by the strength of resemblance between children and their parents, and they gathered huge amounts of data on the subject. One study by Sir Francis Galton (1822 - 1911) and his disciple Karl Pearson (1857 - 1936) compared data on the heights of 1078 fathers and their sons at maturity, one son per father. These numbers would be impossible to grasp as a list, but a scatter plot can be very descriptive. (Each point on the diagram represents one father-son pair.)

[Figure: scatter plot of son's height against father's height in inches]

B. Classifying Linear Correlations

Variables are said to have a linear correlation if changes in one variable tend to be proportional to changes in the other. The stronger the correlation, the more closely the data points cluster around the line of best fit. Linear correlations can be classified according to their direction (positive, negative) and their strength (none [0], weak, moderate, strong, perfect [1]).

A positive correlation means the data cloud slopes up: as one variable increases, so does the other. A negative correlation means the data cloud slopes down: as one variable increases, the other decreases. A perfect correlation means that all points fall exactly on the line of best fit.

Ex. 2: Classify the relationship between the variables for the data shown in the following scatter plots (none, linear, non-linear; if linear, positive or negative; weak, moderate, strong or perfect).

[Figure: scatter plots, panels lettered up to h)]

C. The Correlation Coefficient

The scatter plot can only give a rough indication of the association between two variables. A more precise way to measure linear correlation is to calculate the correlation coefficient, r, which is a mathematical measurement of the correlation between two sets of data. It is a pure number, without units, and it does not depend on the units or scale chosen for either variable.

There are several ways to calculate r, depending on the data you have available and the summary statistics you have calculated already. We will not derive the expressions here, although your text does a good job on p. 161 - 163. Karl Pearson, who also invented the term standard deviation, developed these formulas. He is considered to be a key figure in the development of modern statistics.

Here, x represents individual values of the variable X, y represents individual values of the variable Y, and n takes its usual meaning, the number of values in each of the data sets X and Y. We define the correlation coefficient, r, as

$$r = \frac{n\sum xy - \sum x \sum y}{\sqrt{\left[n\sum x^2 - \left(\sum x\right)^2\right]\left[n\sum y^2 - \left(\sum y\right)^2\right]}}$$

We can determine the relative strength of a correlation, based on the value of r, using the scale below:

[Figure: scale relating the value of r to the strength of the correlation]

Whenever possible, look at the scatter plot to check for outliers and non-linear association. Outliers can have a significant impact on the calculation of the correlation coefficient. The correlation coefficient measures only linear association, not association in general. (More on other types of association later!)

IMPORTANT: correlation ≠ causation

Just because an r value is high (close to 1 or -1) does not mean that one variable necessarily causes the other to occur! Correlation measures association, but association is not the same as causation! There could be a number of other reasons for a high r value, including a common cause for both variables or some other circumstance (like poor sampling or other confounding factors)!

Ex. 3: "Ice cream sales" and "crime rate" have a very high correlation. Explain why this could be.

Ex. 4: Calculate the correlation coefficient for the following data. Then classify the linear correlation.
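
For a calculation like the one in Ex. 4, the arithmetic can be checked with a short script. The sketch below is only an illustration: it assumes the common computational form of Pearson's r, built from n and the sums of x, y, xy, x² and y², which may not be the exact form the worksheet intended. The function name `pearson_r` and the `hours` and `marks` lists are hypothetical and are not taken from the worksheet.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Correlation coefficient r from raw sums (assumed computational form)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 values")

    # Summary statistics used by the formula.
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)

    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    if denominator == 0:
        raise ValueError("r is undefined when either variable has no spread")
    return numerator / denominator


# Hypothetical data, invented for illustration only (not from the worksheet).
hours = [1, 2, 3, 4, 5]
marks = [2, 4, 5, 4, 5]

r = pearson_r(hours, marks)
print(round(r, 3))  # 0.775
```

The sign of the result gives the direction of the correlation and its size gives the strength. Even with a value of r in hand, it is still worth plotting the points, because a single outlier can change r considerably.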