Question
Scatterplots - Halima Bensmail: CS502 Quiz practice

I. Introduction

Most of the graphs we have examined so far have been appropriate for univariate (one variable) data. We will now examine the relationships which may exist between two quantitative variables by graphing bivariate (two variable) data. What we will be doing is basically plotting points, something you've done many times in previous courses.

We have some new terminology:
Independent variable x is called the explanatory variable.
Dependent variable y is called the response variable.

Question: Why do these variables have these names?

Example 1 (Understandable Statistics): A large industrial plant has seven divisions that do the same type of work. A safety inspector visits each division of 20 workers quarterly. The number of work-hours devoted to safety training and the number of work-hours lost due to industry-related accidents are recorded for each separate division in the table below.

Division   x: work-hours in safety training   y: work-hours lost due to accidents
1          10.0                               80
2          19.5                               65
3          30.0                               68
4          45.0                               55
5          50.0                               35
6          65.0                               10
7          80.0                               12

a. Make a scatterplot for these pairs of data.
b. As the number of hours spent on safety training increases, what happens in general to the number of hours lost due to industry-related accidents?

II. Analyzing Scatterplots

Scatterplots are usually analyzed according to:
a. Direction (whether there is a positive association, a negative association, or neither)
   [Diagrams: positive association, negative association, no association]
b. Form (clusters of points, linear pattern, etc.)
c. Strength of the relationship (for example, how close to a straight line do these points appear to lie?)
d. Outliers (points that do not follow the general pattern of the data).

Example 2 (SDA): Suppose you have the data shown in the table below.

Case   Father's height (in)   Son's height (in)
1      60                     62
2      63                     65
3      66                     68
4      69                     71
5      72                     61

a. Construct a scatterplot.
b. Which variable should be the explanatory variable?

Example 4 (BPS): A food industry group asked 3368 people to guess the number of calories in each of several common foods. Here is a table of the average of their guesses and the correct number of calories.

Food                                Correct calories   Guessed calories
8 oz milk                           159                196
5 oz spaghetti with tomato sauce    163                394
5 oz macaroni with cheese           269                350
One slice wheat bread               61                 117
One slice white bread               76                 136
2-oz candy bar                      260                364
Saltine cracker                     12                 74
Medium-size apple                   80                 107
Medium-size potato                  88                 160
Cream-filled snack cake             160                419

a. We think that how many calories a food actually has helps explain people's guesses of how many calories it has. With this in mind, make a scatterplot of these data.
b. Describe the relationship. Is there a positive or negative association? Is the relationship approximately linear? Are there any outliers?

Example 5 (BPS): Data analysts often look for a simple transformation of data that simplifies the overall pattern. Here is an example of how transforming the response variable can simplify the pattern of a scatterplot. The population of Europe grew as follows between 1750 and 1950.

Year   Population (million)
1750   125
1800   187
1850   274
1900   423
1950   594

a. Make a scatterplot of population against year. Briefly describe the pattern of Europe's growth.
b. Now take the logarithm of the population in each year. Plot the logarithms against the year. What is the overall pattern of this plot?

Year   Population (million)   Log(population in millions)
1750   125
1800   187
1850   274
1900   423
1950   594

Example 6 (text): How does the fuel consumption of a car change as its speed increases? Here are data from a British Ford Escort. Speed is measured in km/hr and fuel consumption ("Mileage" in the table) in liters of gas used per 100 km traveled.

Speed (km/hr)   Mileage (liters per 100 km)
10              21
20              13
30              10
40              8
50              7
60              5.9
70              6.3
80              6.95
90              7.57
100             8.27
110             9.03
120             9.87
130             10.79
140             11.77
150             12.83

a. Sketch the scatterplot here. Describe the form of the relationship. Why is it not linear? Explain why the form of the relationship makes sense.
b. It does not make sense to describe the variables as either positively associated or negatively associated. Why?
c. Is the relationship strong or quite weak? Explain your
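
The log transformation asked for in Example 5(b) can be sketched in a few lines of Python. This is only an illustration, not the worksheet's official solution: it assumes base-10 logarithms (the worksheet says "logarithm" without naming a base), and the helper name successive_differences is made up for this sketch. It fills in the log column and compares the change per 50-year step before and after the transformation.

```python
import math

# Europe's population (millions), copied from the Example 5 table.
years = [1750, 1800, 1850, 1900, 1950]
population = [125, 187, 274, 423, 594]

# Assumption: base-10 logarithm; the worksheet does not specify the base.
log_population = [math.log10(p) for p in population]


def successive_differences(values):
    """Return the change from each value to the next one (hypothetical helper)."""
    return [later - earlier for earlier, later in zip(values, values[1:])]


# Change per 50-year step on the raw scale and on the log scale.
raw_steps = successive_differences(population)
log_steps = successive_differences(log_population)

# Print the completed table, then the step sizes for comparison.
for year, pop, log_pop in zip(years, population, log_population):
    print(f"{year}  {pop:4d}  {log_pop:.3f}")

print("raw steps:", raw_steps)
print("log steps:", [round(step, 3) for step in log_steps])
```

On the raw scale the steps get larger every period, which matches a scatterplot that curves upward. On the log scale the steps stay within a narrow range, which suggests the plot of log(population) against year is much closer to a straight line. A different base would rescale the log column but should not change that pattern.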