Question
animal gestation longevity
baboon 187 20
bear, black 219 18
bear, grizzly 225 25
bear, polar 240 20
beaver 122 5
buffalo 278 15
camel 406 12
cat 63 12
chimpanzee 231 20
chipmunk 31 6
cow 284 15
deer 201 8
dog 61 12
donkey 365 12
elephant 645 40
elk 250 15
fox 52 7
giraffe 425 10
goat 151 8
gorilla 257 20
guinea pig 68 4
hippopotamus 238 25
horse 330 20
kangaroo 42 7
leopard 98 12
lion 100 15
monkey 164 15
moose 240 12
mouse 21 3
opossum 15 1
pig 112 10
puma 90 12
rabbit 31 5
rhinoceros 450 15
sea lion 350 12
sheep 154 12
squirrel 44 10
tiger 105 16
wolf 63 5
zebra 365 15
Our dataset contains the following variables:
animal: the name of the animal species.
gestation: the average gestation period of the species, in days.
longevity: the average longevity of the species, in years.
Select all of the data in columns B and C, and then in the Insert tab choose Scatter in the Charts group.
Choose the first scatterplot option (Scatter with only Markers).
Click in a cell outside of the first three columns of data.
Type =CORREL(B2:B41,C2:C41) and press Enter. The cell now shows the correlation between gestation and longevity.
Scroll down to row 16 of the data. You will see that this contains the values of the variables for the elephant.
Click on the row header 16 to select the entire row of data.
Right-click and choose Delete from the pop-up menu to delete the row.
You will see that the values of the variables for the elephant have been removed from the data. Notice also that the correlation between gestation and longevity has changed.
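The same comparison can be checked outside of Excel. The following is a minimal Python sketch, not part of the original activity. It assumes that the correlation wanted here is the Pearson correlation coefficient (which is what Excel's CORREL computes) and recalculates it with and without the elephant row. The names `DATA`, `pearson_r`, and `correlation` are illustrative choices, and the values are copied from the table above.

```python
import math

# (animal, gestation in days, longevity in years), copied from the table above.
DATA = [
    ("baboon", 187, 20), ("bear, black", 219, 18), ("bear, grizzly", 225, 25),
    ("bear, polar", 240, 20), ("beaver", 122, 5), ("buffalo", 278, 15),
    ("camel", 406, 12), ("cat", 63, 12), ("chimpanzee", 231, 20),
    ("chipmunk", 31, 6), ("cow", 284, 15), ("deer", 201, 8),
    ("dog", 61, 12), ("donkey", 365, 12), ("elephant", 645, 40),
    ("elk", 250, 15), ("fox", 52, 7), ("giraffe", 425, 10),
    ("goat", 151, 8), ("gorilla", 257, 20), ("guinea pig", 68, 4),
    ("hippopotamus", 238, 25), ("horse", 330, 20), ("kangaroo", 42, 7),
    ("leopard", 98, 12), ("lion", 100, 15), ("monkey", 164, 15),
    ("moose", 240, 12), ("mouse", 21, 3), ("opossum", 15, 1),
    ("pig", 112, 10), ("puma", 90, 12), ("rabbit", 31, 5),
    ("rhinoceros", 450, 15), ("sea lion", 350, 12), ("sheep", 154, 12),
    ("squirrel", 44, 10), ("tiger", 105, 16), ("wolf", 63, 5),
    ("zebra", 365, 15),
]


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def correlation(rows):
    """Correlation between gestation and longevity for the given rows."""
    gestation = [g for _, g, _ in rows]
    longevity = [l for _, _, l in rows]
    return pearson_r(gestation, longevity)


# All 40 species, matching =CORREL(B2:B41,C2:C41) in the spreadsheet.
r_with_elephant = correlation(DATA)

# Same calculation after dropping the elephant, like deleting row 16.
r_without_elephant = correlation([row for row in DATA if row[0] != "elephant"])

print(f"with elephant:    r = {r_with_elephant:.3f}")
print(f"without elephant: r = {r_without_elephant:.3f}")
```

Running the sketch prints both correlations side by side, which makes it easy to see how much a single row moves the result.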
Question 2:
Report the new value of the correlation between gestation and longevity, and compare it to the value you found earlier with the outlier included. What is it about this outlier that makes the correlation increase when it is included in the data? (Hint: look at the scatterplot.)
In the last activity, we saw an example where there was a positive linear relationship between the two variables, and including the outlier just "strengthened" it. Consider the hypothetical data displayed by the following scatterplot:
scatterplot where the dots are loosely clustered at the upper right part of the graph
In this case, the low outlier gives an "illusion" of a positive linear relationship, whereas in reality, there is no linear relationship between X and Y.
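A small numeric illustration of this effect, as a sketch only. The points below are made up for the example and are not the data behind the scatterplot. The cluster is constructed so that its correlation is exactly zero, and a single low point is then added. The names `cluster`, `outlier`, and `pearson_r` are illustrative.

```python
import math


def pearson_r(points):
    """Pearson correlation coefficient for a list of (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    syy = sum((y - mean_y) ** 2 for _, y in points)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical cluster in the upper right: no linear relationship between X and Y.
cluster = [(8, 8), (8, 10), (10, 8), (10, 10), (9, 9)]

# Hypothetical low outlier, far below and to the left of the cluster.
outlier = (1, 1)

r_cluster = pearson_r(cluster)
r_with_outlier = pearson_r(cluster + [outlier])

print(f"cluster only:      r = {r_cluster:.3f}")
print(f"cluster + outlier: r = {r_with_outlier:.3f}")
```

With only the cluster, the correlation is zero. Adding the one low point produces a strongly positive correlation, even though nothing about the relationship within the cluster has changed.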