
Question


Q SCI 381: Introduction to Probability and Statistics

Winter 2022

Laboratory #7 (73 points)

In today's lab, we will be using the file sleep.csv that is available in Canvas (located in Files\Lab Datasets). The file contains three columns of data for 62 species of mammals:

TotalSleep LifeSpan Gestation
3.3 38.6 645
8.3 4.5 42
12.5 14 60
16.5 6 25
3.9 69 624
9.8 27 180
19.7 19 35
6.2 30.4 392
14.5 28 63
9.7 50 230
12.5 7 112
3.9 30 281
10.3 11 117
3.1 40 365
8.4 3.5 42
8.6 50 28
10.7 6 42
10.7 10.4 120
6.1 34 202
18.1 7 32
6.5 28 400
3.8 20 148
14.4 3.9 16
12 39.3 252
6.2 41 310
13 16.2 63
13.8 9 28
8.2 7.6 68
2.9 46 336
10.8 22.4 100
7.8 16.3 33
9.1 2.6 21.5
19.9 24 50
8 100 267
10.6 11 30
11.2 15 45
13.2 3.2 19
12.8 2 30
19.4 5 12
17.4 6.5 120
5.3 23.6 440
17 12 140
10.9 20.2 170
13.7 13 17
8.4 27 115
8.4 18 31
12.5 13.7 63
13.2 4.7 21
9.8 9.8 52
9.6 29 164
6.6 7 225
5.4 6 225
2.6 17 150
3.8 20 151
11 12.7 90
10.3 3.5 15
13.3 4.5 60
5.4 7.5 200
15.8 2.3 46
10.3 24 210
19.4 3 14
15.3 13 38

https://docs.google.com/spreadsheets/d/1s9rMfCmojB_1AUORLzWgQFnTxjoN9pcCvtPboR3LEp4/edit?usp=sharing

TotalSleep = the number of hours per day spent sleeping

LifeSpan = the maximum life span in years

Gestation = the gestation period in days

Download sleep.csv and import the dataset into R/RStudio using the read.csv() function. Store the data in a data frame object named sleep using:

sleep <- read.csv("sleep.csv")

Recall from lab 6 that you can use the attach() command to attach the data to your R/RStudio workspace.

attach(sleep)
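As a minimal sketch of the import step, the example below writes the first five rows of the table above to a temporary file and reads them back with read.csv(). It assumes sleep.csv is comma-separated with a header row; the names csv_path, sleep_subset, and mean_sleep are hypothetical and used only for illustration. It also shows with() as one possible alternative to attach().

```r
# Hypothetical stand-in for sleep.csv: the first five rows of the table above,
# written to a temporary file so the example runs without the Canvas download.
# Assumption: the real file is comma-separated with a header row.
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "TotalSleep,LifeSpan,Gestation",
  "3.3,38.6,645",
  "8.3,4.5,42",
  "12.5,14,60",
  "16.5,6,25",
  "3.9,69,624"
), csv_path)

# Import the file into a data frame, as the lab does for sleep.csv
sleep_subset <- read.csv(csv_path)

# Inspect the column names and types
str(sleep_subset)

# with() evaluates an expression using the data frame's columns
# without attaching the data frame to the search path
mean_sleep <- with(sleep_subset, mean(TotalSleep))
print(mean_sleep)
```

Referring to columns through with() or the data = argument of modelling functions avoids name clashes that attach() can cause when several data frames share column names.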

Part 1. Correlation Analysis

In part 1, we will use R/RStudio to conduct a correlation analysis. Before conducting any analyses, let's explore the dataset by plotting pair-wise scatter plots using the following command:

plot(sleep)

(1a) Paste your pair-wise scatterplot below. (2 points)

(1b) Examine the pair-wise scatterplot in (1a). Which pair of variables, if any, would you expect to be negatively correlated? Which pair of variables, if any, would you expect to be positively correlated? Justify your response. (4 points)

(1c) Consider the correlation coefficient, r, between all possible pairs of the variables within the sleep dataset. Write the null and alternative hypotheses for r in a correlation analysis. (2 points)

(1d) Now, conduct a correlation analysis between all possible pairs of the variables within the sleep dataset. Paste your code and output below for each pair of variables. (6 points)
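One way to run cor.test() on every pair of columns is sketched below. It uses only the first ten rows of the table above, stored in a hypothetical data frame named sleep_subset, so its numbers will differ from the full 62-species analysis. The helper names pairs_to_test, results, and summary_table are likewise illustrative, and cor.test() is left at its defaults (Pearson correlation, two-sided test).

```r
# First ten rows of the table above (illustrative subset, not the full dataset)
sleep_subset <- data.frame(
  TotalSleep = c(3.3, 8.3, 12.5, 16.5, 3.9, 9.8, 19.7, 6.2, 14.5, 9.7),
  LifeSpan   = c(38.6, 4.5, 14, 6, 69, 27, 19, 30.4, 28, 50),
  Gestation  = c(645, 42, 60, 25, 624, 180, 35, 392, 63, 230)
)

# Every unordered pair of column names (3 pairs for 3 columns)
pairs_to_test <- combn(names(sleep_subset), 2, simplify = FALSE)

# Run a Pearson correlation test for each pair
results <- lapply(pairs_to_test, function(p) {
  cor.test(sleep_subset[[p[1]]], sleep_subset[[p[2]]])
})
names(results) <- sapply(pairs_to_test, paste, collapse = " vs ")

# Collect the estimate of r and the p-value for each pair
summary_table <- data.frame(
  r       = sapply(results, function(x) unname(x$estimate)),
  p_value = sapply(results, function(x) x$p.value)
)
print(summary_table)
```

Each element of results is the full cor.test() output, so printing results[[1]] shows the test statistic, degrees of freedom, and confidence interval for that pair.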

(1e) Using the output from cor.test in (1d), what is the estimate of the correlation coefficient, r, for each pair of variables? (3 points)

(1f) Using alpha = 0.01 and the output from (1d), what is your statistical conclusion and interpretation for each pair of variables? (12 points)

Part 2. Linear Regression Analysis: Using LifeSpan to predict Gestation

(2a) In part 2, we will use R/RStudio to conduct a linear regression to determine if LifeSpan (independent variable) predicts Gestation (dependent variable). Fit a linear regression using lm(). Paste your code and output below. (2 points)
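A minimal sketch of the model fit follows, again on a hypothetical ten-row subset (sleep_subset) taken from the top of the table above rather than the full dataset. The object name fit is an assumption; any name works as long as it is reused in later steps.

```r
# First ten rows of the table above (illustrative subset, not the full dataset)
sleep_subset <- data.frame(
  LifeSpan  = c(38.6, 4.5, 14, 6, 69, 27, 19, 30.4, 28, 50),
  Gestation = c(645, 42, 60, 25, 624, 180, 35, 392, 63, 230)
)

# Response on the left of ~, predictor on the right
fit <- lm(Gestation ~ LifeSpan, data = sleep_subset)

# Coefficient table: estimates, standard errors, t values, p-values
print(summary(fit))

# The slope estimate and its p-value can also be pulled out directly
coef_table <- coef(summary(fit))
slope      <- coef_table["LifeSpan", "Estimate"]
slope_p    <- coef_table["LifeSpan", "Pr(>|t|)"]
```

The row labelled LifeSpan in the coefficient table is the slope; its p-value is what gets compared against alpha.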

(2b) Using your output from (2a), what is the estimate of the slope of the linear regression? What is your statistical conclusion and interpretation of the slope estimate when using alpha = 0.05? (6 points)

(2c) Interpret the adjusted R-squared value from your output from (2a). What does this value represent? (4 points)
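The adjusted R-squared is printed near the bottom of summary() output, and it can also be extracted by name. The sketch below assumes a hypothetical ten-row subset (sleep_subset) and model object (fit), so the values are not those of the full dataset.

```r
# First ten rows of the table above (illustrative subset, not the full dataset)
sleep_subset <- data.frame(
  LifeSpan  = c(38.6, 4.5, 14, 6, 69, 27, 19, 30.4, 28, 50),
  Gestation = c(645, 42, 60, 25, 624, 180, 35, 392, 63, 230)
)
fit <- lm(Gestation ~ LifeSpan, data = sleep_subset)

# Proportion of variance in Gestation explained by the model
r_squared <- summary(fit)$r.squared

# The same quantity penalised for the number of predictors in the model
adj_r_squared <- summary(fit)$adj.r.squared

print(c(r_squared = r_squared, adj_r_squared = adj_r_squared))
```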

(2d) Use your output from (2a) to write the regression equation. (2 points)

(2e) Use your regression equation from (2d) to predict the Gestation time in mammals that have the following LifeSpan: (6 points)

3 years

29 years

78 years
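Predictions from the regression equation can be cross-checked with predict(). The sketch below assumes a hypothetical ten-row subset (sleep_subset) and model object (fit); new_mammals, predicted, and by_hand are illustrative names, and the predicted values will differ from those of a model fitted to all 62 species.

```r
# First ten rows of the table above (illustrative subset, not the full dataset)
sleep_subset <- data.frame(
  LifeSpan  = c(38.6, 4.5, 14, 6, 69, 27, 19, 30.4, 28, 50),
  Gestation = c(645, 42, 60, 25, 624, 180, 35, 392, 63, 230)
)
fit <- lm(Gestation ~ LifeSpan, data = sleep_subset)

# New data must use the same column name as the predictor in the model
new_mammals <- data.frame(LifeSpan = c(3, 29, 78))
predicted   <- predict(fit, newdata = new_mammals)

# The same predictions computed from the equation: intercept + slope * LifeSpan
by_hand <- coef(fit)[["(Intercept)"]] + coef(fit)[["LifeSpan"]] * new_mammals$LifeSpan

print(cbind(new_mammals, predicted, by_hand))
```

When both columns agree, the equation written in (2d) matches the fitted model.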

(2f) Plot the relationship between LifeSpan and Gestation using plot(). Plot LifeSpan on the x-axis and Gestation on the y-axis. Add appropriate axis labels and a main title, and a color of your choice.

After making this plot, you can add a line of best fit based on your linear regression using the abline() function in R/RStudio:

abline(object name)

where object name is the object where your linear regression model was stored when using lm() in (2a). Paste your plot with your line of best fit below. (10 points)
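The plotting step could look like the sketch below. It assumes a hypothetical ten-row subset (sleep_subset) and model object (fit); the axis labels, title, and colors are arbitrary choices rather than required ones.

```r
# First ten rows of the table above (illustrative subset, not the full dataset)
sleep_subset <- data.frame(
  LifeSpan  = c(38.6, 4.5, 14, 6, 69, 27, 19, 30.4, 28, 50),
  Gestation = c(645, 42, 60, 25, 624, 180, 35, 392, 63, 230)
)
fit <- lm(Gestation ~ LifeSpan, data = sleep_subset)

# Scatter plot with LifeSpan on the x-axis and Gestation on the y-axis
plot(sleep_subset$LifeSpan, sleep_subset$Gestation,
     xlab = "Maximum life span (years)",
     ylab = "Gestation period (days)",
     main = "Gestation vs. LifeSpan in mammals",
     col  = "steelblue", pch = 19)

# abline() reads the intercept and slope from the fitted lm object
abline(fit, col = "firebrick", lwd = 2)
```

abline() must be called after plot(), since it draws on the currently open plot.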

(2g) Linear regression assumes that the residuals of the model are approximately normally distributed. To assess the residuals, let's extract the model residuals and store them in an object called model.res using the following command:

model.res <- residuals(object name)

where object name is the object where your linear regression model was stored when using lm() in (2a).

Plot a histogram of the residuals. Include a title and a color. Paste your plot below. (6 points)
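A sketch of the residual histogram follows, assuming a hypothetical ten-row subset (sleep_subset) and model object (fit). The title, axis label, and color are arbitrary, and with only ten residuals the shape is much coarser than it would be for the full dataset.

```r
# First ten rows of the table above (illustrative subset, not the full dataset)
sleep_subset <- data.frame(
  LifeSpan  = c(38.6, 4.5, 14, 6, 69, 27, 19, 30.4, 28, 50),
  Gestation = c(645, 42, 60, 25, 624, 180, 35, 392, 63, 230)
)
fit <- lm(Gestation ~ LifeSpan, data = sleep_subset)

# Residuals = observed Gestation minus fitted Gestation
model.res <- residuals(fit)

# hist() draws the plot and invisibly returns the bin counts
h <- hist(model.res,
          main = "Residuals of Gestation ~ LifeSpan",
          xlab = "Residual (days)",
          col  = "lightblue")
```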

(2h) What is the mean and median of the model residuals? (2 points)
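The two summary statistics can be computed as sketched below, assuming a hypothetical ten-row subset (sleep_subset) and model object (fit). For a least-squares fit that includes an intercept, the residuals sum to zero by construction, so the mean will be zero up to floating-point error; the median is the more informative of the two when judging symmetry.

```r
# First ten rows of the table above (illustrative subset, not the full dataset)
sleep_subset <- data.frame(
  LifeSpan  = c(38.6, 4.5, 14, 6, 69, 27, 19, 30.4, 28, 50),
  Gestation = c(645, 42, 60, 25, 624, 180, 35, 392, 63, 230)
)
fit <- lm(Gestation ~ LifeSpan, data = sleep_subset)
model.res <- residuals(fit)

# Mean is ~0 for any least-squares fit with an intercept
res_mean <- mean(model.res)

# A median far from the mean suggests a skewed residual distribution
res_median <- median(model.res)

print(c(mean = res_mean, median = res_median))
```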

(2i) Based on your answers from (2h), and a visual assessment of your histogram in (2g), do you think the model residuals are normally distributed? Justify your answer. (6 points)
