
Question


Could you help me check this assignment? It is important for my grade. Please check whether anything is wrong, especially in question 4 and the last question. I would appreciate it.

request:

The data in this document give the number of meals eaten that contain fish (per week) and mercury levels in head hair for 100 fishermen. Save the data to a format that can be read into R. Read the data in for analysis. Use R to calculate the quantities and generate the visual summaries requested below.

(1) To get a sense of the data, generate a scatterplot (using an appropriate window, label the axes, and title the graph). Consciously decide which variable should be on the x-axis and which should be on the y-axis. Using the scatterplot, describe the form, direction, and strength of the association between the variables. (3 points)

(2) Calculate the correlation coefficient. What does the correlation tell us? (2 points)

(3) Find the equation of the least squares regression line, and write out the equation. Add the regression line to the scatterplot you generated above. (2 points)

(4) What is the estimate for $\beta_1$? How can we interpret this value? What is the estimate for $\beta_0$? What is the interpretation of this value? (6 points)

(5) Calculate the ANOVA table and the table which gives the standard error of $\hat{\beta}_1$. Formally test the hypothesis that $\beta_1 = 0$ using either the F-test or the t-test at the $\alpha = 0.10$ level. Either way, present your results using the 5 step procedure as in the course notes. Within your conclusion, calculate the R-squared value and interpret this. Also, calculate and interpret the 90% confidence interval for $\beta_1$. (8 points)

1. Generate a scatterplot and label the x-axis and y-axis. Use the scatterplot to describe the form, direction, and strength of the association between the variables.

To summarize the data on the number of meals with fish and total mercury in mg/g, I first saved the data from the Excel table as a CSV file. Then I wrote the following code in RStudio to read the data:

data<-read.table("/Users/Mac/Desktop/fishmeal&totalMercury.csv",header = TRUE, sep = ',')

Then flatten the data frame into a single vector (data1 is not used in the rest of the analysis, so this step can be skipped):

data1 <- unlist(data, use.names = TRUE)

Then extract the number of meals with fish and the total mercury in mg/g from the data set into their own variables:

Number.of.meals.with.fish <- data[[1]]

Total.Mercury.in.mg.g<- data[[2]]

Create a new data frame from the number of meals with fish and the total mercury in mg/g, and name it my.data:

my.data<-data.frame(Number.of.meals.with.fish,Total.Mercury.in.mg.g)

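Produce the scatterplot, with axis labels and a title as the question requires, using the code:

plot(Number.of.meals.with.fish, Total.Mercury.in.mg.g, xlab = "Number of meals with fish (per week)", ylab = "Total mercury (mg/g)", main = "Total mercury in head hair vs. fish meals per week")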

The resulting scatterplot is shown below:

Describe the scatterplot:

The x-axis (explanatory variable) of this scatterplot is Number.of.meals.with.fish, and the y-axis (response variable) is Total.Mercury.in.mg.g.

Form: the points roughly follow a straight-line pattern, so the form of the association is linear.

Direction: the direction is positive. As Number.of.meals.with.fish increases, Total.Mercury.in.mg.g tends to increase.

Strength: the points are not tightly clustered around a line, so the association is moderate rather than strong.

2. Calculate the correlation coefficient. What does the correlation tell us?

To calculate the correlation coefficient in RStudio, use the code below:

cor(my.data)

The result produced in RStudio is shown below:

                          Number.of.meals.with.fish Total.Mercury.in.mg.g
Number.of.meals.with.fish                 1.0000000             0.6991094
Total.Mercury.in.mg.g                     0.6991094             1.0000000

As a result, the correlation coefficient is about 0.699.

Explanation: The correlation coefficient r always lies between -1 and 1. If r is between 0 and 1, the two variables are positively related, and the closer r is to 1, the stronger the linear relationship. If r is between -1 and 0, the two variables are negatively related, and the larger the absolute value of r, the stronger the linear relationship. For these data r is 0.699, which indicates a moderate positive linear association: the variables are clearly related, but not very strongly.
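As a rough illustration of what cor() computes, the sketch below works out the Pearson correlation from its definition and compares it with the built-in function. The vectors meals and mercury are made-up toy values, not the assignment's data set.

```r
# Hypothetical toy data (NOT the assignment's 100 observations).
meals   <- c(0, 1, 2, 4, 7, 10, 14)
mercury <- c(1.2, 2.5, 1.9, 3.4, 3.1, 5.0, 5.6)

# Pearson correlation from its definition:
# r = sum((x - xbar)(y - ybar)) / sqrt(sum((x - xbar)^2) * sum((y - ybar)^2))
dx <- meals - mean(meals)
dy <- mercury - mean(mercury)
r_manual <- sum(dx * dy) / sqrt(sum(dx^2) * sum(dy^2))

# The built-in function should agree with the manual calculation.
r_builtin <- cor(meals, mercury)
print(c(manual = r_manual, builtin = r_builtin))
```

Calling cor(x, y) on two vectors returns the single coefficient directly, which may be easier to read than the 2 x 2 matrix returned by cor(my.data).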

3.Find the equation of the least squares regression equation, and write out the equation. Add the regression line to the scatterplot you generated above.

To fit the linear regression model, with Number.of.meals.with.fish as the explanatory variable and Total.Mercury.in.mg.g as the response, and to store the fitted model in the variable m for later use, the code in RStudio is:

m<-lm(formula=Total.Mercury.in.mg.g~Number.of.meals.with.fish)

Then, to see the full results of the fitted regression, summarize m using the code:

summary(m)

The corresponding result in RStudio is:

Call:
lm(formula = Total.Mercury.in.mg.g ~ Number.of.meals.with.fish)

Residuals:
   Min     1Q Median     3Q    Max
-5.718 -1.143 -0.183  1.044  4.379

Coefficients:
                          Estimate Std. Error t value Pr(>|t|)
(Intercept)                1.68764    0.29833   5.657 1.53e-07 ***
Number.of.meals.with.fish  0.27595    0.02851   9.679 6.01e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.817 on 98 degrees of freedom
Multiple R-squared:  0.4888, Adjusted R-squared:  0.4835
F-statistic: 93.69 on 1 and 98 DF,  p-value: 6.013e-16

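The general form of the least squares regression equation is:

$\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$

From the output above, the estimated intercept $\hat{\beta}_0$ is about 1.688 (1.68764), and the estimated slope $\hat{\beta}_1$ is about 0.276 (0.27595).

As a result, the least squares regression equation is:

$\hat{y} = 1.688 + 0.276\,x$, where $x$ is the number of meals with fish per week and $\hat{y}$ is the predicted total mercury in mg/g.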
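To make the link between lm() and the least squares formulas concrete, here is a small sketch that computes the slope and intercept by hand and checks them against lm(). The vectors x and y are hypothetical toy values, not the assignment's data.

```r
# Hypothetical toy data (NOT the assignment's data set).
x <- c(0, 1, 2, 4, 7, 10, 14)
y <- c(1.2, 2.5, 1.9, 3.4, 3.1, 5.0, 5.6)

# Least squares estimates from the textbook formulas:
# slope     b1 = Sxy / Sxx
# intercept b0 = ybar - b1 * xbar
b1 <- sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)
b0 <- mean(y) - b1 * mean(x)

# lm() should return the same two numbers.
fit <- lm(y ~ x)
print(c(b0 = b0, b1 = b1))
print(coef(fit))
```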

Add the regression line to the scatterplot in RStudio using the code:

abline(m, lty=3, col="blue")

The resulting plot, which now includes a dotted blue regression line, is shown below:

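4. What is the estimate for $\beta_1$? How can we interpret this value? What is the estimate for $\beta_0$? What is the interpretation of this value?

Based on the summary output in question 3, the estimate of $\beta_1$ is about 0.276, and the estimate of $\beta_0$ is about 1.688.

Interpretation: $\beta_0$ is the intercept, which is the expected value of y when x = 0. Here, a fisherman who eats no meals with fish per week is expected to have a total mercury level of about 1.688 mg/g. $\beta_1$ is the slope of the line $y = \beta_0 + \beta_1 x$, which is the expected change in y for each one-unit increase in x. Here, each additional meal with fish per week is associated with an expected increase of about 0.276 mg/g in total mercury.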

5. Calculate the ANOVA table and the table which gives the standard error of $\hat{\beta}_1$. Formally test the hypothesis that $\beta_1 = 0$ using either the F-test or the t-test at the $\alpha = 0.10$ level. Either way, present your results using the 5 step procedure as in the course notes.

Within your conclusion, calculate the R-squared value and interpret this. Also, calculate and interpret the 90% confidence interval for $\beta_1$.

To calculate the ANOVA table, use the code below in RStudio:

anova(m)

The corresponding result is shown below:

Response: Total.Mercury.in.mg.g
                          Df Sum Sq Mean Sq F value    Pr(>F)
Number.of.meals.with.fish  1 309.24 309.239  93.689 6.013e-16 ***
Residuals                 98 323.47   3.301
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

The square root of the error sum of squares (SSE) divided by its degrees of freedom gives the residual standard error, not the standard error of $\hat{\beta}_1$. From the table above, the residual df is 98 and the SSE is 323.47, so $s = \sqrt{323.47/98} \approx 1.817$. The standard error of the slope is $SE(\hat{\beta}_1) = s / \sqrt{\sum (x_i - \bar{x})^2}$, and the coefficient table from summary(m) in question 3 reports it directly as 0.02851.

Then test the hypothesis using the t-test at the $\alpha = 0.10$ level, following the five step procedure below.

(1) Set up the hypotheses and select the alpha level

$H_0: \beta_1 = 0$ (there is no linear association)

$H_1: \beta_1 \neq 0$ (there is a linear association)

$\alpha = 0.10$

(2) Select the appropriate test statistic:

$t = \hat{\beta}_1 / SE(\hat{\beta}_1)$, df = n - k - 1

(3) State the decision rule:

Determine the appropriate value from the t-distribution with n - k - 1 = 100 - 1 - 1 = 98 degrees of freedom (k = 1 predictor) and a right-hand tail probability of $\alpha/2 = 0.10/2 = 0.05$.

Using R:

qt(0.95, df = 98) = 1.660551

Decision rule: Reject H0 if t ≥ 1.661 or if t ≤ -1.661. Otherwise, do not reject H0.

(4) Compute the test statistic

t = 0.27595 / 0.02851 ≈ 9.679

This matches the t value in the summary output, and its square agrees with the F statistic of 93.689 in the ANOVA table up to rounding.

(5) Conclusion

Since 9.679 > 1.661, we reject the null hypothesis H0. At the $\alpha = 0.10$ level there is significant evidence of a linear association between the number of meals with fish and the total mercury in mg/g (p-value = 6.01e-16).
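The t-test steps can be checked numerically with a short sketch like the one below. It uses only the slope estimate and standard error printed in the coefficient table of summary(m) in question 3; the variable names are illustrative.

```r
# Values copied from the coefficient table of summary(m) in question 3.
b1_hat <- 0.27595   # slope estimate
se_b1  <- 0.02851   # standard error of the slope
df     <- 98        # n - 2 with n = 100
alpha  <- 0.10

# Test statistic for H0: beta1 = 0.
t_stat <- b1_hat / se_b1

# Two-sided critical value and p-value from the t-distribution.
t_crit  <- qt(1 - alpha / 2, df = df)
p_value <- 2 * pt(abs(t_stat), df = df, lower.tail = FALSE)

# Decision: reject H0 when |t| is at least the critical value.
reject <- abs(t_stat) >= t_crit
print(c(t_stat = t_stat, t_crit = t_crit, p_value = p_value))
print(reject)
```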

The R-squared value is $R^2 = 0.699 \times 0.699 \approx 0.489$ (the summary output reports 0.4888). This is the coefficient of determination, and it ranges over [0, 1]. The closer it is to 1, the more of the variation in y is explained by the model and the better the model fits the data. In this case R-squared is 0.489, which means about 48.9% of the variation in total mercury is explained by the linear relationship with the number of meals with fish. The model explains roughly half of the variability, so the fit is moderate.
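As a cross-check, R-squared can also be recovered from the sums of squares in the ANOVA table. The sketch below uses the values printed by anova(m) and cor(my.data) above; the variable names are illustrative.

```r
# Sums of squares copied from the anova(m) output.
ssr <- 309.24   # regression (Number.of.meals.with.fish) sum of squares
sse <- 323.47   # residual sum of squares

# R-squared is the share of total variation explained by the regression.
r_squared <- ssr / (ssr + sse)

# In simple linear regression this equals the squared correlation coefficient.
r <- 0.6991094  # from cor(my.data)
print(c(from_anova = r_squared, from_r = r^2))
```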

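The 90% confidence interval for $\beta_1$ is calculated from the t-distribution using the equation below:

$\hat{\beta}_1 \pm t_{\alpha/2,\,n-2} \times SE(\hat{\beta}_1)$

Here $\hat{\beta}_1 = 0.27595$ and $SE(\hat{\beta}_1) = 0.02851$ from the coefficient table in question 3, and $t_{0.05,\,98} = 1.660551$ from qt(0.95, df = 98). A z-value with the sample standard deviation and n = 100 is not appropriate here, because that formula gives an interval for a mean, not for a regression slope.

As a result, the 90% confidence interval is $0.27595 \pm 1.660551 \times 0.02851 = 0.27595 \pm 0.04734 = [0.229, 0.323]$.

We are 90% confident that each additional meal with fish per week is associated with an increase in mean total mercury of between 0.229 and 0.323 mg/g. The interval does not contain 0, which is consistent with rejecting $H_0: \beta_1 = 0$ at the $\alpha = 0.10$ level.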
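A minimal sketch of the confidence interval calculation for the slope is shown below, using the estimate and standard error printed in the coefficient table of summary(m) in question 3. The variable names are illustrative.

```r
# Values copied from the coefficient table of summary(m) in question 3.
b1_hat <- 0.27595   # slope estimate
se_b1  <- 0.02851   # standard error of the slope
df     <- 98        # residual degrees of freedom
level  <- 0.90

# Critical value that leaves (1 - level) / 2 in each tail.
t_crit <- qt(1 - (1 - level) / 2, df = df)

# Confidence interval: estimate plus or minus critical value times standard error.
ci <- b1_hat + c(-1, 1) * t_crit * se_b1
print(round(ci, 3))
```

With the fitted model object m available, confint(m, level = 0.90) should report the same interval for the slope, along with one for the intercept.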
