STATISTICS
ANOVA, ANCOVA, REGRESSION, RSTUDIO
Directions: Answer the following questions inside the picture and choose the correct answer. Explain why it is the correct answer. All the details for each question are indicated in the picture for each number.
1.
Suppose that we want to build a model that predicts the group membership of a hurricane, either tropical (0) or non-tropical (1), based on the latitude of formation of the hurricane. The response variable is the binary variable Type.new (type of hurricane) and the predictor variable is FirstLat (first latitude). Using R, we build the model by applying the glm() function. For the logistic regression model, we specify family = "binomial". The data is available at https://userpage.fu-berlin.de/soga/200/2010_data_sets/hurricanes.xlsx. The R code is
#set up filename
my.filename

> wilcox.test(mpg ~ am, data=mtcars)
        Wilcoxon rank sum test with continuity correction
data:  mpg by am
W = 42, p-value = 0.001871
alternative hypothesis: true location shift is not equal to 0
Warning message:
In wilcox.test.default(x = c(21.4, 18.7, 18.1, 14.3, 24.4, 22.8,  :
  cannot compute exact p-value with ties
What is the correct interpretation of the data analysis using R?
a. At .05 significance level, we conclude that the gas mileage data of manual and automatic transmissions in mtcars are identical populations.
b. At .05 significance level, we conclude that the gas mileage data of manual and automatic transmissions in mtcars are not significantly different.
c. At .05 significance level, we conclude that the gas mileage data of manual and automatic transmissions in mtcars are nonidentical populations.
d. At .05 significance level, we conclude that the gas mileage data of manual and automatic transmissions in mtcars are significantly different.

A collection of data samples is independent if they come from unrelated populations and the samples do not affect each other. Using the Kruskal-Wallis test, we can decide whether the population distributions are identical without assuming that they follow the normal distribution. The built-in data set named airquality records the daily air quality measurements in New York from May to September 1973.
The ozone density is presented in the data frame column Ozone. Without assuming the data to have a normal distribution, test at the .05 significance level whether the monthly ozone density in New York has identical data distributions from May to September 1973. The R code and its output are given below.
R code and its output
> head(airquality)
  Ozone Solar.R Wind Temp Month Day
     41     190  7.4   67     5   1
     36     118  8.0   72     5   2
> kruskal.test(Ozone ~ Month, data = airquality)
        Kruskal-Wallis rank sum test
data:  Ozone by Month
Kruskal-Wallis chi-squared = 29.267, df = 4, p-value = 6.901e-06
Which of the following is the correct interpretation of the result of data analysis using R?
I. At .05 significance level, we conclude that the monthly ozone density in New York from May to September 1973 are nonidentical populations.
II. At .05 significance level, we conclude that the monthly ozone density in New York from May to September 1973 are identical populations.
III. At .05 significance level, we conclude that the monthly ozone density in New York from May to September 1973 are not significantly different.
a. I only
b. II only
c. III only
d. Both II and
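
Both rank-based questions above come down to the same decision rule: compare the reported p-value with the significance level and reject the null hypothesis of identical populations when the p-value is smaller. The sketch below shows one way to reproduce that comparison with the built-in mtcars and airquality data sets. The helper name decide and the variable names wt, kt and alpha are illustrative assumptions, not part of the original exercise, and exact = FALSE is added only to request the normal approximation directly instead of triggering the ties warning.

```r
# Significance level used in both questions.
alpha <- 0.05

# Two independent groups: Wilcoxon rank sum test of mpg by transmission type.
# exact = FALSE asks for the normal approximation (the data contain ties).
wt <- wilcox.test(mpg ~ am, data = mtcars, exact = FALSE)

# Five independent groups: Kruskal-Wallis test of Ozone by Month.
# Rows with missing Ozone values are dropped by the default na.action.
kt <- kruskal.test(Ozone ~ Month, data = airquality)

# Hypothetical helper: turn a p-value into a decision about H0
# (H0: the groups come from identical populations).
decide <- function(p_value, alpha) {
  if (p_value < alpha) {
    "reject H0: nonidentical populations"
  } else {
    "do not reject H0: no evidence against identical populations"
  }
}

cat("Wilcoxon rank sum: p =", signif(wt$p.value, 4), "->",
    decide(wt$p.value, alpha), "\n")
cat("Kruskal-Wallis:    p =", signif(kt$p.value, 4), "->",
    decide(kt$p.value, alpha), "\n")
```

Under this rule, the p-values reported in the outputs above (0.001871 and 6.901e-06) are both below .05, so in each case the null hypothesis of identical populations is rejected.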