Question

Suppose you are a data scientist working for a university. During the summer, the university considers potential learners to partake in a summer program. The following dataset represents the data collected by the university as well as the "Admit" label (with 0 as do not admit and 1 as admit).
Consider the following dataset and the visualization (created in R using corrplot). The visualization shows the Pearson correlation coefficient (r) for each pair of variables in the dataset.
Note: The code used to generate the correlation plot is:
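library(corrplot)
library(RColorBrewer)

# Read the admissions data; the outer parentheses also print the result
(AdmitData5D_Bi <- read.csv("AdmissionDataForLogReg5D.csv"))
# Pairwise Pearson correlations between all columns
(CorrelationMatrix <- cor(AdmitData5D_Bi))
# Lower triangle only, shown as numbers, without a color legend.
# The "Blues" palette has at most 9 colors, so n = 9 avoids a warning.
corrplot(CorrelationMatrix, method = "number", order = "hclust",
         diag = FALSE, type = "lower", cl.pos = "n",
         col = brewer.pal(n = 9, name = "Blues"))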
Admit  GPA  PreScore  VolunteerHours  Height
0      2.5  5         1               65
0      1.8  4.5       0               70
1      3.7  8.3       8               66
1      3.4  9.2       5               73
0      2.5  4.9       2               62
1      3.8  7.9       11              67
1      3.3  8.8       8               74
0      2.2  3.2       1               75
0      1.2  6         3               76
1      3.5  9.2       5               71
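The CSV file itself is not included with the question, so the following is only a sketch that enters the ten rows by hand, assuming each row is read as five values in the order Admit, GPA, PreScore, VolunteerHours, Height. The names admit_data and correlation_matrix are illustrative and do not come from the original code. Under that reading, the rounded output should agree with the plot, for example 0.07 for Admit and Height.

```r
# Data transcribed from the table in the question (10 applicants, 5 columns).
# Assumption: each row splits into Admit, GPA, PreScore, VolunteerHours, Height.
admit_data <- data.frame(
  Admit          = c(0, 0, 1, 1, 0, 1, 1, 0, 0, 1),
  GPA            = c(2.5, 1.8, 3.7, 3.4, 2.5, 3.8, 3.3, 2.2, 1.2, 3.5),
  PreScore       = c(5, 4.5, 8.3, 9.2, 4.9, 7.9, 8.8, 3.2, 6, 9.2),
  VolunteerHours = c(1, 0, 8, 5, 2, 11, 8, 1, 3, 5),
  Height         = c(65, 70, 66, 73, 62, 67, 74, 75, 76, 71)
)

# Pearson correlations (the default method of cor()), rounded to two
# decimals so they can be compared with the numbers in the plot.
correlation_matrix <- round(cor(admit_data), 2)
print(correlation_matrix)
```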
Correlation plot (lower triangle, r values):
                Height  Admit  PreScore  GPA
Admit            0.07
PreScore         0.10   0.94
GPA             -0.29   0.90   0.76
VolunteerHours  -0.04   0.86   0.78      0.79
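Reading a lower-triangle plot is error-prone, because each number belongs to the pair formed by its row label and its column label. As a hedged illustration, the sketch below rebuilds the full symmetric matrix from the ten values printed in the plot and looks up individual pairs by name. The object names (vars, r, with_admit, ranked) are hypothetical helpers, not part of the original code.

```r
# Variable order as it appears along the diagonal of the plot.
vars <- c("Height", "Admit", "PreScore", "GPA", "VolunteerHours")

# Start from an identity matrix (every variable correlates 1 with itself).
r <- diag(length(vars))
dimnames(r) <- list(vars, vars)

# Fill the lower triangle column by column with the values from the plot.
r[lower.tri(r)] <- c(0.07, 0.10, -0.29, -0.04,  # column Height
                     0.94, 0.90, 0.86,          # column Admit
                     0.76, 0.78,                # column PreScore
                     0.79)                      # column GPA

# Mirror the lower triangle into the upper triangle.
r[upper.tri(r)] <- t(r)[upper.tri(r)]

# Look up specific pairs by name instead of by position.
r["PreScore", "Admit"]
r["VolunteerHours", "Height"]

# Correlation of each independent variable with Admit, strongest first.
with_admit <- r["Admit", setdiff(vars, "Admit")]
ranked <- with_admit[order(abs(with_admit), decreasing = TRUE)]
print(ranked)
```

Looking pairs up by name makes it harder to attach a value to the wrong pair of variables.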
Choose the statement that is most accurate.
Question 4 options:
a) Because the dependent variable (the label called Admit) is binomial (0 or 1), you would likely elect to apply multiple linear regression to create a model for whether students might be admitted to the program or not.
b) The linear correlation between PreScore and Admit is .10 and similarly the linear correlation between VolunteerHours and Height is .79.
c) You know, as a data scientist, that one way to reduce the complexity of a model is to increase the dimensionality (the number of columns or variables) in a dataset.
d) To improve the model and reduce the complexity of the model you elect to eliminate Height from the dataset as the best variable to remove.
e) By looking at the correlation matrix visualization, you can see that all of the independent variables are strongly correlated with the dependent variable "Admit".
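Several of the options concern dimensionality, meaning the number of columns in the dataset. The sketch below shows one common heuristic, offered as an assumption rather than the method intended by the question: find the independent variable whose correlation with Admit is closest to zero and drop it, which reduces the column count by one. The data frame is entered by hand from the table in the question, and the names admit_data, r_with_admit, weakest and reduced are illustrative.

```r
# Data transcribed from the table in the question (assumed column split).
admit_data <- data.frame(
  Admit          = c(0, 0, 1, 1, 0, 1, 1, 0, 0, 1),
  GPA            = c(2.5, 1.8, 3.7, 3.4, 2.5, 3.8, 3.3, 2.2, 1.2, 3.5),
  PreScore       = c(5, 4.5, 8.3, 9.2, 4.9, 7.9, 8.8, 3.2, 6, 9.2),
  VolunteerHours = c(1, 0, 8, 5, 2, 11, 8, 1, 3, 5),
  Height         = c(65, 70, 66, 73, 62, 67, 74, 75, 76, 71)
)

# Correlation of every independent variable with the label Admit.
predictors <- setdiff(names(admit_data), "Admit")
r_with_admit <- sapply(admit_data[predictors],
                       function(x) cor(x, admit_data$Admit))

# The predictor with the smallest absolute correlation carries the least
# linear information about Admit.
weakest <- names(which.min(abs(r_with_admit)))

# Dropping a column lowers the dimensionality; adding one would raise it.
reduced <- admit_data[, setdiff(names(admit_data), weakest)]
cat("Dropped:", weakest, "\n")
cat("Columns before:", ncol(admit_data), "after:", ncol(reduced), "\n")
```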
