1. Which of the following is NOT true about linear regression?

Linear regression is used to predict new values of the target variable.

Linear regression allows us to predict new values of the independent variable.

Linear regression allows us to model how the target variable changes with the independent variables.

In linear regression, the target variable is a continuous quantity.

2. The ordinary least squares (OLS) algorithm ________________ .

Maximizes the sum of squared residuals

Minimizes the sum of squared residuals

Minimizes the square of the sum of residuals

Maximizes the square of the sum of residuals
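
To make the quantities in question 2 concrete, here is a minimal sketch of a closed-form simple linear regression fit in plain Python. The data points and the function names `ols_fit` and `sum_squared_residuals` are illustrative assumptions, not part of the original question. The sketch compares the sum of squared residuals at the fitted line with the value obtained after nudging the slope.

```python
def ols_fit(xs, ys):
    """Closed-form simple linear regression; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def sum_squared_residuals(xs, ys, slope, intercept):
    """Sum over all points of (observed - predicted) squared."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))


# Made-up data for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = ols_fit(xs, ys)
best = sum_squared_residuals(xs, ys, slope, intercept)
nudged = sum_squared_residuals(xs, ys, slope + 0.1, intercept)

print(f"fitted line: y = {slope:.2f}x + {intercept:.2f}")
print(f"sum of squared residuals at the fit: {best:.4f}")
print(f"sum of squared residuals with a nudged slope: {nudged:.4f}")
```

Moving the slope or intercept away from the fitted values in either direction should give a larger sum of squared residuals on the same data.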

3. Overfitting occurs when _____________.

Our model becomes too specific to the training data

The sum of squared residuals is too large

The average of the errors is positive

Our model does not have enough complexity

4. Using multiple linear regression to add in more independent variables ___________.

reduces the overfitting of the data

allows us to add more observational data to the model

allows us to fit a non-linear model to the data

can help explain more variation in the target variable

5. Multicollinearity is the phenomenon where _________________.

the target variable is strongly correlated with an independent variable

the target variable is strongly correlated with the residuals

the independent variables are strongly correlated with the residuals

the independent variables are strongly correlated with other independent variables
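
One common way to screen for the kind of correlation described in question 5 is to compute pairwise Pearson correlations between candidate predictors. The sketch below uses plain Python; the predictor names and values (`square_feet`, `rooms`, `age_years`) are hypothetical and chosen only for illustration.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


# Hypothetical predictors for a house-price model.
square_feet = [800, 1000, 1200, 1500, 1800]
rooms = [2, 3, 3, 4, 5]
age_years = [30, 5, 22, 12, 18]

r_size_rooms = pearson(square_feet, rooms)
r_size_age = pearson(square_feet, age_years)

print(f"square_feet vs rooms:     r = {r_size_rooms:.2f}")
print(f"square_feet vs age_years: r = {r_size_age:.2f}")
```

In this made-up data, floor area and room count move together closely, while floor area and age do not.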

6. Which of the following is NOT an assumption of ordinary least squares (OLS)?

Linearity

Endogeneity

Random Sampling

Homoscedasticity of Errors

7. Which assumption of OLS assumes that there is no correlation between the error and the independent variables?

Zero Mean Errors

Multicollinearity

Autocorrelation of Errors

Endogeneity

8. A regression analysis between sales (S) (in $1000) and price (P) (in dollars) resulted in the following equation: S = 50,000 - 8P

The above equation implies that an ___________.

increase of $1 in price is associated with a decrease of $8 in sales

increase of $1 in price is associated with a decrease of $42,000 in sales

increase of $1 in price is associated with a decrease of $8000 in sales

increase of $8 in price is associated with an increase of $8,000 in sales
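
The unit bookkeeping in question 8 can be checked with a few lines of arithmetic. This is a small sketch that takes the equation and units exactly as stated (S in thousands of dollars, P in dollars); the function name `predicted_sales_thousands` and the two example prices are assumptions for illustration.

```python
def predicted_sales_thousands(price_dollars):
    """S = 50,000 - 8P, with S measured in thousands of dollars."""
    return 50_000 - 8 * price_dollars


# Change in predicted sales when price rises by $1 (from $10 to $11).
delta_thousands = predicted_sales_thousands(11) - predicted_sales_thousands(10)
delta_dollars = delta_thousands * 1000  # convert thousands of dollars to dollars

print(delta_thousands)  # -8
print(delta_dollars)    # -8000
```

Because the model is linear, the same change applies for a $1 increase starting from any price.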

9. Suppose we build a model to predict a store's sales with three independent variables: customers per day, average daily temperature, and number of products available. If we calculate the p-values for these variables as below, which variables are significant and should be kept in the model? Select all that apply.

Variable                              p-value
Customers per day (I)                 0.0
Average daily temperature (II)        0.54
Number of products available (III)    0.03

Variable I

Variable II

Variable III
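
The comparison in question 9 can be sketched as a threshold check. The question does not state a significance level, so the 0.05 cutoff below is an assumption (it is a common convention, not something the source specifies). The p-values are copied from the table as reported.

```python
# Assumed significance level; the question itself does not state one.
ALPHA = 0.05

p_values = {
    "Customers per day (I)": 0.0,
    "Average daily temperature (II)": 0.54,
    "Number of products available (III)": 0.03,
}

# Keep the variables whose p-value falls below the chosen threshold.
significant = [name for name, p in p_values.items() if p < ALPHA]

for name, p in p_values.items():
    verdict = "significant" if p < ALPHA else "not significant"
    print(f"{name}: p = {p} -> {verdict} at alpha = {ALPHA}")
```

A stricter threshold, such as 0.01, would change which variables pass, so the conclusion depends on the level chosen.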

10. Suppose we have produced a simple linear regression model with the following form: y = 0.65x + 2.9

We then calculate the coefficient of determination as 0.92 and a p-value of 0.1. Which of the following best describes our model?

The model explains a high amount of variance, and the slope is statistically significant

The model explains a low amount of variance, but the slope is statistically significant

The model explains a low amount of variance but the slope is statistically significant

The model explains a high amount of variance but the slope is statistically insignificant

11. Which of the following evaluation metrics is relative to the total error?

Root mean square error

Coefficient of determination

Mean square error

Mean absolute error
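
The four metrics listed in question 11 can be computed side by side to see how each one responds to the scale of the target. This is a minimal sketch in plain Python; the function name `regression_metrics` and the sample values are illustrative assumptions.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MAE, MSE, RMSE, and the coefficient of determination (R^2)."""
    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(r) for r in residuals) / n
    mse = sum(r * r for r in residuals) / n
    rmse = math.sqrt(mse)
    mean_y = sum(y_true) / n
    ss_res = sum(r * r for r in residuals)          # unexplained variation
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)  # total variation
    r2 = 1.0 - ss_res / ss_tot
    return {"mae": mae, "mse": mse, "rmse": rmse, "r2": r2}


# Made-up observations and predictions.
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.5, 5.5, 7.0, 9.5]

original = regression_metrics(y_true, y_pred)

# Same data expressed in units 1000 times smaller (e.g. dollars vs. $1000).
scaled = regression_metrics([t * 1000 for t in y_true],
                            [p * 1000 for p in y_pred])

print(original)
print(scaled)
```

Rescaling the target changes MAE, MSE, and RMSE, while the ratio `ss_res / ss_tot` inside R² is unaffected because both sums scale by the same factor.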

12. Which method of regression produces a probability distribution as opposed to a point estimate?

Poisson Regression

Logistic Regression

Bayesian Regression

LASSO Regression

13. You are given a dataset of air pollution readings from several locations in an urban setting. The measurements are taken every hour and include information about traffic flow. To perform regression on this longitudinal data, what kind of regression technique would you use?

Polynomial Regression

Log-Log Regression

Repeated Measures Regression

LASSO Regression

14. You are working with customer data from a large video-on-demand provider, which contains numerical fields with information such as average number of hours watched per month, number of logins per month, time spent browsing per month, etc. In this data, there is a flag that indicates whether the customer canceled the service or not (1 for yes, 0 for no). You are looking to build a model from this data to classify which current customers will cancel. What type of model would you use?

Logistic Regression

Random Effects

Bayesian Regression

Poisson Regression
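
For a binary flag like the one in question 14, a logistic model passes a linear score through the sigmoid function to get a probability, which is then thresholded into a 0/1 prediction. The sketch below shows only that prediction step. The coefficients are hypothetical placeholders rather than values fitted to any data, and the function names and the 0.5 threshold are assumptions for illustration.

```python
import math


def sigmoid(z):
    """Map any real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical coefficients; a real model would estimate these from data.
INTERCEPT = 2.0
COEF_HOURS = -0.15   # per hour watched per month
COEF_LOGINS = -0.10  # per login per month


def cancel_probability(hours_watched, logins):
    """Probability that the cancellation flag equals 1."""
    score = INTERCEPT + COEF_HOURS * hours_watched + COEF_LOGINS * logins
    return sigmoid(score)


def predict_cancel(hours_watched, logins, threshold=0.5):
    """Turn the probability into a 0/1 class label."""
    return 1 if cancel_probability(hours_watched, logins) >= threshold else 0


light_user = cancel_probability(2, 1)    # little engagement
heavy_user = cancel_probability(40, 20)  # high engagement

print(f"light user: p(cancel) = {light_user:.3f}, label = {predict_cancel(2, 1)}")
print(f"heavy user: p(cancel) = {heavy_user:.3f}, label = {predict_cancel(40, 20)}")
```

With these placeholder coefficients, more engagement lowers the predicted cancellation probability; fitted coefficients could differ in sign and size.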
