Question

Case Introduction

Early this year, you joined the Business Intelligence Team of AutoMax, a used-car retailer in Salisbury, MD, as a data analyst. The Team has been working on statistical models that can help explain the dynamics of the used-car industry and, in particular, provide insights into pricing strategy. Your company recently obtained a sample dataset of used-car sales records, and your team is now tasked with exploring the dataset. The following table summarizes variables included in the dataset.

Guidelines

Show your work.

Use α = 0.05 unless otherwise noted.

Always report relevant measures to justify your statement/conclusion.

PART 1: Descriptive Statistical Analysis 

1.1 Summary Statistics

1) Identify the following measures for Price and Mileage.

2) Report box plot (Box & Whisker in Excel) for Price and Mileage.

3) Based on the above measures, we can roughly picture the shape of each variable's distribution. Which distribution is wider and flatter, Price or Mileage? Explain why.

1.2 Crosstabulation

1) Complete the crosstabulation for Price and HP. (Hint: The class width for Price is 10,000. The class width for HP is 20.)

2) Based on the table, discuss the potential relation and/or pattern between Price and HP.
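The binning behind the crosstabulation can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class boundaries are assumed to start at multiples of the class width (the original table may use different starting points), the helper names `class_lower_bound` and `crosstab` are made up for this sketch, and the sample values are hypothetical placeholders, not records from the AutoMax dataset.

```python
from collections import Counter

def class_lower_bound(value, width):
    """Lower bound of the class containing value, assuming classes start at multiples of width."""
    return int(value // width) * width

def crosstab(prices, hps, price_width=10_000, hp_width=20):
    """Count observations in each (Price class, HP class) cell."""
    counts = Counter()
    for price, hp in zip(prices, hps):
        cell = (class_lower_bound(price, price_width), class_lower_bound(hp, hp_width))
        counts[cell] += 1
    return counts

# Hypothetical placeholder values, NOT taken from the AutoMax dataset.
prices = [8_500, 12_900, 14_200, 21_000, 9_900]
hps = [69, 90, 110, 116, 72]

table = crosstab(prices, hps)
for (price_lo, hp_lo), n in sorted(table.items()):
    print(f"Price {price_lo}-{price_lo + 10_000 - 1}, HP {hp_lo}-{hp_lo + 20 - 1}: {n}")
```

If the counts concentrate along the diagonal (low Price with low HP, high Price with high HP), that suggests a positive relation between the two variables.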

1.3 Scatter Diagram and Correlation

1) Report (1) scatter diagram and (2) correlation coefficient of Price and Age_Months.

2) Report (1) scatter diagram and (2) correlation coefficient of Price and Mileage.

3) Based on the comparison, which variable, Age_Months or Mileage, has the stronger correlation with Price? Which variable is the better indicator of Price? Explain why.
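For reference, the correlation coefficient asked for here is the Pearson sample correlation. A minimal sketch follows; the function name `pearson_r` is chosen for this illustration and the three short lists are hypothetical placeholders, not values from the dataset. In Excel the same quantity is returned by CORREL.

```python
import math

def pearson_r(xs, ys):
    """Pearson sample correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical placeholder values, NOT taken from the AutoMax dataset.
price = [10, 8, 7, 5]
age_months = [1, 2, 3, 4]
mileage = [1, 3, 2, 4]

r_age = pearson_r(age_months, price)
r_mileage = pearson_r(mileage, price)
print(f"r(Price, Age_Months) = {r_age:.4f}")
print(f"r(Price, Mileage)    = {r_mileage:.4f}")
```

The variable whose coefficient has the larger absolute value has the stronger linear relation with Price; the sign only gives the direction.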

1.4 Normal Distribution

The variable Age_Months follows a normal distribution with a mean of 56 and a standard deviation of 20. Answer the following questions. (You do not need to work with the dataset for this question.)

1) What is the probability that a randomly selected used car is older than 7 years?

2) What are the minimum and maximum ages (in months) of the middle 95% of the used cars?

3) Compute the percentage of used cars ranging from 3 years to 7 years.
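One way to check these three calculations is with Python's standard-library `statistics.NormalDist`. The sketch below uses only the stated mean (56) and standard deviation (20) and converts years to months (7 years = 84 months, 3 years = 36 months). It assumes "middle 95%" means the central interval with 2.5% in each tail.

```python
from statistics import NormalDist

# Age_Months ~ Normal(mean=56, sd=20), as stated in the question.
age = NormalDist(mu=56, sigma=20)

# 1) P(Age > 7 years), with 7 years = 84 months.
p_older_than_7y = 1 - age.cdf(7 * 12)

# 2) Bounds of the middle 95%: the 2.5th and 97.5th percentiles.
lower_95 = age.inv_cdf(0.025)
upper_95 = age.inv_cdf(0.975)

# 3) P(3 years <= Age <= 7 years), i.e. between 36 and 84 months.
share_3y_to_7y = age.cdf(7 * 12) - age.cdf(3 * 12)

print(f"P(older than 7 years) = {p_older_than_7y:.4f}")
print(f"Middle 95%: {lower_95:.1f} to {upper_95:.1f} months")
print(f"Share between 3 and 7 years = {share_3y_to_7y:.4f}")
```

The corresponding z-scores are (84 - 56) / 20 = 1.4 and (36 - 56) / 20 = -1, so the same results can be read from a standard normal table.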

PART 2: Inferential Statistical Analysis 

2.1 Hypothesis Testing: Proportion

An industry report claims that over 80% of cars are equipped with an Anti-lock Brake System (ABS). Using the sample dataset, determine whether the proportion of used cars with ABS is greater than 80%.

1) Develop the null and alternative hypotheses.

2) Report the test statistic.

3) What is the p-value? What is your conclusion at the 0.05 level of significance? How would you explain your result in relation to the claim of the industry report?
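The test described above is an upper-tail one-sample z-test for a proportion, with H0: p ≤ 0.80 against Ha: p > 0.80. A minimal sketch is shown below. The function name `proportion_z_test_upper` is made up for this illustration, and the counts passed to it (85 cars with ABS out of 100) are hypothetical placeholders that must be replaced with the counts from the actual dataset.

```python
import math
from statistics import NormalDist

def proportion_z_test_upper(successes, n, p0):
    """Upper-tail z-test for H0: p <= p0 vs Ha: p > p0.

    Returns the sample proportion, the z statistic, and the p-value.
    The standard error uses p0, as is standard under the null hypothesis.
    """
    p_hat = successes / n
    standard_error = math.sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / standard_error
    p_value = 1 - NormalDist().cdf(z)
    return p_hat, z, p_value

# Hypothetical counts, NOT taken from the AutoMax dataset.
p_hat, z, p_value = proportion_z_test_upper(successes=85, n=100, p0=0.80)
print(f"p_hat = {p_hat:.3f}, z = {z:.3f}, p-value = {p_value:.4f}")
print("Reject H0" if p_value < 0.05 else "Do not reject H0")
```

Rejecting H0 would support the report's claim that more than 80% of cars have ABS; failing to reject means the sample does not provide sufficient evidence for it.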

2.2 Hypothesis Testing: Two Population Means

Your company would like to know whether the average price of sport models is equal to that of non-sport models. (A used car is identified as a sport model if the variable Sport_Model takes the value 1, and as a non-sport model if it takes the value 0.) (Hint: the standard deviations of the two populations are unknown.)

1) Develop the null and alternative hypotheses.

2) Identify the average price and variance for sport models and non-sport models.

3) Report the test statistic.

4) What is the p-value? What is your conclusion at the 0.05 level of significance?
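With both population standard deviations unknown, one common choice is the two-sample t-test that does not assume equal variances (Welch's test); whether the course expects this or the pooled-variance version is an assumption here. The sketch below computes the group means, sample variances, t statistic, and Welch-Satterthwaite degrees of freedom. The function name `welch_t` and the price lists are hypothetical placeholders, not values from the dataset.

```python
import math
from statistics import mean, variance

def welch_t(sample_a, sample_b):
    """t statistic and Welch-Satterthwaite degrees of freedom for H0: mu_a = mu_b."""
    n_a, n_b = len(sample_a), len(sample_b)
    # statistics.variance is the sample variance (divides by n - 1).
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t, df

# Hypothetical prices, NOT taken from the AutoMax dataset.
sport_prices = [11_500, 12_900, 10_800, 13_400, 12_100]
non_sport_prices = [9_900, 10_400, 11_200, 9_500, 10_100, 10_800]

t_stat, df = welch_t(sport_prices, non_sport_prices)
print(f"mean (sport) = {mean(sport_prices):.1f}, variance = {variance(sport_prices):.1f}")
print(f"mean (non-sport) = {mean(non_sport_prices):.1f}, variance = {variance(non_sport_prices):.1f}")
print(f"t = {t_stat:.4f}, df = {df:.2f}")
```

The two-tailed p-value then comes from a t distribution with these degrees of freedom, for example via Excel's T.DIST.2T applied to the absolute value of t. The Python standard library has no t-distribution function, so that step is left out of the sketch.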

2.3 Hypothesis Testing: Test of Independence

One day, your manager challenges you with the question: Are the number of doors (variable: Doors) and sport model status (variable: Sport_Model) independent of each other?

1) Report the observed frequency table.

2) Report the expected frequency table.

3) What is the p-value? What is your conclusion at the 0.05 level of significance?
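The expected frequency table follows directly from the observed one: each expected count is (row total × column total) / grand total. The sketch below shows that step and the resulting chi-square statistic. The function names are chosen for this illustration, and the 2×2 observed table is a hypothetical placeholder; the real table depends on how many distinct values Doors takes in the dataset.

```python
def expected_frequencies(observed):
    """Expected counts under independence: row total * column total / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    return [[r * c / total for c in col_totals] for r in row_totals]

def chi_square_statistic(observed):
    """Chi-square statistic and degrees of freedom for a test of independence."""
    expected = expected_frequencies(observed)
    stat = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return stat, df

# Hypothetical observed counts, NOT taken from the AutoMax dataset.
# Rows: Doors categories; columns: Sport_Model = 0, Sport_Model = 1.
observed = [[10, 20],
            [30, 40]]

expected = expected_frequencies(observed)
stat, df = chi_square_statistic(observed)
print("Expected:", expected)
print(f"chi-square = {stat:.4f}, df = {df}")
```

The p-value is the upper-tail area of the chi-square distribution with `df` degrees of freedom, which Excel returns from CHISQ.DIST.RT. If the p-value is below 0.05, reject the hypothesis that Doors and Sport_Model are independent.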

PART 3: Regression Analysis 

In this part, the dependent variable (Y) is Price, with the remaining variables as independent variables (X). Use α = 0.05. For the following questions, you need to explore (and experiment with) independent variables to find the best-performing models. Show your work.

3.1 Simple Linear Regression Analysis

Your team wants to identify the single variable that is most powerful in predicting the price of used cars. Find the variable, and the corresponding simple linear regression model, that yields the highest R square. (Hint: Compared with a categorical variable, a continuous variable in general has a better chance.)

1) Report the result tables: Summary Output, F test, and t test.

2) Determine whether the model is statistically significant, and comment on the goodness of fit of the model.

3) Report the p-value associated with the variable and interpret the coefficient.
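The search for the best single predictor amounts to fitting one simple regression per candidate variable and comparing R square. A minimal sketch with least-squares formulas is shown below; the function names `simple_ols` and `best_single_predictor` are made up for this illustration, and the data are hypothetical placeholders. It does not produce the F test or t test tables, which Excel's Regression tool reports.

```python
def simple_ols(xs, ys):
    """Least-squares slope, intercept, and R square for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_square = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_square

def best_single_predictor(columns, ys):
    """Name of the column whose simple regression on ys has the highest R square."""
    return max(columns, key=lambda name: simple_ols(columns[name], ys)[2])

# Hypothetical placeholder values, NOT taken from the AutoMax dataset.
price = [10, 8, 7, 5]
candidates = {
    "Age_Months": [1, 2, 3, 4],
    "Mileage": [1, 3, 2, 4],
}

best = best_single_predictor(candidates, price)
slope, intercept, r_square = simple_ols(candidates[best], price)
print(f"Best predictor: {best}, Price = {intercept:.3f} + {slope:.3f} * {best}, R^2 = {r_square:.4f}")
```

Which variable wins on these toy numbers says nothing about the real data. The slope is interpreted as the expected change in Price for a one-unit increase in the predictor.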

3.2 Multiple Linear Regression Analysis

Now, the team is asked to develop a multiple linear regression model, with 5 independent variables, that can predict price of used cars. The objective is to identify the set of 5 variables (out of 16 independent variables) that yields the highest Adjusted R square. (Hint: you need to experiment with different sets of 5 variables to find out the best performing model (with the highest adjusted R square))

1) List the 5 independent variables in your final (best) model.

2) Report the result tables: Summary Output, F test, and t test.

3) Determine whether the model is statistically significant, and comment on the goodness of fit of the model.

4) List the variables that are statistically significant and those that are not. Interpret the coefficient of each of the five variables.
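Choosing 5 of 16 variables gives 4,368 possible models, so an exhaustive comparison is feasible if the fitting step is automated. The sketch below shows the adjusted R square formula and the shape of such a search. The `score` argument is a hypothetical callable that would fit the regression for a given set of variables and return its adjusted R square; the fitting itself (Excel's Regression tool or a statistics library) is not shown, and the function names are made up for this illustration.

```python
import math
from itertools import combinations

def adjusted_r2(r2, n, k):
    """Adjusted R square for a model with k predictors fitted on n observations."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

def best_subset(candidates, k, score):
    """Return the k-variable subset with the highest score.

    score is a hypothetical callable: it takes a tuple of variable names,
    fits the model, and returns the model's adjusted R square.
    """
    return max(combinations(candidates, k), key=score)

# Number of distinct 5-variable models drawn from 16 candidates.
n_models = math.comb(16, 5)
print(f"Models to compare: {n_models}")

# Illustration of the penalty: the same R square with more predictors adjusts lower.
print(f"{adjusted_r2(0.90, n=101, k=5):.4f}")
print(f"{adjusted_r2(0.90, n=101, k=10):.4f}")
```

Adjusted R square is used instead of R square because R square never decreases when a variable is added, while the adjusted version penalizes predictors that add little explanatory power.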
