Question
Opeele Kyinkafa sells a range of used 2005 GM cars in excellent condition. The retail price (Price) of a car depends on several characteristics: the number of miles the car has been driven (Mileage); the manufacturer (Make), which is one of Buick, Cadillac, Chevrolet, Pontiac, SAAB, or Saturn; the body type (Type), which is one of sedan, convertible, coupe, hatchback, or wagon; the number of cylinders in the engine (Cylinders); the engine size (Liter); cruise control, an indicator variable recording whether the car has cruise control (yes) or not (no); and Sound, an indicator variable recording whether the car has upgraded speakers (yes) or not (no). All cars in this data set were less than one year old when priced and were considered to be in excellent condition.
1. Which of the major variables are dummy variables, and which are dynamic?
2. Identify 3 potential unobserved heterogeneous factors in this dataset.
3. Display the full descriptive statistics of the data in R. Why do you think some variables are starred (*)?
4. What are the median of price, the standard deviation of mileage, the skewness of cylinder, and the mean of sound?
5. Test whether price is normally distributed using a histogram, a kernel density plot, and the Shapiro-Wilk test.
6. Create a red scatterplot matrix of price, mileage, cylinder, liter, and make. From your graph, is the correlation between price and mileage positive or negative?
7. What is the Pearson correlation coefficient between price and mileage, and between cylinder and liter, correct to 2 d.p.?
8. Create a correlation matrix of the dataset by finding the apt R command for converting all string/dummy variables.
9. From the matrix, identify 2 potentially good predictors.
10. Identify the 3 pairs of variables that are most multicollinear.
11. Which 2 pairs of variables are least correlated?
12. Identify the 3 pairs of variables most highly correlated with Cylinder.
13. Generate a simple regression plot of price on mileage with a red fitted line.
14. Run the whole multiple regression model (label the equation as "regprice") and use it to answer the remaining questions.
15. Specify the regression model for the whole data, keeping the exact names of the variables.
16. Write the estimated regression equation.
17. In one sentence, which variables are significant and which are not? Justify.
18. Is the regression model justifiable? Explain.
19. Interpret the coefficients of Mileage, MakeCadillac, TypeSedan, Cylinder, and Cruiseyes.
20. Compare the price of a car with 130,000 mileage, 6 cylinders, 3.8 liters, with cruise and sound, manufactured by Pontiac with a convertible body type, with the price of a car with 4 cylinders, 120,000 mileage, 3.8 liters, without cruise or sound, manufactured by Cadillac with a hatchback body type. Which factor(s) appear to be causing the difference (if any)?
21. Using only the VIF, which one of the variables in the pairs selected as multicollinear may be deleted? Justify.
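The R steps the question asks for could be sketched roughly as below, using base R only. The data set itself is not shown, so the data frame `cars` here is simulated and every number in it is made up purely so the code runs; the column names `Cylinder`, `Cruise`, and `Sound` and the capitalized factor levels are assumptions inferred from the coefficient names quoted in the question (MakeCadillac, TypeSedan, Cruiseyes). The helper functions `skew` and `vif_one` are hypothetical names introduced for illustration. The starred variables most likely come from `psych::describe()`, which flags non-numeric columns it has converted to numeric codes, and a package function such as `car::vif()` is probably the intended VIF tool; neither is called here so that the sketch needs no packages.

```r
# ASSUMPTION: simulated stand-in data. Replace `cars` with the real data set.
set.seed(1)
n <- 200
makes <- c("Buick", "Cadillac", "Chevrolet", "Pontiac", "SAAB", "Saturn")
types <- c("Convertible", "Coupe", "Hatchback", "Sedan", "Wagon")
cars <- data.frame(
  Mileage  = round(runif(n, 500, 50000)),
  Make     = factor(sample(makes, n, replace = TRUE), levels = makes),
  Type     = factor(sample(types, n, replace = TRUE), levels = types),
  Cylinder = sample(c(4, 6, 8), n, replace = TRUE),
  Cruise   = factor(sample(c("no", "yes"), n, replace = TRUE), levels = c("no", "yes")),
  Sound    = factor(sample(c("no", "yes"), n, replace = TRUE), levels = c("no", "yes"))
)
cars$Liter <- round(0.6 * cars$Cylinder + rnorm(n, sd = 0.3), 1)
cars$Price <- 12000 - 0.15 * cars$Mileage + 2500 * cars$Cylinder + rnorm(n, sd = 2000)

# Descriptive statistics. Skewness here is the third central moment divided
# by sd^3, one common definition; packages differ slightly.
summary(cars)
skew <- function(x) mean((x - mean(x))^3) / sd(x)^3
stats <- c(
  median_price  = median(cars$Price),
  sd_mileage    = sd(cars$Mileage),
  skew_cylinder = skew(cars$Cylinder),
  mean_sound    = mean(as.numeric(cars$Sound))  # factor codes: no = 1, yes = 2
)

# Normality checks for Price: histogram, kernel density, Shapiro-Wilk test.
hist(cars$Price, main = "Histogram of Price")
plot(density(cars$Price), main = "Kernel density of Price")
sw <- shapiro.test(cars$Price)

# Red scatterplot matrix; pairs() converts the factor Make to integer codes.
pairs(cars[, c("Price", "Mileage", "Cylinder", "Liter", "Make")], col = "red")

# Pearson correlations to 2 d.p. data.matrix() turns factor columns into
# integer codes so that cor() accepts the whole data frame.
r_pm    <- round(cor(cars$Price, cars$Mileage), 2)
r_cl    <- round(cor(cars$Cylinder, cars$Liter), 2)
cor_mat <- round(cor(data.matrix(cars)), 2)

# Simple regression of Price on Mileage with a red fitted line.
plot(Price ~ Mileage, data = cars)
abline(lm(Price ~ Mileage, data = cars), col = "red")

# Full multiple regression, labelled "regprice" as the question requests.
regprice <- lm(Price ~ Mileage + Make + Type + Cylinder + Liter + Cruise + Sound,
               data = cars)
summary(regprice)

# Predicted prices for the two cars described in the question.
new_cars <- data.frame(
  Mileage = c(130000, 120000), Cylinder = c(6, 4), Liter = c(3.8, 3.8),
  Cruise = c("yes", "no"), Sound = c("yes", "no"),
  Make = c("Pontiac", "Cadillac"), Type = c("Convertible", "Hatchback")
)
pred <- predict(regprice, newdata = new_cars)

# VIF by hand for a numeric predictor: 1 / (1 - R^2), where R^2 comes from
# regressing that predictor on all the other predictors in the model.
predictors <- c("Mileage", "Make", "Type", "Cylinder", "Liter", "Cruise", "Sound")
vif_one <- function(v) {
  others <- setdiff(predictors, v)
  fit <- lm(reformulate(others, response = v), data = cars)
  1 / (1 - summary(fit)$r.squared)
}
vifs <- c(Cylinder = vif_one("Cylinder"), Liter = vif_one("Liter"))
```

Because the data are simulated, none of the printed values answer the question; only the commands carry over. With the real data, inspect `stats`, `sw`, `cor_mat`, `summary(regprice)`, `pred`, and `vifs` to fill in the answers, and note that a mileage of 130,000 may lie well outside the range the model was fitted on.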