Question:
Sales of Toyota Corolla Cars. The file ToyotaCorolla.csv contains data on used cars (Toyota Corollas) on sale during late summer of 2004 in the Netherlands. It has 1436 records containing details on 38 attributes, including Price, Age, Kilometers, Horsepower (HP), and other specifications. The goal will be to predict the price of a used Toyota Corolla based on its specifications.
a. Identify an outlier observation, possibly caused by a data entry error. This can be done by analyzing the data summary as shown in Figure 4.1. Filter out this observation for the subsequent analysis.
b. Identify the categorical attributes.
c. Explain the relationship between a categorical attribute and the series of numeric attributes with dummy coding derived from it.
d. How many numeric attributes with dummy coding are required to capture the information in a categorical attribute with N categories?
e. Use RapidMiner to convert the categorical attributes in this dataset into numeric attributes with dummy coding, and explain in words, for one record, the values in the derived numeric dummy attributes.
f. Use RapidMiner to produce a correlation matrix along with a heatmap. Comment on the relationships among attributes.
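The dummy coding referred to in parts c to e can be illustrated outside RapidMiner. The following is a minimal Python sketch, not the RapidMiner workflow the exercise asks for. The records are invented toy values, the attribute name Fuel_Type is assumed for illustration, and dummy_code is a hypothetical helper written for this example. It shows a categorical attribute with N = 3 categories expanded into N indicator attributes, and the reduced form with N - 1 attributes, where the dropped category is implied when all remaining dummies are 0.

```python
# Toy records, invented for illustration; not taken from ToyotaCorolla.csv.
records = [
    {"Price": 13500, "Fuel_Type": "Diesel"},
    {"Price": 13750, "Fuel_Type": "Petrol"},
    {"Price": 16900, "Fuel_Type": "CNG"},
    {"Price": 18600, "Fuel_Type": "Petrol"},
]


def dummy_code(rows, attribute, drop_reference=False):
    """Replace a categorical attribute with 0/1 indicator attributes.

    Hypothetical helper for this sketch, not a RapidMiner operator.
    """
    categories = sorted({row[attribute] for row in rows})
    if drop_reference:
        # The first category (alphabetically) becomes the implicit
        # reference: a record in that category has all dummies equal to 0.
        categories = categories[1:]
    coded = []
    for row in rows:
        # Copy every attribute except the categorical one being replaced.
        new_row = {k: v for k, v in row.items() if k != attribute}
        for category in categories:
            new_row[f"{attribute}_{category}"] = int(row[attribute] == category)
        coded.append(new_row)
    return coded


full = dummy_code(records, "Fuel_Type")                          # N = 3 dummies
reduced = dummy_code(records, "Fuel_Type", drop_reference=True)  # N - 1 = 2 dummies

print(full[0])     # the Diesel record: exactly one dummy is 1
print(reduced[2])  # the CNG record: both remaining dummies are 0
```

Reading one record in words: the first toy record is a diesel car, so its Fuel_Type_Diesel value is 1 and the other fuel-type dummies are 0. In the reduced coding, the CNG record has 0 in both remaining dummies, which is how the dropped category is still recoverable.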
Step by Step Answer:
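What follows is not the textbook's solution. It is a rough Python sketch of the kind of check behind parts a and f, using invented toy values rather than the real file. The attribute names, the planted extreme value, and the "more than 5 times the median" screening rule are all assumptions made for illustration; in the exercise itself the outlier is meant to be spotted from the data summary, and the correlation matrix and heatmap are meant to be produced in RapidMiner.

```python
from math import sqrt

# Toy values, invented for illustration; the last CC entry mimics a typo.
data = {
    "Age": [23, 30, 41, 52, 60],
    "KM": [46000, 61000, 72000, 90000, 110000],
    "CC": [1600, 1600, 1400, 1600, 16000],
}


def summarize(values):
    """Return (min, median, max) so that extreme entries stand out."""
    ordered = sorted(values)
    mid = len(ordered) // 2
    if len(ordered) % 2:
        median = ordered[mid]
    else:
        median = (ordered[mid - 1] + ordered[mid]) / 2
    return ordered[0], median, ordered[-1]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Part a (sketch): inspect the summary, then drop the suspicious record.
for name, values in data.items():
    print(name, summarize(values))

# Assumed screening rule: keep records whose CC is at most 5x the median.
_, cc_median, _ = summarize(data["CC"])
keep = [i for i, cc in enumerate(data["CC"]) if cc <= 5 * cc_median]
filtered = {name: [values[i] for i in keep] for name, values in data.items()}

# Part f (sketch): pairwise correlations as a nested dict; no heatmap drawn.
corr = {a: {b: pearson(filtered[a], filtered[b]) for b in filtered}
        for a in filtered}
for name, row in corr.items():
    print(name, {other: round(r, 2) for other, r in row.items()})
```

In this toy data the maximum of CC is ten times its median, which is the sort of gap a data summary makes visible at a glance. The correlations are computed only after the flagged record is removed, since a single extreme value can distort them.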
Machine Learning For Business Analytics
ISBN: 9781119828792
1st Edition
Authors: Galit Shmueli, Peter C. Bruce, Amit V. Deokar, Nitin R. Patel