Problem Statement: The 'Hair Salon.csv' dataset contains various variables used for the
context of Market Segmentation. This particular case study is based on various parameters of a
salon chain of hair products. You are expected to do Principal Component Analysis for this case
study according to the instructions given in the following rubric.
Note: This particular dataset contains the target variable satisfaction as well. Please do drop this
variable before doing Principal Component Analysis.
Questions:
1) Perform Exploratory Data Analysis [both univariate and multivariate analysis to be
performed]. The inferences drawn from this should be properly documented. - 5 points
2) Scale the variables and justify the type of scaling function chosen for this case
study. - 3 points
3) Comment on the comparison between covariance and the correlation matrix after scaling. - 2
points
4) Check the dataset for outliers before and after scaling. Draw your inferences from this
exercise. - 3 points
5) Build the covariance matrix, eigenvalues, and eigenvectors. - 4 points
6) Write the explicit form of the first PC (in terms of eigenvectors). - 5 points
7) Discuss the cumulative values of the eigenvalues. How does it help you to decide on the
optimum number of principal components? What do the eigenvectors indicate? Perform PCA
and export the data of the Principal Component scores into a data frame. - 10 points
8) Mention the business implication of using the Principal Component Analysis for this case
study. - 5 points
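
The computational steps in questions 2, 3, 5, 6 and 7 can be sketched as follows. This is a minimal illustration under stated assumptions, not a model answer: it runs on synthetic placeholder data rather than the salon data, the variable names x1..x4 are hypothetical, and the 80% cumulative-variance cutoff is only one common convention for choosing the number of components. One fact it demonstrates directly is that after z-score scaling the covariance matrix coincides with the correlation matrix.

```python
import numpy as np

def pca_from_covariance(X):
    """Z-score the columns of X, then eigendecompose the covariance matrix."""
    X = np.asarray(X, dtype=float)
    # Z-score scaling: zero mean and unit sample standard deviation per column.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # Covariance of the scaled data (np.cov uses ddof=1 by default).
    cov = np.cov(Z, rowvar=False)
    # eigh is meant for symmetric matrices and returns eigenvalues in ascending order.
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]          # re-sort to descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Cumulative share of total variance explained by the leading components.
    cum_ratio = np.cumsum(eigvals) / eigvals.sum()
    # Principal component scores: scaled data projected onto the eigenvectors.
    scores = Z @ eigvecs
    return Z, cov, eigvals, eigvecs, cum_ratio, scores

# Synthetic placeholder data (NOT the salon data): 100 rows, 4 columns.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(100, 4))
X_demo[:, 1] += 0.8 * X_demo[:, 0]             # make two columns correlated

Z, cov, eigvals, eigvecs, cum_ratio, scores = pca_from_covariance(X_demo)

# Explicit form of the first PC as a linear combination of the scaled variables.
names = ["x1", "x2", "x3", "x4"]               # hypothetical variable names
pc1 = " + ".join(f"({w:.3f})*{n}" for w, n in zip(eigvecs[:, 0], names))
print("PC1 =", pc1)

# Smallest number of components whose cumulative share reaches 80% (assumed cutoff).
n_keep = int(np.searchsorted(cum_ratio, 0.80) + 1)
print("cumulative explained variance:", np.round(cum_ratio, 3))
print("components kept:", n_keep)
```

For the export step in question 7, the scores array could be wrapped with pandas, for example pd.DataFrame(scores[:, :n_keep]) with column labels of your choosing. The sign of each eigenvector is arbitrary, so a first PC with all coefficients negated describes the same component.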
The data file Hair Salon.csv contains 12 variables used for Market Segmentation in the context of Product Service Management.

Variable        Expansion
ProdQual        Product Quality
Ecom            E-Commerce
TechSup         Technical Support
CompRes         Complaint Resolution
Advertising     Advertising
ProdLine        Product Line
SalesFImage     Salesforce Image
ComPricing      Competitive Pricing
WartyClaim      Warranty & Claims
OrdBilling      Order & Billing
DelSpeed        Delivery Speed
Satisfaction    Customer Satisfaction

[Data preview: rows of Hair Salon.csv with columns ID, ProdQual, Ecom, TechSup, CompRes, Advertising, ProdLine, SalesFImage, ComPricing, WartyClaim, OrdBilling, DelSpeed, Satisfaction]
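
The note about dropping the target variable, together with question 4, can be illustrated with a short pandas sketch. It is an illustration under assumptions rather than a prescribed method: the helper names are hypothetical, the column labels ID and Satisfaction are assumed to match the CSV header exactly, the 1.5 x IQR rule is one common outlier convention, and the demo frame holds made-up values, not rows from Hair Salon.csv.

```python
import pandas as pd

def prepare_features(df, drop_cols=("ID", "Satisfaction")):
    """Drop the identifier and the target so only predictors reach PCA."""
    return df.drop(columns=[c for c in drop_cols if c in df.columns])

def zscore(df):
    """Scale every column to zero mean and unit sample standard deviation."""
    return (df - df.mean()) / df.std(ddof=1)

def iqr_outlier_counts(df, k=1.5):
    """Count values per column lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1 = df.quantile(0.25)
    q3 = df.quantile(0.75)
    iqr = q3 - q1
    mask = (df < q1 - k * iqr) | (df > q3 + k * iqr)
    return mask.sum()

# Made-up demo values, NOT taken from the data file.
# With the real file one would start from pd.read_csv("Hair Salon.csv"),
# assuming it sits in the working directory.
demo = pd.DataFrame({
    "ID": [1, 2, 3, 4, 5, 6],
    "ProdQual": [5.0, 5.5, 6.0, 6.5, 7.0, 7.5],
    "Ecom": [1.0, 2.0, 3.0, 4.0, 5.0, 50.0],   # one deliberate extreme value
    "Satisfaction": [6.0, 7.0, 8.0, 6.5, 7.5, 9.0],
})

features = prepare_features(demo)
scaled = zscore(features)

before = iqr_outlier_counts(features)
after = iqr_outlier_counts(scaled)
print("outliers before scaling:", before.to_dict())
print("outliers after scaling: ", after.to_dict())
```

Z-score scaling shifts and rescales each column without changing the ordering or relative spacing of its values, so the IQR rule flags the same observations before and after scaling. Only the axis units of a boxplot change.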