Question
Consider the attached dataset on 1302 American colleges and universities offering an undergraduate program and answer the following questions by applying the Python code bits used in class. Feel free (in fact, you are encouraged) to consult me for any guidance.
You must demonstrate how you answered each question, in the order asked below, by submitting your Jupyter notebook file along with a Word file.
How many variables are there in the dataset?
Which variables are categorical, which are numerical?
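One possible way to approach the first two questions with pandas is sketched below. The small DataFrame and its column names are hypothetical stand-ins, since the attached dataset is not reproduced here; in practice `df` would come from reading the provided file.

```python
import pandas as pd

# Hypothetical stand-in for the attached dataset; column names are assumptions.
df = pd.DataFrame({
    "College Name": ["Alpha College", "Beta University", "Gamma Institute"],
    "State": ["AK", "AL", "AZ"],
    "Tuition": [7560.0, 1742.0, 5520.0],
    "Enrolled": [146, 1339, 227],
})

# The number of columns is the number of variables.
n_variables = df.shape[1]

# First guess at variable types: numeric dtypes vs. everything else.
numerical_cols = df.select_dtypes(include="number").columns.tolist()
categorical_cols = df.select_dtypes(exclude="number").columns.tolist()

print(n_variables)
print(numerical_cols)
print(categorical_cols)
```

The dtype split is only a starting point: a category stored as integer codes would be reported as numerical and needs to be reclassified by hand.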
Clean the dataset by removing all missing/incomplete observations. How many complete observations are left?
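A minimal sketch of dropping incomplete rows, assuming missing values are stored as NaN. The data and column names are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical data with some missing entries (column names are assumptions).
df = pd.DataFrame({
    "College Name": ["Alpha College", "Beta University",
                     "Gamma Institute", "Delta College"],
    "Tuition": [7560.0, np.nan, 5520.0, 8400.0],
    "Enrolled": [146.0, 1339.0, np.nan, 227.0],
})

# Keep only the rows that have no missing value in any column.
df_clean = df.dropna(axis=0, how="any")

# The number of remaining rows is the number of complete observations.
n_complete = len(df_clean)
print(n_complete)
```

If the file encodes missing values with a placeholder such as a blank string or a special code, those would first have to be converted to NaN for `dropna` to catch them.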
Set the row names (indices) to the College Name column.
Clean up the variable names by
a. Replacing the following characters and spaces with an empty string (for removal): # $
b. Replacing the following characters with :
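The indexing and renaming steps might look like the sketch below. The column names are hypothetical. Part (a) removes `#`, `$` and whitespace, which are the only characters that survive in the question text. The character list and replacement for part (b) are not legible in the question, so the `/` and `-` to `_` mapping used here is purely an assumption to be swapped for the real one.

```python
import pandas as pd

# Hypothetical columns that mimic messy variable names.
df = pd.DataFrame({
    "College Name": ["Alpha College", "Beta University"],
    "# Applications": [193, 1852],
    "Tuition $": [7560, 1742],
    "Full-time/Part-time": [1.2, 3.4],
})

# Use the college name as the row index.
df = df.set_index("College Name")

# (a) Remove '#', '$' and whitespace from the column names.
cleaned = df.columns.str.replace(r"[#$\s]", "", regex=True)

# (b) ASSUMPTION: replace '/' and '-' with '_'.
#     The actual characters and replacement are missing from the question.
cleaned = cleaned.str.replace(r"[/\-]", "_", regex=True)

df.columns = cleaned
print(df.columns.tolist())
```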
Compute the summary statistics of the numerical variables in the dataset.
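Summary statistics can be obtained with `DataFrame.describe`, which by default reports only the numeric columns. The data below is a made-up example.

```python
import pandas as pd

# Hypothetical data: one categorical and two numerical variables.
df = pd.DataFrame({
    "State": ["AK", "AL", "AZ", "CA"],
    "Tuition": [1000.0, 2000.0, 3000.0, 4000.0],
    "Enrolled": [10, 20, 30, 40],
})

# count, mean, std, min, quartiles and max for each numeric column.
summary = df.describe()
print(summary)
```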
Plot a histogram for each of the numerical variables, setting the axis labels in plain English so that they are easy to understand.
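One way to draw labeled histograms with matplotlib is sketched here. The random data, the column names and the plain-English labels are all illustrative assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; not needed inside a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical numerical variables generated at random.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "tuition": rng.normal(9000, 2500, 200),
    "enrolled": rng.integers(100, 5000, 200),
})

# Hypothetical mapping from column names to plain-English axis labels.
labels = {
    "tuition": "Annual tuition (US dollars)",
    "enrolled": "New students enrolled",
}

# One histogram per numerical variable, side by side.
fig, axes = plt.subplots(1, len(labels), figsize=(10, 4))
for ax, (col, label) in zip(axes, labels.items()):
    ax.hist(df[col], bins=20, edgecolor="black")
    ax.set_xlabel(label)
    ax.set_ylabel("Number of colleges")
fig.tight_layout()
```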
Construct a heatmap between all numerical variables and comment on the relationships among them.
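A correlation heatmap could be built as follows. This sketch uses plain matplotlib on the correlation matrix; the code used in class may rely on a different plotting library. The data is synthetic, with `accepted` deliberately constructed to correlate with `applications`.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; not needed inside a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical numerical variables; 'accepted' is built from 'applications'.
rng = np.random.default_rng(1)
applications = rng.normal(3000, 800, 100)
df = pd.DataFrame({
    "applications": applications,
    "accepted": applications * 0.7 + rng.normal(0, 100, 100),
    "tuition": rng.normal(9000, 2000, 100),
})

# Pairwise Pearson correlations between all numerical variables.
corr = df.corr()

# Draw the matrix as a color grid on a fixed [-1, 1] scale.
fig, ax = plt.subplots(figsize=(5, 4))
im = ax.imshow(corr.to_numpy(), cmap="coolwarm", vmin=-1, vmax=1)
ax.set_xticks(range(len(corr.columns)))
ax.set_xticklabels(corr.columns, rotation=45, ha="right")
ax.set_yticks(range(len(corr.index)))
ax.set_yticklabels(corr.index)
fig.colorbar(im, ax=ax, label="Pearson correlation")
fig.tight_layout()
```

Cells near +1 or -1 indicate strongly related pairs, while cells near 0 indicate little linear relationship.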
By observing the heatmap, select three numerical variables that you think would be interesting to include and draw a matrix scatter diagram between the three.
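A matrix scatter diagram for three chosen variables can be drawn with `pandas.plotting.scatter_matrix`. The three variables selected below are hypothetical; the actual choice should follow from the heatmap.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; not needed inside a notebook
import numpy as np
import pandas as pd
from pandas.plotting import scatter_matrix

# Hypothetical numerical variables generated at random.
rng = np.random.default_rng(2)
df = pd.DataFrame({
    "applications": rng.normal(3000, 800, 100),
    "tuition": rng.normal(9000, 2000, 100),
    "graduation_rate": rng.normal(60, 15, 100),
    "enrolled": rng.normal(800, 200, 100),
})

# Hypothetical selection of three variables of interest.
selected = ["applications", "tuition", "graduation_rate"]

# Grid of pairwise scatter plots with histograms on the diagonal.
axes = scatter_matrix(df[selected], figsize=(7, 7), diagonal="hist")
```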
Convert the categorical variables into integer binary dummy variables.
a. Explain in words, for one observation, the values in the derived binary dummies.
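The dummy conversion might be done with `pandas.get_dummies`, as in this sketch. The categorical columns `State` and `Control` and their values are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical categorical variables, indexed by college name.
df = pd.DataFrame(
    {
        "State": ["AK", "AL", "AK"],
        "Control": ["Public", "Private", "Private"],
    },
    index=["Alpha College", "Beta University", "Gamma Institute"],
)

# One 0/1 integer column per category level.
dummies = pd.get_dummies(df, columns=["State", "Control"], dtype=int)

# Inspect the dummies for a single observation.
print(dummies.loc["Alpha College"])
```

For the made-up Alpha College row, `State_AK` and `Control_Public` are 1 because the college is in AK and is public, and the other dummies are 0. Each original category contributes exactly one 1 per observation.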
Conduct a principal components analysis (PCA) using only the original numerical variables.
a. Make sure to display the 'Standard Deviation', 'Proportion of Variance' and 'Cumulative Proportion' info.
b. Comment on the results: How many principal components appear to be significant? Should the data be normalized beforehand?
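A PCA summary table with the three requested rows could be assembled from scikit-learn's `PCA` as sketched below. The synthetic variables are assumptions, chosen to have very different scales.

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA

# Hypothetical numerical variables on very different scales.
rng = np.random.default_rng(3)
df = pd.DataFrame({
    "tuition": rng.normal(9000, 2500, 150),
    "pct_phd_faculty": rng.normal(70, 15, 150),
    "student_faculty_ratio": rng.normal(14, 4, 150),
})

# Fit PCA on the raw (unscaled) data, keeping all components.
pca = PCA()
pca.fit(df)

# Standard deviation is the square root of each component's variance.
summary = pd.DataFrame(
    {
        "Standard Deviation": np.sqrt(pca.explained_variance_),
        "Proportion of Variance": pca.explained_variance_ratio_,
        "Cumulative Proportion": np.cumsum(pca.explained_variance_ratio_),
    },
    index=[f"PC{i + 1}" for i in range(pca.n_components_)],
).T
print(summary.round(4))
```

On unscaled data the variable with the largest variance dominates the first component, which is the main reason to consider normalizing before PCA.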
Normalize the numerical variables using the standard scaler and redo the preceding PCA question. Comment on the difference in the PCA results after normalization.
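The comparison before and after normalization can be sketched with `StandardScaler`. The helper `pca_summary` and the synthetic data are hypothetical, introduced only for this illustration.

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler


def pca_summary(values):
    """Hypothetical helper: fit PCA and tabulate the variance breakdown."""
    pca = PCA().fit(values)
    return pd.DataFrame(
        {
            "Standard Deviation": np.sqrt(pca.explained_variance_),
            "Proportion of Variance": pca.explained_variance_ratio_,
            "Cumulative Proportion": np.cumsum(pca.explained_variance_ratio_),
        },
        index=[f"PC{i + 1}" for i in range(pca.n_components_)],
    ).T


# Hypothetical, independently generated variables on very different scales.
rng = np.random.default_rng(3)
df = pd.DataFrame({
    "tuition": rng.normal(9000, 2500, 150),
    "pct_phd_faculty": rng.normal(70, 15, 150),
    "student_faculty_ratio": rng.normal(14, 4, 150),
})

# Standardize each column to zero mean and unit variance.
scaled = StandardScaler().fit_transform(df)

raw_summary = pca_summary(df.to_numpy())
scaled_summary = pca_summary(scaled)

print(raw_summary.round(4))
print(scaled_summary.round(4))
```

In this synthetic case the raw PCA is driven almost entirely by `tuition`, while after scaling the variance is spread far more evenly across components because the variables were generated independently. With the real dataset, the scaled result would instead reflect the actual correlations among the variables.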