Question
Problem 1 (explore the data):
This exercise relates to the College data set, which can be found in the file College.csv (http://www-bcf.usc.edu/~gareth/ISL/data.html). It contains a number of variables for 777 different universities and colleges in the US. Before reading the data into R, it can be viewed in Excel or a text editor.
a) Use the read.csv() function to read the data into R. Call the loaded data college. Make sure that the working directory is set to the location of the data file. The first column of the data stores the college names, so it needs to be removed. Try:
> college=college[,-1]
> fix(college)
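Part a) could be sketched as follows. This is only one way to do it: load_college is a hypothetical helper name, and the code assumes College.csv sits in the current working directory (check with getwd(), change with setwd()). Keeping the names as row labels before dropping the column is an optional extra step, not something the exercise text above requires.

```r
# Hypothetical helper: read the file, keep the college names as row labels,
# then drop the first column so only the variables remain.
load_college <- function(path = "College.csv") {
  college <- read.csv(path, stringsAsFactors = TRUE)
  rownames(college) <- as.character(college[, 1])  # names become row labels
  college <- college[, -1]                         # remove the name column
  college
}

# Assumption: the file is in the working directory; skip quietly otherwise.
if (file.exists("College.csv")) {
  college <- load_college()
  print(dim(college))   # number of rows and columns
  print(head(college))  # non-interactive alternative to fix(college)
}
```

fix(college) opens a spreadsheet-like editor and only works in an interactive session, which is why the sketch prints the first rows with head() instead.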
b)
Use the summary() function to produce a numerical summary of the variables in the data set.
Use the pairs() function to produce a scatterplot matrix of the first ten columns or variables of the data. Recall that you can reference the first ten columns of a matrix A using A[,1:10].
What is the range of each quantitative variable? You can answer this using the range() function.
What are the mean and standard deviation of each quantitative variable?
Use the hist() function to produce some histograms with differing numbers of bins for a few of the quantitative variables. You may find the command par(mfrow=c(2,2)) useful: it will divide the print window into four regions so that four plots can be made simultaneously. Modifying the arguments to this function will divide the screen in other ways.
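The tasks in part b) could be approached along these lines. This is a sketch, not a model answer: quant_summary is a hypothetical helper name, "quantitative" is taken to mean every numeric column, and the data frame college is assumed to exist already from part a). The bin counts 10 and 30 are arbitrary choices for illustration.

```r
# Hypothetical helper: range (min, max), mean and standard deviation
# for every numeric column of a data frame.
quant_summary <- function(df) {
  num <- df[, sapply(df, is.numeric), drop = FALSE]
  data.frame(
    min  = sapply(num, min,  na.rm = TRUE),
    max  = sapply(num, max,  na.rm = TRUE),
    mean = sapply(num, mean, na.rm = TRUE),
    sd   = sapply(num, sd,   na.rm = TRUE)
  )
}

# Assumption: `college` was created in part a); otherwise nothing is run.
if (exists("college") && is.data.frame(college)) {
  print(summary(college))        # numerical summary of every variable
  pairs(college[, 1:10])         # scatterplot matrix of the first ten columns
  print(quant_summary(college))  # range, mean and sd in one table

  # Histograms: two numeric variables, each with two bin counts.
  num_cols <- names(college)[sapply(college, is.numeric)]
  old_par <- par(mfrow = c(2, 2))  # 2 x 2 grid of plots
  for (v in head(num_cols, 2)) {
    for (b in c(10, 30)) {
      # `breaks` is a suggestion; R may adjust it to get round cut points.
      hist(college[[v]], breaks = b, xlab = v,
           main = paste(v, "with about", b, "bins"))
    }
  }
  par(old_par)                     # restore the previous layout
}
```

Calling range() on one column at a time, for example range(college[, 2]), gives the same minimum and maximum as the table; the helper only collects the results for all numeric columns at once.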
To save a graphic as a pdf file, open the pdf device, issue the plotting commands, and then close the device:
> pdf("file.pdf")
> dev.off()
Instead of pdf() you may use jpeg() or png().
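A minimal sketch of saving a plot to a file is shown below. The file names and the temporary output directory are assumptions for illustration, and plot(1:10) is only a placeholder for whatever plot is to be saved. The png step is skipped when the R installation has no png support.

```r
# Assumption: output goes to R's temporary directory; use any path you like.
out_pdf <- file.path(tempdir(), "example_plot.pdf")

pdf(out_pdf)                 # open the pdf device: plots now go to the file
plot(1:10, main = "Placeholder plot")
invisible(dev.off())         # close the device so the file is written

# Same pattern with png(), guarded because not every build supports it.
if (capabilities("png")) {
  out_png <- file.path(tempdir(), "example_plot.png")
  png(out_png, width = 800, height = 600)  # size in pixels
  plot(1:10, main = "Placeholder plot")
  invisible(dev.off())
}
```

Nothing appears on screen while a file device is open, and the file is not complete until dev.off() has been called.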