Question

Please write a function that takes input of an institution of higher learning and a vector of variables and outputs a multipanel plot showing histograms of those variables with the position of the specified institution marked in the histogram. The plots should be based on the 2015-16 College Scorecard data.
Choose a list of four quantitative variables that you think would be interesting to plot. It should show plots for all four variables by default, but allow a user to select a smaller subset of the variables. The inputs of your function should also include two variables that specify whether the histograms should include all institutions or only include institutions that match the target institution in the type of control (public, private-not-for-profit, or for-profit) and level (4-year, 2-year, less-than-2-year).
Note that some variables are not defined for all types of institutions. For example, completion rate has separate variables for four-year and less-than-four-year institutions. Thus, the histogram for this variable always includes only the same type of institutions as the target institution. Your plots should have descriptive labels and titles. The function should give a descriptive error message if the inputs are not correct (e.g. the institution name or the variables are not known).
I have attempted this in R below, but I believe the code still has some issues. The histograms do not show the frequencies well, and the marker for the institution selected at the end ('University of Alabama') is not visible, as the question requires. Please help me resolve these issues and make sure my R code answers the question completely. After that, please replicate the same function in Python and provide the code.
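Since a Python version is also requested, the input checks and the peer-group filter described above could be sketched roughly as follows. This is only a minimal illustration on made-up rows: `select_rows`, `same_control`, and `same_level` are hypothetical names, and `CONTROL` and `ICLEVEL` are assumed to be the Scorecard columns for type of control and level.

```python
# Hypothetical helper; CONTROL and ICLEVEL are assumed Scorecard column names.
DEFAULT_VARS = ("TUITIONFEE_IN", "TUITIONFEE_OUT", "AVGFACSAL", "FAMINC")

def select_rows(rows, institution, variables=DEFAULT_VARS,
                same_control=False, same_level=False):
    """Return (target_row, peer_rows); raise ValueError with a descriptive message."""
    unknown = [v for v in variables if v not in DEFAULT_VARS]
    if unknown:
        raise ValueError(
            f"Unknown variable(s): {', '.join(unknown)}. "
            f"Choose from: {', '.join(DEFAULT_VARS)}."
        )
    matches = [r for r in rows if r["INSTNM"] == institution]
    if not matches:
        raise ValueError(f"Institution '{institution}' not found in the data.")
    target = matches[0]  # if the name is duplicated, use the first match
    peers = rows
    if same_control:  # keep only institutions with the target's type of control
        peers = [r for r in peers if r["CONTROL"] == target["CONTROL"]]
    if same_level:    # keep only institutions with the target's level
        peers = [r for r in peers if r["ICLEVEL"] == target["ICLEVEL"]]
    return target, peers

# Toy rows standing in for the real CSV (values are made up for illustration).
rows = [
    {"INSTNM": "College A", "CONTROL": "1", "ICLEVEL": "1", "TUITIONFEE_IN": "9000"},
    {"INSTNM": "College B", "CONTROL": "2", "ICLEVEL": "1", "TUITIONFEE_IN": "30000"},
    {"INSTNM": "College C", "CONTROL": "1", "ICLEVEL": "2", "TUITIONFEE_IN": "3000"},
]
target, peers = select_rows(rows, "College A", ["TUITIONFEE_IN"], same_control=True)
print([r["INSTNM"] for r in peers])  # → ['College A', 'College C']
```

Taking the control and level from the target row, rather than asking the user to supply them, guarantees that the target institution is always part of the filtered group.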
#### Function
```{r}
plot_histogram <- function(INSTNM, variables = c("TUITIONFEE_IN", "TUITIONFEE_OUT", "AVGFACSAL", "FAMINC"),
                           include_all = TRUE, control_type = NULL, level = NULL) {
  # Read the dataset, treating the Scorecard's missing-value markers as NA
  collegedata <- read.csv("MERGED2015_16_PP.csv",
                          na.strings = c("", "NA", "NULL", "PrivacySuppressed"))
  # Check that the institution name exists in the dataset
  if (!INSTNM %in% collegedata$INSTNM) {
    stop("Institution name '", INSTNM, "' not found in the dataset.")
  }
  # Check that the requested variables are among the supported ones
  valid_variables <- c("TUITIONFEE_IN", "TUITIONFEE_OUT", "AVGFACSAL", "FAMINC")
  if (!all(variables %in% valid_variables)) {
    stop("Invalid variable(s) specified. Choose from: ", paste(valid_variables, collapse = ", "))
  }
  # Filter data based on control type and level if include_all is FALSE
  if (!include_all) {
    if (is.null(control_type) || is.null(level)) {
      stop("Both control_type and level must be specified when include_all is FALSE.")
    }
    # ICLEVEL is assumed to be the level column (1 = 4-year, 2 = 2-year, 3 = less-than-2-year)
    collegedata <- subset(collegedata, CONTROL == control_type & ICLEVEL == level)
  }
  # Subset data for the specified institution. Inside subset(), INSTNM == INSTNM
  # compares the column with itself, so index with the function argument instead.
  institution_data <- collegedata[which(collegedata$INSTNM == INSTNM), ]
  # Check that the institution is still present after filtering
  if (nrow(institution_data) == 0) {
    stop("Institution not found in the specified control type and level.")
  }
  # One panel per variable; restore the previous settings when the function exits
  old_par <- par(mfrow = c(length(variables), 1), mar = c(4, 4, 2, 1))
  on.exit(par(old_par))
  for (variable in variables) {
    # Check if variable exists in the dataset
    if (!(variable %in% names(collegedata))) {
      warning(paste("Variable", variable, "not found in the dataset. Skipping..."))
      next
    }
    hist_data <- collegedata[[variable]]
    institution_value <- institution_data[[variable]][1]  # first match if the name is duplicated
    # Plot histogram with finer bins and the default y-axis so the bars fill the panel
    hist(hist_data, breaks = 30, main = paste("Histogram of", variable), xlab = variable,
         ylab = "Frequency", col = "lightblue", border = "black")
    # Mark the institution with a contrasting vertical line
    abline(v = institution_value, col = "red", lwd = 2)
  }
}
```
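For the requested Python replication, a minimal sketch of the plotting step might look like the following. It assumes matplotlib is installed and uses only the standard `csv` module for reading; `to_floats`, `plot_histograms`, and `read_scorecard` are hypothetical names, and the missing-value markers mirror the `na.strings` used in the R code. It does not repeat the variable validation or the control/level filter.

```python
import csv

# Same missing-value markers as the na.strings argument in the R code.
NA_STRINGS = {"", "NA", "NULL", "PrivacySuppressed"}

def to_floats(raw_values):
    """Convert CSV strings to floats, dropping missing-value markers."""
    return [float(v) for v in raw_values if v not in NA_STRINGS]

def read_scorecard(path="MERGED2015_16_PP.csv"):
    """Read the CSV into a list of dicts keyed by column name."""
    with open(path, newline="", encoding="utf-8-sig") as f:
        return list(csv.DictReader(f))

def plot_histograms(rows, institution, variables):
    """Draw one histogram panel per variable and mark the institution's value."""
    # Third-party dependency (assumed installed); imported here so the
    # helpers above still work without it.
    import matplotlib.pyplot as plt

    target = next((r for r in rows if r["INSTNM"] == institution), None)
    if target is None:
        raise ValueError(f"Institution '{institution}' not found in the data.")
    fig, axes = plt.subplots(len(variables), 1,
                             figsize=(7, 3 * len(variables)), squeeze=False)
    for ax, var in zip(axes[:, 0], variables):
        ax.hist(to_floats(r[var] for r in rows), bins=30,
                color="lightblue", edgecolor="black")
        mark = to_floats([target[var]])
        if mark:  # the value may be missing for this institution
            ax.axvline(mark[0], color="red", linewidth=2, label=institution)
            ax.legend()
        ax.set(title=f"Distribution of {var}", xlabel=var,
               ylabel="Number of institutions")
    fig.tight_layout()
    return fig

print(to_floats(["9000", "NULL", "PrivacySuppressed", "12.5"]))  # → [9000.0, 12.5]
```

With the data file in the working directory, a call such as `plot_histograms(read_scorecard(), "University of Alabama", ["TUITIONFEE_IN", "FAMINC"])` would return a figure that can be shown or saved with `fig.savefig(...)`, provided the name matches an `INSTNM` entry exactly.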
#### Testing the plot_histogram function
```{r}
# The name must match an INSTNM entry exactly; the Scorecard may list this
# institution as "The University of Alabama".
plot_histogram("University of Alabama", include_all = TRUE)
```
Please do all of this in both R and Python as simply as possible, without any complex code, and provide annotations that help explain the code.