Question
Part 1: Create a two-way table using Gender and Year Enlisted. Use the tapply function to make a quick aggregate calculation:
table(airforcedata$`ID Gender`, airforcedata$`Year Enlisted` )
tapply(airforcedata$`Average Wage`, airforcedata$Gender, mean)
tapply(airforcedata$`Average Wage`, airforcedata$Gender, median)
tapply(airforcedata$`Average Wage`, airforcedata$Gender, sd)
Copy and paste the table into your post, then comment on and discuss the values.
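As a rough illustration of what the table and tapply calls above return, here is a minimal sketch on a small made-up data frame. The name `toy` and all of its values are hypothetical. The column names follow the tapply calls; note that the table call in the question uses `ID Gender`, so it is worth checking the real names with names(airforcedata).

```r
# Hypothetical toy data standing in for airforcedata (all values are made up).
toy <- data.frame(
  Gender = c("F", "M", "F", "M", "F", "M"),
  `Year Enlisted` = c(2010, 2010, 2011, 2011, 2011, 2010),
  `Average Wage` = c(50000, 52000, 61000, 58000, 47000, 66000),
  check.names = FALSE  # keep the spaces in the column names
)

# Two-way table: rows are gender, columns are enlistment year.
counts <- table(toy$Gender, toy$`Year Enlisted`)
print(counts)

# Grouped summaries of wage by gender.
wage_mean <- tapply(toy$`Average Wage`, toy$Gender, mean)
wage_median <- tapply(toy$`Average Wage`, toy$Gender, median)
wage_sd <- tapply(toy$`Average Wage`, toy$Gender, sd)
print(wage_mean)
print(wage_median)
print(wage_sd)
```

Each tapply call returns one value per gender, so the three results can be read side by side when discussing the groups.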
You need to get a random sample. To draw a random sample of n measurements from a vector x:
Sample <- sample(x, n, ...)
Then, to draw a random sample of n rows from a table (matrix or data frame) X:
Sample <- X[sample(1:nrow(X), n), ]
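The two sampling patterns above can be tried on made-up data as in the sketch below. The vector `x`, the data frame `X`, and the seed value are all hypothetical choices for illustration.

```r
set.seed(42)  # arbitrary seed so the draw is reproducible

# Hypothetical vector and data frame (made-up values).
x <- c(41000, 52000, 48500, 63000, 57250, 70100, 45900, 66000)
X <- data.frame(id = 1:8, wage = x)

n <- 3
vec_sample <- sample(x, n)             # n values drawn without replacement
row_sample <- X[sample(nrow(X), n), ]  # n whole rows drawn without replacement

print(vec_sample)
print(row_sample)
```

Here sample(nrow(X), n) draws from 1:nrow(X), the same as the sample(1:nrow(X), n) form used above. By default sample() draws without replacement; pass replace = TRUE if repeated picks should be allowed.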
Part 2a: Next, you will calculate probabilities. What is the probability that a randomly selected wage from the population (i.e., the original data) is higher than $65,000? Use 4 decimal places. Then interpret your results. Here is some code that can help you with 2a), but you need to figure out 2b) on your own.
round(length(airforcedata$`Average Wage`[airforcedata$`Average Wage` > 65000]) / length(airforcedata$`Average Wage`), 4)
Part 2b: What percentage of the population wages (i.e., the original data) are between $70,000 and $80,000 (exclusive)? Display the result as a percentage and interpret your results.
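One way to think about Part 2: the counting expression in 2a) divides the number of values that meet a condition by the total number of values, which is the same as taking the mean of a logical vector (TRUE counts as 1, FALSE as 0). A minimal sketch with made-up numbers follows; `wages` is a hypothetical stand-in for the wage column, and the interval is treated as strict on both ends.

```r
# Hypothetical wages (made-up values).
wages <- c(48000, 66000, 71000, 75500, 80000, 70000, 79999, 62000, 90000, 73000)

# Proportion above a threshold: same result as length(x[x > t]) / length(x).
p_above <- round(mean(wages > 65000), 4)

# Proportion strictly inside an interval: combine two comparisons with &.
p_between <- mean(wages > 70000 & wages < 80000)

# Display the proportion as a percentage string.
pct_between <- paste0(round(100 * p_between, 2), "%")

print(p_above)
print(pct_between)
```

In this toy vector the values exactly equal to 70000 and 80000 are not counted, because the comparisons use > and < rather than >= and <=.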
Part 3a: Randomly select 1000 samples, each of size 30, from the population 'Average Wage'. Then calculate the mean and the variance in each of the 1000 samples. Display a density histogram for the sample means and one for the sample variances. Comment on your distribution, graph and numbers. Summarize what you see. Here is some code that will help with 3a).
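N <- 1000
# Initialize: one column per sample, plus preallocated result vectors
Y <- matrix(nrow=30, ncol=N)
Means <- numeric(N)
Variances <- numeric(N)
for (i in 1:N) {
  # Draw a sample of size 30 without replacement and store its statistics
  Y[,i] <- sample(airforcedata$`Average Wage`, 30)
  Means[i] <- mean(Y[,i])
  Variances[i] <- var(Y[,i])
}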
hist(Means, freq=F, main ='Distribution of the Sample Means')
lines(density(Means), lwd=1, col='red')
hist(Variances, freq=F, main ='Distribution of the Sample Variances')
lines(density(Variances), lwd=1, col='red')
Part 3b: What is the probability that a randomly selected sample mean is greater than $50,000? Use 4 decimal places. Compare with the probability that a randomly selected population measurement is higher than $50,000. Interpret your results. Here is some code that will help with 3b), but you will need to figure out 3c) on your own.
round(length(Means[Means > 50000]) / length(Means), 4)
round(length(airforcedata$`Average Wage`[airforcedata$`Average Wage` > 50000]) / length(airforcedata$`Average Wage`), 4)
Part 3c: What is the probability that a randomly selected sample mean is between $45,000 and $55,000? Use 4 decimal places. Compare with the probability that a randomly selected population measurement is between $45,000 and $55,000. Interpret your results.
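The comparison between sample means and individual population values can be tried out on simulated data, as in the sketch below. Everything here is an assumption for illustration: the lognormal `population` is a hypothetical stand-in for the wage column, `prop_between` is a made-up helper name, the seed is arbitrary, and the interval is treated as strict on both ends.

```r
set.seed(1)  # arbitrary seed for reproducibility

# Hypothetical right-skewed population standing in for the wage column.
population <- rlnorm(5000, meanlog = log(50000), sdlog = 0.4)

# 1000 sample means, each from a sample of size 30 drawn without replacement.
sample_means <- replicate(1000, mean(sample(population, 30)))

# Proportion of values strictly between two bounds, rounded to 4 decimals.
prop_between <- function(v, lo, hi) round(mean(v > lo & v < hi), 4)

p_means <- prop_between(sample_means, 45000, 55000)
p_pop <- prop_between(population, 45000, 55000)
print(c(sample_means = p_means, population = p_pop))
```

Sample means vary less than individual values, so an interval near the centre of the distribution tends to capture a larger share of the means than of the raw population values. The same helper could be applied to Means and to the wage column from the real data.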
I'm having the most issues with RStudio. Can you please assist with some of the code? Thank you! I don't really understand Part 2.