Question
Dataset: boston
boston <- read.csv("http://people.bu.edu/kalathur/datasets/bostonCityEarnings.csv",
  colClasses = c("character", "character", "character", "integer", "character"))
I need to create a subset of the above data set containing only the top 5 departments, ranked by the number of employees working in each department. The top 5 departments should be computed using R code. Then, use the %in% operator to create the required subset.
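A minimal sketch of the subsetting step, using a small toy data frame in place of the real file so that it runs offline. The column names Department and Earnings and the department labels are assumptions; substitute the actual names from the CSV.

```r
# Toy stand-in for the boston data (column names and values are assumptions).
boston <- data.frame(
  Department = rep(c("Police", "Fire", "Schools", "Library", "Parks", "Transit", "Health"),
                   times = c(40, 30, 25, 20, 15, 5, 3)),
  Earnings = seq(30000, by = 500, length.out = 138),
  stringsAsFactors = FALSE
)

# Count employees per department and sort from largest to smallest.
dept_counts <- sort(table(boston$Department), decreasing = TRUE)

# Names of the five largest departments.
top5 <- names(dept_counts)[1:5]

# Keep only the rows whose department is one of the top five.
boston_top5 <- boston[boston$Department %in% top5, ]
```

If two departments tie for fifth place, this keeps whichever one sort() happens to place first, so it is worth inspecting dept_counts before relying on the cut-off.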
Use a sample size of 50 for each of the following. Set the starting seed for random numbers to 1234.
a) Show the sample drawn using simple random sampling without replacement. Show the frequencies for the selected departments. Show the percentages of these with respect to sample size.
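One way to approach part a) in base R is sketched below on toy data (the data frame pop and its columns are assumptions standing in for the top-5 subset). Courses that use the sampling package would typically call srswor() instead; sample() without replacement draws the same kind of sample.

```r
set.seed(1234)

# Toy population standing in for the top-5 subset (an assumption).
pop <- data.frame(
  Department = rep(c("A", "B", "C", "D", "E"), times = c(40, 30, 25, 20, 15)),
  Earnings = seq(30000, by = 500, length.out = 130)
)
n <- 50

# Simple random sampling without replacement: every row is equally likely.
idx <- sample(nrow(pop), n)   # replace = FALSE is the default
srs <- pop[idx, ]

# Frequencies per department and percentages relative to the sample size.
freq <- table(srs$Department)
pct <- 100 * prop.table(freq)
```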
b) Show the sample drawn using systematic sampling. Show the frequencies for the selected departments. Show the percentages of these with respect to sample size.
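A sketch of part b) on toy data follows (pop and its columns are assumptions). It uses a step of floor(N / n) so that every selected index stays inside the data; the trade-off is that rows beyond n * k are never reached when N is not a multiple of n. Other conventions, such as a ceiling-based step, exist and may be what your course expects.

```r
set.seed(1234)

# Toy population standing in for the top-5 subset (an assumption).
pop <- data.frame(
  Department = rep(c("A", "B", "C", "D", "E"), times = c(40, 30, 25, 20, 15)),
  Earnings = seq(30000, by = 500, length.out = 130)
)
N <- nrow(pop)
n <- 50

k <- floor(N / n)        # sampling interval
r <- sample(k, 1)        # random start in 1..k
idx <- seq(r, by = k, length.out = n)   # every k-th row from the start

sys <- pop[idx, ]
freq <- table(sys$Department)
pct <- 100 * prop.table(freq)
```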
c) Calculate the inclusion probabilities using the Earnings variable. Using these values, show the sample drawn using systematic sampling with unequal probabilities. Show the frequencies for the selected departments. Show the percentages of these with respect to sample size.
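The mechanics of part c) can be sketched in base R as below, on toy data (pop and Earnings values are assumptions). The inclusion probability of row i is taken as n * Earnings_i / sum(Earnings), and the sketch assumes none of these exceeds 1. The sampling package offers inclusionprobabilities() and UPsystematic() for this task and also handles values that would exceed 1, which this sketch does not.

```r
set.seed(1234)

# Toy population (an assumption); Earnings must be positive.
pop <- data.frame(Earnings = seq(30000, by = 500, length.out = 130))
N <- nrow(pop)
n <- 50

# Inclusion probabilities proportional to Earnings; they sum to n.
pik <- n * pop$Earnings / sum(pop$Earnings)
stopifnot(all(pik <= 1))   # this sketch does not handle capped probabilities

# Systematic selection: lay the pik end to end on [0, n] and pick the rows
# whose segment contains one of the points u, u + 1, ..., u + n - 1.
u <- runif(1)
upper <- cumsum(pik)
lower <- c(0, upper[-N])
hits <- floor(upper - u) - floor(lower - u)   # 1 if a point falls in the segment

ups <- pop[hits == 1, , drop = FALSE]
```

Because rows with higher Earnings occupy longer segments, they are more likely to be selected, which matters when interpreting the sample mean in part e).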
d) Order the data using the Department variable. Draw a stratified sample using proportional sizes based on the Department variable. Show the frequencies for the selected departments. Show the percentages of these with respect to sample size.
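A sketch of part d) on toy data (pop and its columns are assumptions). Proportional sizes rarely come out as whole numbers, so this version rounds down and then hands the leftover units to the strata with the largest fractional parts; that rounding rule is a choice made here, not something the question prescribes. The sampling package's strata() function is a common alternative for the draw itself.

```r
set.seed(1234)

# Toy population standing in for the top-5 subset (an assumption).
pop <- data.frame(
  Department = rep(c("A", "B", "C", "D", "E"), times = c(40, 30, 25, 20, 15)),
  Earnings = seq(30000, by = 500, length.out = 130),
  stringsAsFactors = FALSE
)
n <- 50

# Order the data by the stratification variable.
pop <- pop[order(pop$Department), ]
counts <- table(pop$Department)

# Proportional allocation with largest-remainder rounding so sizes sum to n.
exact <- n * as.vector(counts) / nrow(pop)
sizes <- floor(exact)
short <- n - sum(sizes)
if (short > 0) {
  bump <- order(exact - sizes, decreasing = TRUE)[seq_len(short)]
  sizes[bump] <- sizes[bump] + 1
}
names(sizes) <- names(counts)

# Simple random sample without replacement inside each department.
idx <- unlist(lapply(names(sizes), function(d) {
  rows <- which(pop$Department == d)
  rows[sample.int(length(rows), sizes[[d]])]
}))
strat <- pop[idx, ]

freq <- table(strat$Department)
pct <- 100 * prop.table(freq)
```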
e) Compare the mean of the Earnings variable in each of these four samples against the mean for the full data.
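The comparison in part e) could be organised as a small table, sketched here on toy data. The list samples holds row-index vectors; only two stand-in samples are drawn to keep the example short, and the names pop, samples and comparison are hypothetical. The remaining two samples would be added to the list in the same way.

```r
set.seed(1234)

# Toy population (an assumption).
pop <- data.frame(Earnings = seq(30000, by = 500, length.out = 130))
N <- nrow(pop)
n <- 50
k <- floor(N / n)

# Row indices for each sample; stand-ins for the samples from parts a) and b).
samples <- list(
  srswor = sample(N, n),
  systematic = seq(sample(k, 1), by = k, length.out = n)
)

pop_mean <- mean(pop$Earnings)

# One row per sample: its mean Earnings and the gap from the population mean.
comparison <- data.frame(
  sample = names(samples),
  mean = sapply(samples, function(i) mean(pop$Earnings[i])),
  row.names = NULL
)
comparison$diff_from_pop <- comparison$mean - pop_mean
```

When reading the results, keep in mind that a sample drawn with probabilities proportional to Earnings favours high earners, so its unweighted mean tends to sit above the population mean.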