Question
---
title: "Homework 4"
author: "Your name here"
date: 'Assigned: February 8, 2018'
output:
  html_document:
    theme: paper
    highlight: tango
    toc: true
    toc_depth: 3
    fig_width: 5
    fig_height: 5
---

### Homework outline

This homework is designed to give you practice with calculating error bars (confidence intervals) with ddply and using ggplot2 graphics to produce insightful plots of the results.

```{r, message = FALSE}
library(plyr)
library(dplyr)
library(ggplot2)
```

You will continue using the `adult` data set that you first encountered on Homework 3. This data set is loaded below.

```{r}
adult.data <- read.csv("http://archive.ics.uci.edu/ml/machine-learning-databases/adult/adult.data",
                       header=FALSE, fill=FALSE, strip.white=T,
                       col.names=c("age", "type_employer", "fnlwgt", "education",
                                   "education_num","marital", "occupation", "relationship", "race","sex",
                                   "capital_gain", "capital_loss", "hr_per_week","country", "income"))
adult.data <- mutate(adult.data,
                     high.income = as.numeric(income == ">50K"))
```

### Problem 1: Calculating and plotting error bars for a 1-sample t-test

#### (a) Using `ddply` and 1-sample t-testing, construct a table that shows the average `capital_gain` across `education`, along with the lower and upper endpoints of a 95% confidence interval.

Your table should look something like:

```
  education     mean      lower    upper
1      10th 404.5745  91.893307 717.2557
2      11th 215.0979 144.306937 285.8888
3      12th 284.0878 126.824531 441.3510
...
```

```{r}
# Edit me
```

#### (b) Reorder the levels of the factor in your summary table to correspond to ascending order of education.

E.g., Preschool is the lowest, 1st-4th the next lowest, etc. You may find the `factor(..., levels = ...)` command helpful here. For the post-high school grades, you can use the ordering: Assoc-voc, Assoc-acdm, Some-college, Bachelors, Masters, Prof-school, Doctorate.
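For Problem 1, one possible approach is sketched below. It is an illustration under assumptions, not the course's reference solution: `toy` is a small made-up stand-in for `adult.data` so the block runs without a download, and `ci_row`, `gain.summary`, and `edu.order` are names chosen here. The ordering of the levels between 1st-4th and HS-grad is filled in from the levels that appear in the UCI `adult` data; the post-high school ordering is the one given in part (b).

```r
library(plyr)

# Made-up stand-in for adult.data (illustrative values only, not real results)
toy <- data.frame(
  education    = rep(c("Bachelors", "10th", "Masters"), each = 4),
  capital_gain = c(0, 500, 1200, 300,
                   0, 0, 150, 50,
                   2000, 0, 800, 400)
)

# Hypothetical helper: for one education group, return the sample mean and
# the 95% confidence interval endpoints from a 1-sample t-test
ci_row <- function(df) {
  tt <- t.test(df$capital_gain)  # conf.level defaults to 0.95
  data.frame(mean  = mean(df$capital_gain),
             lower = tt$conf.int[1],
             upper = tt$conf.int[2])
}

# Part (a): apply the helper to each education level
gain.summary <- ddply(toy, ~ education, ci_row)

# Part (b): impose an ascending education order on the factor levels
edu.order <- c("Preschool", "1st-4th", "5th-6th", "7th-8th", "9th",
               "10th", "11th", "12th", "HS-grad",
               "Assoc-voc", "Assoc-acdm", "Some-college",
               "Bachelors", "Masters", "Prof-school", "Doctorate")
gain.summary$education <- factor(gain.summary$education, levels = edu.order)

# Sort the rows to match the new level order
gain.summary <- gain.summary[order(gain.summary$education), ]
gain.summary
```

Passing a function to `ddply` instead of `summarize` runs `t.test` once per group and sidesteps the question of whether `summarize` resolves to the `plyr` or `dplyr` version when both packages are loaded. To use this on the real data, replace `toy` with `adult.data`. Note that `t.test` stops with an error if a group has fewer than two observations or all-identical values.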
```{r}
# Edit me
```

### Problem 2: (Continuing from Problem 1)

#### (a) Using your table from Problem 1(b), construct a bar chart showing education on the x-axis and the average capital gains on the y-axis.

Use `geom_errorbar` to overlay error bars as specified by the confidence interval endpoints you computed. You should tilt your x-axis text to limit overlap of x-axis labels. Set an appropriate y-axis label.

```{r}
# Edit me
```

#### (b) What can you conclude about the association between capital gains and education levels? Does there appear to be a statistically significant difference in capital gains across education?

Your answer goes here!
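For Problem 2(a), a minimal plotting sketch follows. It assumes a summary table with columns `education`, `mean`, `lower`, and `upper`, as in the sample output from Problem 1(a). To stay self-contained it uses only the three example rows printed in that sample table; `gain.summary` and `gain.plot` are names chosen here, and the tilt angle and error bar width are arbitrary choices.

```r
library(ggplot2)

# The three example rows shown in the assignment's sample table
gain.summary <- data.frame(
  education = factor(c("10th", "11th", "12th"),
                     levels = c("10th", "11th", "12th")),
  mean  = c(404.5745, 215.0979, 284.0878),
  lower = c(91.893307, 144.306937, 126.824531),
  upper = c(717.2557, 285.8888, 441.3510)
)

gain.plot <- ggplot(gain.summary, aes(x = education, y = mean)) +
  # Bar heights come straight from the precomputed means
  geom_bar(stat = "identity") +
  # Overlay the confidence interval endpoints as error bars
  geom_errorbar(aes(ymin = lower, ymax = upper), width = 0.2) +
  ylab("Average capital gain") +
  # Tilt the x-axis labels to limit overlap
  theme(axis.text.x = element_text(angle = 60, vjust = 1, hjust = 1))

gain.plot
```

Because `education` is a factor, the bars appear in level order, so the reordering from Problem 1(b) carries over to the x-axis. When reading the plot for part (b), keep in mind that two intervals that do not overlap suggest a difference between those groups, while overlapping intervals do not by themselves rule one out.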
Please help with this assignment (Jupyter notebook / RStudio).