### Name of File
Name your assignment file **`BRFSS_Part1`**. This is a Quarto "markdown" file, which has the extension `.qmd`.
### Data Set
These data come from the Centers for Disease Control and Prevention (www.cdc.gov).
To answer these questions you will need to use the codebook on Brightspace, called BRFSS Codebook. For Part 1 of the project, please note that not all of the variables listed in the codebook are included in the csv file to be downloaded from Brightspace.
Download the `brfss.csv` file from Brightspace and place it in the same folder/directory as your script file. Then in RStudio, set your Working Directory to your Source File location: in the menus choose Session > Set Working Directory > To Source File Location. You will most likely see some warnings after the file loads: `read_csv` tries to guess each column's type, but because there are so many rows it won't read enough of them to guess accurately.
You must use the `read_csv` function when loading the csv file. Do not use `read.csv`.
Do not rename the csv file that you download from Brightspace.
Do not edit the csv file.
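As a small illustration of the loading step described above, the sketch below reads a throwaway CSV with `read_csv` and inspects what was guessed. The file and its columns (`id`, `score`) are hypothetical stand-ins created in a temporary location, not the Brightspace download.

```r
library(tidyverse)

# Hypothetical toy file standing in for the real download.
toy_path <- tempfile(fileext = ".csv")
write_csv(tibble(id = 1:3, score = c(10, NA, 30)), toy_path)

# read_csv() returns a tibble; show_col_types = FALSE silences the
# column-type summary that is otherwise printed on load.
toy <- read_csv(toy_path, show_col_types = FALSE)

# spec() reports the column types that read_csv() guessed.
spec(toy)

# problems() lists any values that could not be parsed as the guessed type.
problems(toy)
```

If warnings appear when loading a large file, `problems()` is one way to see which rows and columns triggered them.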
### Preliminaries
```{r}
rm(list = ls())
library(tidyverse)
library(psych)
library(lm.beta)

# This will take a few moments to load since the file is so large.
brf <- read_csv("brfss.csv", show_col_types = FALSE)
```
## Questions
### Q: We will be analyzing three variables, described below, in Part 1 of this project. Identify the names of the variables using the codebook provided on Brightspace. Using the `brfss.csv` data provided on Brightspace, create a dataframe `brfpart` with only these three columns, in the order they are listed below, which you will use for the following questions. Do not rename the variables. Store the first rows in `Q`.
- a variable that measures how often the respondent eats fruit (not including juices)
- a variable that records the length of time since the last routine medical checkup
- a variable that records the general health of the respondent
Once you have created the new `brfpart`, you might consider removing the original dataframe from your environment to save space with `remove(brf)`. If you do this, however, and you run `Q` again, it will likely error since you removed `brf`.
We encourage you to take note of the values of each of these three variables and familiarize yourself with them before continuing.
Hint: Your `brfpart` dataframe should have the same number of rows as the original `brf`, but now only three columns.
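The column-selection step above can be sketched on toy data. This is only an illustration: the column names (`fruit_var`, `checkup_var`, `health_var`, `other_var`) are hypothetical placeholders, and the real names must come from the codebook.

```r
library(tidyverse)

# Toy data; the column names are hypothetical, not codebook names.
toy <- tibble(
  fruit_var   = c(101, 202, 555),
  other_var   = c("a", "b", "c"),
  checkup_var = c(1, 2, 8),
  health_var  = c(3, 1, 5)
)

# select() keeps only the named columns, in the order they are listed.
toy_part <- toy %>% select(fruit_var, checkup_var, health_var)

# head() returns the first n rows (n = 2 here, chosen for the toy data).
toy_first <- head(toy_part, 2)

dim(toy)       # rows and columns before selecting
dim(toy_part)  # same number of rows, fewer columns
```

Comparing `dim()` before and after is a quick check that only columns, not rows, were dropped.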
```{r}
### Do not edit the following line. It is used by CodeGrade.
# CG Q #
### TYPE YOUR CODE BELOW ###
### VIEW OUTPUT ###
Q
```
## Cleaning
### Q: Clean the dataframe `brfpart` by removing the respondents who "refused" or said "don't know/not sure", and any NAs, from both the health variable and the length of time variable. See the codebook for details on what the values of the variables mean. Overwrite the existing `brfpart`. Sort the resulting dataframe by the general health variable, from excellent health to poor health. Store the first ten rows of the resulting dataframe as `Q`.
In practice, it would be wise to create a new dataframe, but we are trying to save space for CodeGrade and on your local device.
Hint: The resulting brfpart dataframe is x
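A minimal sketch of this kind of cleaning on toy data follows. It assumes, purely for illustration, that code 7 means "don't know/not sure", code 9 means "refused", and that lower health codes mean better health; confirm the actual codes in the codebook. The column names are hypothetical.

```r
library(tidyverse)

# Toy data with hypothetical column names. The codes 7 and 9 are an
# assumption made for this sketch; check the codebook for the real ones.
toy <- tibble(
  checkup_var = c(1, 7, 2, NA, 3, 1),
  health_var  = c(4, 2, 9, 1, NA, 1)
)

drop_codes <- c(7, 9)

toy_clean <- toy %>%
  filter(
    !is.na(health_var), !is.na(checkup_var),  # drop NAs in either column
    !health_var %in% drop_codes,              # drop assumed non-answers
    !checkup_var %in% drop_codes
  ) %>%
  arrange(health_var)  # ascending sort: the smallest code comes first

toy_clean
```

Writing each condition as a separate argument to `filter()` combines them with a logical AND, so a row is kept only if it passes every test.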
```{r}
### Do not edit the following line. It is used by CodeGrade.
# CG Q #
### TYPE YOUR CODE BELOW ###
### VIEW OUTPUT ###
Q
```
### Q: How many people, and what percentage, reported that in general their health is either good or very good? Your answer should be a dataframe with two values: the number and the percentage. Round the percentage to the nearest tenth. Store it as `Q`.
The percentage is out of the total number of observations in the `brfpart` dataset.
Hint: The answer should look like this (note the column names):
Count Percent
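One way to build a one-row count-and-percentage summary is sketched below on toy data. The column name `health_var` and the codes 2 ("very good") and 3 ("good") are assumptions for illustration; verify them against the codebook.

```r
library(tidyverse)

# Toy data; health_var and its codes are hypothetical placeholders.
toy <- tibble(health_var = c(1, 2, 3, 3, 5))

toy_summary <- toy %>%
  summarise(
    Count   = sum(health_var %in% c(2, 3)),  # rows matching either code
    Percent = round(Count / n() * 100, 1)    # share of all rows, one decimal
  )

toy_summary
```

Because `n()` is evaluated on the unfiltered data, the denominator is the total number of rows rather than the number of matching rows.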
```{r}
### Do not edit the following line. It is used by CodeGrade.
# CG Q #
### TYPE YOUR CODE BELOW ###
### VIEW OUTPUT ###
Q
```
### Q: Create a dataframe showing the number and the proportion of individuals who said their health is excellent, very good, or good for each of the different lengths of time since last checkup. Store it as a dataframe named `Q`. Round to three decimal places.
The percentage is out of the total number of observations for the brfpart dataset. If your proportion does not match below, double check your Q cleaning.
Hint: The x dataframe should look like this. The is the name of the length of time variable.
n proportion
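A grouped count with a proportion of the full dataset can be sketched as follows. The column names and the assumption that codes 1 to 3 correspond to excellent, very good, and good are illustrative only; check the codebook before relying on them.

```r
library(tidyverse)

# Toy data; column names and codes are hypothetical placeholders.
toy <- tibble(
  checkup_var = c(1, 1, 2, 2, 2, 3),
  health_var  = c(1, 4, 2, 3, 5, 1)
)

# Capture the denominator before filtering so proportions are out of all rows.
total_rows <- nrow(toy)

toy_by_checkup <- toy %>%
  filter(health_var %in% 1:3) %>%
  group_by(checkup_var) %>%
  summarise(
    n          = n(),                       # matching rows in each group
    proportion = round(n / total_rows, 3)   # share of the whole dataset
  )

toy_by_checkup
```

Storing `total_rows` first matters here: once `filter()` has run, `nrow()` of the piped data would give the filtered count instead of the dataset total.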