Question
Assignment 2 : R Programming
Upload R code into Week 2 Dropbox.
Download the insurance.csv file from Doc Sharing. Read it into a data frame named insuranceData using data.table() with the following options (check the Week 1 lecture notes and Chapter 2 of the textbook):
header=T, stringsAsFactors=F
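The prompt names data.table(), but header and stringsAsFactors are arguments of read.table() and read.csv(), so the minimal sketch below assumes one of those was intended and uses read.csv(). To stay runnable without the course file, it writes three invented rows to a temporary file (toy_path is a hypothetical name); in the actual assignment the path would be "insurance.csv". The column names are an assumption based on the fields the assignment describes.

```r
# Toy stand-in for insurance.csv: these rows are invented for illustration only.
# Column names are assumed from the fields the assignment describes.
toy_path <- tempfile(fileext = ".csv")
writeLines(c(
  "age,sex,bmi,children,smoker,region,charges",
  "19,female,27.9,0,yes,southwest,16884.92",
  "45,male,30.2,3,no,northwest,8500.00",
  "52,male,33.1,3,yes,northwest,41000.50"
), toy_path)

# header = TRUE treats the first line as column names;
# stringsAsFactors = FALSE keeps text columns as character vectors.
insuranceData <- read.csv(toy_path, header = TRUE, stringsAsFactors = FALSE)

# Show the structure: one line per column with its type.
str(insuranceData)
```

Writing TRUE and FALSE in full is generally safer than T and F, because T and F are ordinary variables that can be reassigned.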
Data has 7 columns and 1338 rows. The data contains information on health insurance charges based on the age, sex, bmi, number of children, smoking status, and the region of the country where the family lives.
A) Print the names of the columns.
B) Print the number of rows and columns.
C) Count the number of males and females in the data.
Hint: Lecture notes have samples on counting items in vectors, e.g., the table() function.
D) Find the mean, median, standard deviation, and variance of age and bmi. The R functions to be used are mean(), median(), sd(), var().
E) Find the maximum and minimum values of age, bmi, and children.
F) Use the summary() function to print information about the distribution of the insurance data. What are the min and max values printed by the summary() function for age, bmi, children, and charges?
G) Use the summary() function to print distribution information for the age column. Check textbook page 34 for a sample.
H) Use the unique() function to print the names of the distinct regions.
I) Extract the subset of insurance data that has three children.
Hint: Use the subset() function. Check lecture notes and textbook for samples.
J) Extract the subset of insurance data with charges more than 30000.
K) Extract the subset of insurance data for females living in the southwest region.
L) Extract the subset of insurance data for males living in the northwest region with more than 2 children.
M) Use the class() function to print the type of R object for each column of the insurance data frame.
Hint: Textbook chapter 2.
N) Use the class() function to print the type of the smoker column. Convert the smoker column to a factor. How many levels are created when you convert the smoker column to a factor? Why would you want to convert the smoker column from character to factor?
Hint: You can get information about the levels by just printing the smoker column after conversion. Check lecture notes.
O) Use the summary() function to print the summary statistics for the smoker column. What is the result of using summary() on a factor? To see how summary() behaves on different data types, print the result of summary() for region, age, and smoker. What are the differences? This example shows that the summary() function reports different statistics for numeric and categorical data (i.e., factors).
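Tasks A through O can be sketched roughly as follows. This is one possible approach, not an official solution. It runs on a five-row toy data frame whose values are invented, and whose column names are assumed from the fields the assignment describes, so the printed numbers will not match the real insurance.csv. The subset variable names (threeChildren, highCharges, and so on) are hypothetical.

```r
# Toy data frame standing in for insuranceData; every value is invented.
insuranceData <- data.frame(
  age      = c(19, 45, 52, 33, 28),
  sex      = c("female", "male", "male", "female", "male"),
  bmi      = c(27.9, 30.2, 33.1, 22.7, 25.0),
  children = c(0, 3, 3, 1, 0),
  smoker   = c("yes", "no", "yes", "no", "no"),
  region   = c("southwest", "northwest", "northwest", "southwest", "southeast"),
  charges  = c(16884.92, 8500.00, 41000.50, 4200.75, 3100.10),
  stringsAsFactors = FALSE
)

# A, B: column names, then number of rows and columns
print(names(insuranceData))
print(dim(insuranceData))

# C: count of each value in the sex column
print(table(insuranceData$sex))

# D: mean, median, standard deviation, and variance of age and bmi
for (col in c("age", "bmi")) {
  x <- insuranceData[[col]]
  cat(col, ": mean =", mean(x), "median =", median(x),
      "sd =", sd(x), "var =", var(x), "\n")
}

# E: range() returns c(min, max); one column of the result per variable
print(sapply(insuranceData[c("age", "bmi", "children")], range))

# F, G: summary of the whole data frame, then of the age column only
print(summary(insuranceData))
print(summary(insuranceData$age))

# H: distinct region names
print(unique(insuranceData$region))

# I to L: row subsets; conditions are combined with &
threeChildren     <- subset(insuranceData, children == 3)
highCharges       <- subset(insuranceData, charges > 30000)
femalesSouthwest  <- subset(insuranceData, sex == "female" & region == "southwest")
malesNorthwestBig <- subset(insuranceData,
                            sex == "male" & region == "northwest" & children > 2)

# M: class of every column
print(sapply(insuranceData, class))

# N: class of smoker before conversion, then convert and inspect the levels
print(class(insuranceData$smoker))
insuranceData$smoker <- factor(insuranceData$smoker)
print(levels(insuranceData$smoker))

# O: summary() output depends on the type of its argument
print(summary(insuranceData$smoker))  # factor: count per level
print(summary(insuranceData$region))  # character: length, class, and mode
print(summary(insuranceData$age))     # numeric: min, quartiles, mean, max
```

One common reason to convert a column such as smoker to a factor is that a factor stores a fixed set of categories, so summary() and modeling functions treat it as categorical data rather than free text.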