Question

Assignment 2: R Programming

Upload R code into Week 2 Dropbox.

Download the insurance.csv file from Doc Sharing. Read it into a data frame named insuranceData using data.table() with the following options (check the Week 1 lecture notes and Chapter 2 of the textbook):

header=T, stringsAsFactors=F
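
The two options named above, header and stringsAsFactors, are arguments of base R's read.table() and read.csv(), so the sketch below assumes one of those is the intended reader and uses read.csv(). Since insurance.csv is not available here, it first writes three made-up rows to a temporary file; the column names (age, sex, bmi, children, smoker, region, charges) are an assumption based on the assignment's description of the data.

```r
# Made-up rows standing in for insurance.csv (the real file has 1338 rows).
# Column names are assumed, not taken from the actual file.
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "age,sex,bmi,children,smoker,region,charges",
  "19,female,27.9,0,yes,southwest,16884.92",
  "18,male,33.8,1,no,southeast,1725.55",
  "28,male,33.0,3,no,southeast,4449.46"
), csv_path)

# header = TRUE treats the first line as column names;
# stringsAsFactors = FALSE keeps text columns as character vectors.
insuranceData <- read.csv(csv_path, header = TRUE, stringsAsFactors = FALSE)

# Show the structure: one line per column with its type.
str(insuranceData)
```

With the real file, csv_path would simply be the path to the downloaded insurance.csv.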

The data has 7 columns and 1338 rows. It contains information on health insurance charges based on the age, sex, bmi, number of children, smoking, and the region of the country where the family lives. [w2h1]

A) Print the names of the columns.
B) Print the number of rows and columns.
C) Count the number of males and females in the data.

Hint: Lecture notes have samples on counting items in vectors, e.g., the table() function.
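
Parts A to C can be sketched as follows. This is a minimal example on a small made-up data frame, not the real insurance.csv; the column names are assumed from the assignment's description, so the printed values will differ from those of the actual file.

```r
# Toy stand-in for the insurance data (values are invented for illustration).
insuranceData <- data.frame(
  age      = c(19, 18, 28, 33, 32, 31, 46, 37),
  sex      = c("female", "male", "male", "male",
               "male", "female", "female", "female"),
  bmi      = c(27.9, 33.8, 33.0, 22.7, 28.9, 25.7, 33.4, 27.7),
  children = c(0, 1, 3, 0, 0, 0, 1, 3),
  smoker   = c("yes", "no", "no", "no", "no", "no", "no", "no"),
  region   = c("southwest", "southeast", "southeast", "northwest",
               "northwest", "southeast", "southeast", "northwest"),
  charges  = c(16884.92, 1725.55, 4449.46, 21984.47,
               3866.86, 3756.62, 8240.59, 7281.51),
  stringsAsFactors = FALSE
)

# A) Column names.
names(insuranceData)

# B) Number of rows, then number of columns.
dim(insuranceData)

# C) table() counts how often each distinct value occurs.
sexCounts <- table(insuranceData$sex)
sexCounts
```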

D) Find the mean, median, standard deviation, and variance of age and bmi. The R functions to be used are mean(), median(), sd(), var().
E) Find the maximum and minimum values of age, bmi, and children.
F) Use the summary() function to print information about the distribution of the insurance data. What are the min and max values printed by the summary() function for age, bmi, children, and charges?
G) Use the summary() function to print distribution information for the age column. Check textbook page 34 for a sample.
H) Use the unique() function to print the names of the distinct regions.
I) Extract the subset of the insurance data that has three children. Hint: Use the subset() function. Check lecture notes and textbook for samples.
J) Extract the subset of the insurance data with charges more than 30000.
K) Extract the subset of the insurance data for females living in the southwest region.
L) Extract the subset of the insurance data for males living in the northwest region with more than 2 children.
M) Use the class() function to print the type of R object for each column of the insurance data frame. Hint: Textbook chapter 2.
N) Use the class() function to print the type of the smoker column. Convert the smoker column to a factor type. How many levels are created when you convert the smoker column to a factor? Why would you want to convert the smoker column from character to factor? Hint: You can get information about the levels by just printing the smoker column after conversion. Check lecture notes.
O) Use the summary() function to print the summary statistics for the smoker column. What is the result of using the summary() function on a factor? To see the differences of using summary() on different data types, print the result of summary() for region, age, and smoker. What are the differences? This example shows that the summary() function reports different statistics for numeric and categorical data (i.e., factors).
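
Parts D to O can be sketched in the same spirit. The example below runs on a small made-up data frame rather than the real insurance.csv, with column names assumed from the assignment's description, so the numbers it prints are illustrative only. Variable names such as threeChildren and femalesSW are hypothetical.

```r
# Toy stand-in for the insurance data (values are invented for illustration).
insuranceData <- data.frame(
  age      = c(19, 18, 28, 33, 45, 31, 52, 60),
  sex      = c("female", "male", "male", "male",
               "male", "female", "female", "female"),
  bmi      = c(27.9, 33.8, 33.0, 22.7, 30.2, 25.7, 31.1, 36.0),
  children = c(0, 1, 3, 0, 3, 0, 2, 0),
  smoker   = c("yes", "no", "no", "no", "yes", "no", "no", "yes"),
  region   = c("southwest", "southeast", "southeast", "northwest",
               "northwest", "southeast", "southwest", "northeast"),
  charges  = c(16884.92, 1725.55, 4449.46, 21984.47,
               39611.76, 3756.62, 11000.00, 48000.00),
  stringsAsFactors = FALSE
)

# D) Mean, median, standard deviation, and variance of age and bmi.
sapply(insuranceData[c("age", "bmi")],
       function(x) c(mean = mean(x), median = median(x),
                     sd = sd(x), var = var(x)))

# E) Maximum and minimum of age, bmi, and children.
sapply(insuranceData[c("age", "bmi", "children")],
       function(x) c(max = max(x), min = min(x)))

# F) Summary of every column; G) summary of age alone.
summary(insuranceData)
summary(insuranceData$age)

# H) Distinct region names.
unique(insuranceData$region)

# I to L) subset() keeps the rows for which the condition is TRUE.
threeChildren <- subset(insuranceData, children == 3)
highCharges   <- subset(insuranceData, charges > 30000)
femalesSW     <- subset(insuranceData, sex == "female" & region == "southwest")
malesNW       <- subset(insuranceData,
                        sex == "male" & region == "northwest" & children > 2)

# M) Class of each column.
sapply(insuranceData, class)

# N) smoker starts as character; factor() turns it into a categorical type.
class(insuranceData$smoker)
insuranceData$smoker <- factor(insuranceData$smoker)
levels(insuranceData$smoker)

# O) summary() depends on the type of its argument.
summary(insuranceData$smoker)  # factor: count per level
summary(insuranceData$region)  # character: length, class, and mode only
summary(insuranceData$age)     # numeric: min, quartiles, median, mean, max
```

A factor stores a fixed set of categories, which is why summary() can report a count per level for smoker but only the length and type for the character column region.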


