Question
INFERENCES FOR CATEGORICAL DATA
In this lab assignment, you will use descriptive, graphical, and inferential tools in R (or R Commander) to analyze data on the passengers of the British ocean liner Titanic, which sank in 1912 after colliding with an iceberg. You will display and summarize the related categorical variables and explore the relationships between them with contingency tables. The significance of the bivariate relationships will also be assessed. Tests and confidence intervals for proportions will be used to compare the survival rates in selected passenger groups.
The Titanic Disaster
On April 15, 1912, during her maiden voyage, the British ocean liner Titanic, the largest ship afloat at the time, sank in the North Atlantic Ocean after colliding with an iceberg, sadly killing the vast majority of the 2,224 passengers and crew. One of the reasons that the shipwreck led to such loss of life was that there were not enough lifeboats for the passengers and crew.
In this lab assignment, you will use a dataset that describes gender, age, passenger class, and survival status of 1,187 of the 1,309 passengers on the Titanic. You may see that some groups of passengers were more likely to have survived than others. The data does not contain information for 885 crew members, but it does contain actual and estimated ages for about 80% of the passengers. Any passenger under 12 years of age was classified as a "child".
This dataset is based on the Titanic Passenger List, edited by Michael A. Findlay and originally published in Eaton & Haas (1994), Titanic: Triumph and Tragedy, Patrick Stephens Ltd, and expanded with the help of the internet community.
This dataset is available in the Data link located in the Lab 3 tab display in the Labs section on eClass. Please import the data into R. (Hint: Students should use "Tabs" as "Field Separator" to import the data set into R Commander.) The data are not to be printed in your submission. The following is a description of the variables in the data file:
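For those working in plain R rather than R Commander, the tab-separated import can be sketched as below. This is only an illustration under stated assumptions: the rows are made-up placeholders (not real passenger records), the temporary file stands in for the lab file, and the 0/1 coding of SURVIVED is assumed, not taken from the assignment.

```r
# Toy stand-in for the lab file: placeholder rows, NOT real Titanic records.
# The 0/1 coding of SURVIVED is an assumption made for this sketch.
toy_lines <- c(
  "NAME\tPCLASS\tSURVIVED\tGENDER\tAGE",
  "Passenger A\t1\t1\tfemale\t29",
  "Passenger B\t3\t0\tmale\tNA",
  "Passenger C\t2\t1\tfemale\t0.75"
)
toy_path <- tempfile(fileext = ".txt")  # hypothetical file location
writeLines(toy_lines, toy_path)

# read.delim() uses a tab as field separator and treats the first row as
# a header by default; the string "NA" is read as a missing value.
titanic <- read.delim(toy_path, stringsAsFactors = FALSE)

str(titanic)  # shows the number of rows, the columns, and their types
```

With the real data, `toy_path` would be replaced by the path of the file downloaded from eClass.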
Variable Name: Description of Variable
NAME: full name of the passenger;
PCLASS: passenger class (1 = 1st, 2 = 2nd, 3 = 3rd); used as a proxy for socio-economic status (SES): 1st = upper class, 2nd = middle class, 3rd = lower class;
SURVIVED: survival status of the passenger;
GENDER: gender (female or male);
AGE: age (in years); fractional if age is less than 1 year, NA if not available.
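Because PCLASS is stored as a numeric code and the child classification depends on AGE, it may help to recode these variables before tabulating. The following is a minimal sketch on made-up values; the column name `AGE_GROUP` and the label "adult" are hypothetical choices for this illustration, not part of the data file.

```r
# Made-up values for illustration only, not real passenger records.
toy <- data.frame(
  PCLASS = c(1, 2, 3, 3),
  GENDER = c("female", "male", "male", "female"),
  AGE    = c(35, 0.5, NA, 11)
)

# Treat the numeric class code as a labelled categorical variable.
toy$PCLASS <- factor(toy$PCLASS, levels = c(1, 2, 3),
                     labels = c("1st", "2nd", "3rd"))
toy$GENDER <- factor(toy$GENDER)

# Passengers under 12 are classified as "child"; a missing AGE stays NA.
# AGE_GROUP and the "adult" label are hypothetical names.
toy$AGE_GROUP <- factor(ifelse(toy$AGE < 12, "child", "adult"),
                        levels = c("child", "adult"))

table(toy$AGE_GROUP, useNA = "ifany")  # counts per group, including NA
```

Keeping the missing ages as NA, instead of assigning them to a group, avoids silently classifying passengers whose age is unknown.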
Use the data to answer Questions 1 - 5.
1. First discuss the data in the file.
(a) How many cases are there? What is the identifier variable? What is/are the categorical and numerical variable(s) in the data, if any?
(b) Is this an observational study or an experiment? Can the results of the study be extended to the population of interest, which is all ships colliding with an iceberg? Are causal inferences possible?
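As a starting point for part (a), the number of cases and the storage type of each variable can be inspected along the following lines. This is a hedged sketch on a made-up data frame with the same column names; the values and the 0/1 coding of SURVIVED are assumptions. Note that how R stores a variable does not settle whether it is categorical: a numeric code such as PCLASS still has to be judged from its description.

```r
# Made-up rows for illustration only; SURVIVED coded 0/1 is an assumption.
toy <- data.frame(
  NAME     = c("Passenger A", "Passenger B", "Passenger C"),
  PCLASS   = c(1, 3, 2),
  SURVIVED = c(1, 0, 1),
  GENDER   = c("female", "male", "female"),
  AGE      = c(29, NA, 0.75),
  stringsAsFactors = FALSE
)

n_cases <- nrow(toy)                   # number of cases (rows)
var_classes <- sapply(toy, class)      # how R stores each variable
n_missing_age <- sum(is.na(toy$AGE))   # ages recorded as NA
n_dup_names <- anyDuplicated(toy$NAME) # 0 means every name is unique

print(n_cases)
print(var_classes)
print(n_missing_age)
print(n_dup_names)
```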