Question
Final Project: Modeling of Categorical Data Using R (100 points)

The objective of this project is to model categorical data using R. Two data sets are described below for investigating appropriate models (logistic and log-linear models). Both data sets have binary responses, but one has nine predictors whereas the other has four.

Breast Cancer Data: The data set in the text file “CancerData.txt” comes from a study of breast cancer in Wisconsin. There are 681 cases of potentially cancerous tumors, of which 238 are actually malignant. Whether a tumor is really malignant is traditionally determined by an invasive surgical procedure. The purpose of this study was to determine whether a new procedure called fine needle aspiration, which draws only a small sample of tissue, could be effective in determining tumor status. Your report for this part must include the following basic analyses:
A. Fit a logistic regression model with Class as the response variable and the other nine variables as explanatory variables. Include some descriptive statistics, such as graphs. Discuss goodness of fit using several different tests, and address other inference questions about this model, for example using tests and confidence intervals to determine which predictors are more significant than others.
B. Check the adequacy of the model fit using residual analysis and comment on whether this model fits the data well.
C. Use a procedure such as backward elimination or a criterion such as AIC to determine the best subset of variables.
D. Obtain a scatter plot of the data with both the fitted logistic response function from part C and a lowess smooth superimposed. Does the fitted logistic response function appear to fit well?
E. Check the adequacy of the model fit in part C again using residual analysis and comment on whether this model fits the data well.
F. Use your best model obtained in C to predict the outcome.
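A minimal R sketch of how parts A through F could be approached is given here. It is not the expected solution. The column names in “CancerData.txt” are not given in the question, so the sketch runs on simulated data with three hypothetical predictors (`x1`, `x2`, `x3`) instead of the real nine, and it assumes `Class` is coded 1 for malignant and 0 for benign. It uses only base R functions (`glm`, `confint.default`, `anova`, `drop1`, `residuals`, `step`, `lowess`, `predict`). Because the responses are ungrouped binary values, the residual deviance by itself is not a reliable goodness-of-fit statistic, so the sketch relies on likelihood-ratio comparisons and residual plots.

```r
# --- Synthetic stand-in for CancerData.txt (ASSUMPTION: not the real data) ---
# x1, x2, x3 are hypothetical predictors; Class = 1 is assumed to mean malignant.
set.seed(1)
n  <- 400
x1 <- sample(1:10, n, replace = TRUE)
x2 <- sample(1:10, n, replace = TRUE)
x3 <- sample(1:10, n, replace = TRUE)            # unrelated to Class by construction
Class <- rbinom(n, 1, plogis(-4 + 0.5 * x1 + 0.3 * x2))
dat <- data.frame(Class, x1, x2, x3)
# With the real file, something like the following might work (header is an assumption):
# dat <- read.table("CancerData.txt", header = TRUE)

# --- A. Full logistic model, tests, and confidence intervals ---
full <- glm(Class ~ ., family = binomial, data = dat)
print(summary(full))                             # Wald z-tests per coefficient
ci <- confint.default(full)                      # Wald 95% CIs, log-odds scale
print(exp(ci))                                   # same intervals as odds ratios
null <- glm(Class ~ 1, family = binomial, data = dat)
lrt  <- anova(null, full, test = "Chisq")        # overall likelihood-ratio test
print(lrt)
print(drop1(full, test = "Chisq"))               # LR test for dropping each predictor

# --- B. Residual analysis for the full model ---
r_dev <- residuals(full, type = "deviance")
plot(fitted(full), r_dev,
     xlab = "Fitted probability", ylab = "Deviance residual")
lines(lowess(fitted(full), r_dev), col = "red")  # should stay near 0 if the fit is adequate

# --- C. Backward elimination by AIC ---
best <- step(full, direction = "backward", trace = 0)
print(formula(best))

# --- D. Observed response vs. linear predictor, fitted curve and lowess ---
eta <- predict(best, type = "link")              # linear predictor of the reduced model
ord <- order(eta)
plot(eta, dat$Class, xlab = "Linear predictor", ylab = "Class")
lines(eta[ord], fitted(best)[ord], col = "blue")         # fitted logistic response
lines(lowess(eta, dat$Class), col = "red", lty = 2)      # lowess smooth of the 0/1 data

# --- E. Residual analysis repeated for the reduced model ---
r_dev_best <- residuals(best, type = "deviance")
plot(fitted(best), r_dev_best,
     xlab = "Fitted probability", ylab = "Deviance residual")
lines(lowess(fitted(best), r_dev_best), col = "red")

# --- F. Prediction with the reduced model ---
new_case <- data.frame(x1 = 8, x2 = 7, x3 = 2)   # hypothetical new tumor
p_new <- predict(best, newdata = new_case, type = "response")
pred_class <- as.integer(fitted(best) > 0.5)     # 0.5 cutoff is an arbitrary choice
conf_mat <- table(observed = dat$Class, predicted = pred_class)
print(p_new)
print(conf_mat)
```

In part D the response is plotted against the linear predictor because the model has several predictors; if the lowess curve tracks the fitted logistic curve closely, that suggests the logistic form is reasonable. With the real data, the same calls should apply after replacing `dat` and the hypothetical predictor names.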