Answered step by step
Verified Expert Solution
Question
1 Approved Answer
Using RapidMiner, build a logistic regression model, using 70% of the entire data for training and the other 30% for testing (using shuffled sampling in
Using RapidMiner, build a logistic regression model, using 70% of the entire data for training and the other 30% for testing (using shuffled sampling in RM), to classify patients into those who are likely to have a stroke and those who are not. Use all the available variables as predictors except id and the default thresholds by RM to generate the classifications. [40 pts.] (a) Which independent variable(s) are significant at p=0.05? (b) How to interpret the coefficients of the following two variables in this logistic regression: (i) age, and (ii) hypertension? Explain in terms of the effect of a unit change of the independent variable on the odds of the dependent variable and show steps of your derivation to get full credits. For example, how does a unit increase in age affect the odds of stroke? Explain not just whether it affects the odds positively or negatively, but also by how much. Similarly, how does hypertension affect the odds of stroke? (c) Evaluate the predictive accuracy of the model using appropriate metrics (e.g., specificity, sensitivity, precision, false positive rate, false negative rate, AUC, etc.) that you learned in this week. (Do not just provide the numbers; offer your own
Step by Step Solution
There are 3 Steps involved in it
Step: 1
Get Instant Access to Expert-Tailored Solutions
See step-by-step solutions with expert insights and AI powered tools for academic success
Step: 2
Step: 3
Ace Your Homework with AI
Get the answers you need in no time with our AI-driven, step-by-step assistance
Get Started