Question:
Data analysts are often confronted with a small set of measured independent variables and asked to choose the “best” predictive equation. Not infrequently, such an analysis consists of taking the measured variables, their pairwise cross-products, and their squares; throwing the whole lot into a computer; and using a variable selection method (usually stepwise regression) to select the best model. Using the data for the urinary calcium study in Problems 2.6, 3.8, 5.5, and 6.4 (data in Table D-5, Appendix D), do such an analysis using several variable selection methods. Do the methods agree? Is there a problem with multicollinearity, and, if so, to what extent can it explain problems with variable selection? Does centering help? Is this “throw a whole bunch of things in the computer and stir” approach useful?
Step by Step Answer:
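What follows is not the textbook's worked solution. It is only a minimal sketch of how the analysis described in the question could be set up in Python with NumPy. Table D-5 is not reproduced here, so the sketch runs on synthetic placeholder data; the variable names `x1`, `x2`, `x3`, the helper functions `expand_terms`, `vif`, and `forward_select`, and the fixed three-term stopping rule are all assumptions made for illustration. The selection routine is a simplified greedy forward selection on residual sum of squares, not a full stepwise procedure with F-to-enter and F-to-remove tests.

```python
import itertools
import numpy as np


def expand_terms(X, names, center=False):
    """Build linear terms, squares, and pairwise cross-products.

    If center=True, each variable has its mean subtracted before the
    squares and cross-products are formed.
    """
    X = np.asarray(X, dtype=float)
    if center:
        X = X - X.mean(axis=0)
    cols = [X[:, j] for j in range(X.shape[1])]
    labels = list(names)
    for j, name in enumerate(names):
        cols.append(X[:, j] ** 2)
        labels.append(f"{name}^2")
    for i, j in itertools.combinations(range(X.shape[1]), 2):
        cols.append(X[:, i] * X[:, j])
        labels.append(f"{names[i]}*{names[j]}")
    return np.column_stack(cols), labels


def vif(Z):
    """Variance inflation factors: diagonal of the inverse correlation matrix."""
    R = np.corrcoef(Z, rowvar=False)
    return np.diag(np.linalg.inv(R))


def forward_select(Z, y, labels, max_terms=3):
    """Greedy forward selection: at each step add the candidate term that
    gives the smallest residual sum of squares (simplified stopping rule)."""
    n = len(y)
    chosen = []
    for _ in range(max_terms):
        best = None
        for j in range(Z.shape[1]):
            if j in chosen:
                continue
            A = np.column_stack([np.ones(n), Z[:, chosen + [j]]])
            coef = np.linalg.lstsq(A, y, rcond=None)[0]
            resid = y - A @ coef
            rss = float(resid @ resid)
            if best is None or rss < best[0]:
                best = (rss, j)
        chosen.append(best[1])
    return [labels[j] for j in chosen]


# Synthetic placeholder data, NOT the urinary calcium data from Table D-5.
rng = np.random.default_rng(0)
names = ["x1", "x2", "x3"]
X = rng.uniform(10.0, 20.0, size=(40, 3))
y = 2.0 + 0.5 * X[:, 0] + 0.1 * X[:, 1] ** 2 + rng.normal(0.0, 1.0, size=40)

Z_raw, labels = expand_terms(X, names)
Z_cen, _ = expand_terms(X, names, center=True)

print("largest VIF, raw terms:     ", vif(Z_raw).max())
print("largest VIF, centered terms:", vif(Z_cen).max())
print("forward selection, raw:     ", forward_select(Z_raw, y, labels))
print("forward selection, centered:", forward_select(Z_cen, y, labels))
```

With the real data loaded in place of `X` and `y`, comparing the two VIF lines indicates how much of the multicollinearity is structural, that is, created by forming squares and cross-products of uncentered variables. Comparing the two selection lines shows whether the chosen terms change when that structural collinearity is removed. To address "Do the methods agree?", the same candidate set would also need to go through other selection procedures, such as backward elimination or all-possible-subsets regression.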
Primer Of Applied Regression And Analysis Of Variance
ISBN: 9780071824118
3rd Edition
Authors: Stanton Glantz, Bryan Slinker, Torsten Neilands