Question:
4. The purpose of this exercise is to develop a model to predict forest cover type using a number of cartographic measures. The given dataset (Online File W6.1) includes four wilderness areas found in the Roosevelt National Forest of northern Colorado. A total of 12 cartographic measures were used as independent variables; the seven major forest cover types form the classes of the dependent variable. The following table provides a short description of these independent and dependent variables:
This is an excellent example of a multiclass classification problem. The dataset is rather large (581,012 unique instances) and feature-rich. As you will see, the data is also raw and skewed (the cover types are unevenly represented). As a model builder, you are to make the necessary decisions to preprocess the data and build the best possible predictor.
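One way to approach the skew mentioned above is to weight classes by inverse frequency and to split the data so that each cover type keeps its share in both partitions. The following is a minimal sketch under stated assumptions, not a prescribed method: the helper names class_weights and stratified_split are hypothetical, the weighting formula n_samples / (n_classes * class_count) is one common convention, and the labels are toy values invented for illustration, not the real cover-type counts.

```python
import random
from collections import Counter, defaultdict


def class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {c: n_samples / (n_classes * m) for c, m in counts.items()}


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx), sampling each class separately so
    that both partitions keep roughly the original class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    train, test = [], []
    for idx in by_class.values():
        rng.shuffle(idx)
        # At least one test row per class; a class with a single row
        # therefore ends up only in the test partition.
        n_test = max(1, round(len(idx) * test_fraction))
        test.extend(idx[:n_test])
        train.extend(idx[n_test:])
    return sorted(train), sorted(test)


# Toy labels, made up for illustration (NOT the real class distribution).
toy_labels = [1] * 50 + [2] * 40 + [3] * 10
weights = class_weights(toy_labels)
train_idx, test_idx = stratified_split(toy_labels)
```

In this toy run the rarest class receives the largest weight, and the test partition holds 20 of the 100 rows with every class represented. Whether to reweight, resample, or leave the distribution untouched is one of the decisions the exercise asks you to justify in the report.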
Use your favorite tool to build the models and document the details of your actions and experiences in a written report. Use screenshots within your report to illustrate important and interesting findings. You are expected to discuss and justify any decision that you make along the way.
The reuse of this dataset is unlimited with retention of copyright notice for Jock A. Blackard and Colorado State University.
Step by Step Answer:
Decision Support And Business Intelligence Systems
ISBN: 9780136107293
9th Edition
Authors: Dursun Delen, Efraim Turban, Ramesh Sharda