Question:

In this problem, you will use the data and scenario described in this chapter’s example, in which the task is to develop a model to classify documents as either auto-related or electronics-related.

a. Using the process shown in Figure 21.6, store the data as an ExampleSet. Then, load the data in a new process and create a label vector.

b. Following the example in this chapter, preprocess the documents. Explain what would be different if you did not perform the “stemming” step.

c. Use the LSA to create 10 concepts. Explain what is different about the concept matrix, as opposed to the TF-IDF matrix.

d. Using this matrix, fit a predictive model (different from the model presented in the chapter illustration) to classify documents as autos or electronics. Compare its performance with that of the model presented in the chapter illustration.
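As a small illustration of the stemming step mentioned in part (b), the sketch below compares vocabulary size with and without stemming. It is not the chapter's process: the three-sentence corpus is made up, and `toy_stem` is a hypothetical, crude suffix stripper standing in for a real stemmer such as Porter's.

```python
import re

# Hypothetical toy corpus; the chapter's actual documents are not reproduced here.
docs = [
    "The car stalled while driving and the engine stalls again",
    "Drivers reported the engines stalling",
    "The circuit boards and a circuit board tester",
]

def tokenize(text):
    """Lowercase the text and keep runs of letters as tokens."""
    return re.findall(r"[a-z]+", text.lower())

def toy_stem(token):
    """Strip one common suffix (illustration only, not the Porter algorithm)."""
    for suffix in ("ing", "ers", "er", "ed", "s"):
        # Keep at least three characters so short words are left alone.
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token

# Each distinct term becomes one column of the term-document (or TF-IDF) matrix.
raw_vocab = sorted({t for d in docs for t in tokenize(d)})
stemmed_vocab = sorted({toy_stem(t) for d in docs for t in tokenize(d)})

print(len(raw_vocab), "terms without stemming")
print(len(stemmed_vocab), "terms with stemming")
```

Without stemming, variants such as "stalled", "stalls", and "stalling" each get their own column, so the matrix is wider and related documents share fewer terms.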

Related Book:

Machine Learning For Business Analytics

ISBN: 9781119828792

1st Edition

Authors: Galit Shmueli, Peter C. Bruce, Amit V. Deokar, Nitin R. Patel