Consider the following text version of a post to an online learning forum in a statistics course:

Question:

[Image: text of the forum post; not transcribed]

a. Identify 10 non-word tokens in the passage.

b. Suppose that this passage constitutes a document to be classified, but you are not certain of the business goal of the classification task. Identify material (at least \(20 \%\) of the terms) that, in your judgment, could be discarded fairly safely without knowing that goal.

c. Suppose that the classification task is to predict whether this post requires the attention of the instructor, or whether a teaching assistant might suffice. Identify the \(20 \%\) of the terms that you think might be most helpful in that task.

d. What aspect of the passage is most problematic from the standpoint of simply using a bag-of-words approach, as opposed to an approach in which meaning is extracted?
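
As a rough illustration of the terms used in parts a and d, the sketch below separates word from non-word tokens and builds a bag-of-words count. The sample sentences are hypothetical and are not the forum post from the exercise; the whitespace tokenizer and the letters-only definition of a "word" are simplifying assumptions, and other reasonable definitions exist.

```python
import re
from collections import Counter

# Hypothetical sample text, NOT the forum post from the exercise.
SAMPLE = "I got 3.5 for Q2 :( but the answer key says 4.2 - see http://example.com/key"

def tokenize(text):
    """Lowercase the text and split it on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()

def is_word(token):
    """Assumption: a 'word' is letters only, optionally with one internal apostrophe."""
    return re.fullmatch(r"[a-z]+(?:'[a-z]+)?", token) is not None

def bag_of_words(text):
    """Count word tokens, discarding non-word tokens and all ordering information."""
    return Counter(t for t in tokenize(text) if is_word(t))

tokens = tokenize(SAMPLE)
non_words = [t for t in tokens if not is_word(t)]
print(non_words)

# Two hypothetical sentences with opposite meanings but identical word counts.
SENTENCE_A = "the test is not hard it is easy"
SENTENCE_B = "the test is not easy it is hard"
print(bag_of_words(SENTENCE_A) == bag_of_words(SENTENCE_B))
```

Under these assumptions, numbers, emoticons, stray punctuation, and URLs all count as non-word tokens. The last comparison shows one common limitation of a bag-of-words representation: because word order is discarded, two sentences with opposite meanings can map to the same counts. Whether that is the most problematic aspect of the actual passage depends on its content.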
