Question:
Which of the following distance measures is commonly applied to a frequency-document matrix and why?
a. Jaccard distance, because text is expressed as binary variables in a frequency-document matrix.
b. Manhattan distance, because text is expressed quantitatively in a frequency-document matrix and it avoids the effect of outliers.
c. Euclidean distance, because text is expressed quantitatively in a frequency-document matrix and outliers have been filtered out by upper and lower limits on term frequencies.
d. Cosine distance, because it identifies similarity in term usage patterns instead of the magnitudes in term frequency measures.
Step by Step Answer:
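The reasoning in option d can be checked with a small numeric sketch. Cosine distance is the measure usually applied to term-frequency data, since it compares the direction of two frequency vectors rather than their length. The three toy documents and the function names below are assumptions made up for illustration, not data from the textbook: `doc_long` uses the same terms in the same proportions as `doc_short` but is ten times longer, while `doc_other` uses a different mix of terms.

```python
import math

def cosine_distance(u, v):
    """1 - cos(angle between u and v); depends only on direction, not length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)

def euclidean_distance(u, v):
    """Straight-line distance; grows with differences in raw term counts."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

# Hypothetical term counts for four terms in three documents.
doc_short = [2, 1, 0, 1]
doc_long = [20, 10, 0, 10]   # same usage pattern as doc_short, 10x the length
doc_other = [0, 1, 3, 1]     # different usage pattern, similar length

print("cosine    short vs long :", cosine_distance(doc_short, doc_long))
print("cosine    short vs other:", cosine_distance(doc_short, doc_other))
print("euclidean short vs long :", euclidean_distance(doc_short, doc_long))
print("euclidean short vs other:", euclidean_distance(doc_short, doc_other))
```

Under these assumed counts, cosine distance treats `doc_short` and `doc_long` as essentially identical because their term usage patterns match, whereas Euclidean distance places `doc_short` closer to `doc_other` simply because the two have similar lengths. That difference is the behavior option d describes.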
Business Analytics
ISBN: 9780357902219
5th Edition
Authors: Jeffrey D. Camm, James J. Cochran, Michael J. Fry, Jeffrey W. Ohlmann