
Question


Vector-Space Retrieval Model

Consider the following document-term table with 10 documents and 8 terms (A through H) containing raw term frequencies. We also have a specified query, Q, with the indicated raw term weights (the bottom row in the table). Answer the following questions, and in each case give the formulas you used to perform the necessary computations.

Note: You should do this problem using a spreadsheet program such as Microsoft Excel. Alternatively, you can write a program to perform the computations. Please include your worksheets or code in the assignment submission.

         A  B  C  D  E  F  G  H
---------------------------------
DOC1     0  3  4  0  0  2  4  0
DOC2     5  5  0  0  4  0  4  3
DOC3     3  0  4  3  4  0  0  5
DOC4     0  7  0  3  2  0  4  3
DOC5     0  1  0  0  0  5  4  2
DOC6     2  0  2  0  0  4  0  1
DOC7     3  5  3  4  0  0  4  2
DOC8     0  3  0  0  0  4  4  2
DOC9     0  0  3  3  3  0  0  1
DOC10    0  5  0  0  0  4  4  2
---------------------------------
Query    2  1  1  0  2  0  3  0

a. Compute the ranking score for each document based on each of the following query-document similarity measures (sort the documents in decreasing order of the rank score):
   - Dot product
   - Cosine similarity
   - Dice's coefficient
   - Jaccard's coefficient

b. Compare the ranking obtained when binary term weights are used instead to the ranking obtained in part a, where raw term weights were used (do this only with the dot product as the similarity measure). Explain any discrepancy between the two rankings.

c. Construct a table similar to the one above, but instead of raw term frequencies compute the (non-normalized) tf x idf weights for the terms. Then compute the ranking scores using the Cosine similarity. Explain any significant differences between the ranking you obtained here and the Cosine ranking from part a.
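Since the question allows a program in place of a spreadsheet, the computations it asks for could be sketched in Python roughly as follows. This is a minimal sketch, not a reference solution. The question does not state the formulas, so the weighted-vector forms of Dice's and Jaccard's coefficients, the idf definition log2(N/df), and the choice to apply the same idf weights to the query are all assumptions. Every function name is illustrative.

```python
import math

# Raw term frequencies for terms A..H, copied from the table in the question.
DOCS = {
    "DOC1":  [0, 3, 4, 0, 0, 2, 4, 0],
    "DOC2":  [5, 5, 0, 0, 4, 0, 4, 3],
    "DOC3":  [3, 0, 4, 3, 4, 0, 0, 5],
    "DOC4":  [0, 7, 0, 3, 2, 0, 4, 3],
    "DOC5":  [0, 1, 0, 0, 0, 5, 4, 2],
    "DOC6":  [2, 0, 2, 0, 0, 4, 0, 1],
    "DOC7":  [3, 5, 3, 4, 0, 0, 4, 2],
    "DOC8":  [0, 3, 0, 0, 0, 4, 4, 2],
    "DOC9":  [0, 0, 3, 3, 3, 0, 0, 1],
    "DOC10": [0, 5, 0, 0, 0, 4, 4, 2],
}
QUERY = [2, 1, 1, 0, 2, 0, 3, 0]


def dot(q, d):
    """Dot product: sum of q_i * d_i."""
    return sum(a * b for a, b in zip(q, d))


def cosine(q, d):
    """Dot product divided by the product of the vector lengths."""
    denom = math.sqrt(dot(q, q) * dot(d, d))
    return dot(q, d) / denom if denom else 0.0


def dice(q, d):
    """Assumed weighted form: 2 * (q . d) / (sum q_i^2 + sum d_i^2)."""
    denom = dot(q, q) + dot(d, d)
    return 2 * dot(q, d) / denom if denom else 0.0


def jaccard(q, d):
    """Assumed weighted form: (q . d) / (sum q_i^2 + sum d_i^2 - q . d)."""
    denom = dot(q, q) + dot(d, d) - dot(q, d)
    return dot(q, d) / denom if denom else 0.0


def rank(sim, query, docs):
    """Return (name, score) pairs sorted by decreasing score."""
    scores = [(name, sim(query, vec)) for name, vec in docs.items()]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


def binarize(vec):
    """Binary weights for part b: 1 if the term occurs, else 0."""
    return [1 if w > 0 else 0 for w in vec]


def idf_weights(docs):
    """Assumed idf for part c: log2(N / df), with df = documents containing the term."""
    n = len(docs)
    n_terms = len(next(iter(docs.values())))
    dfs = [sum(1 for vec in docs.values() if vec[i] > 0) for i in range(n_terms)]
    return [math.log2(n / df) if df else 0.0 for df in dfs]


def tfidf(vec, idf):
    """Non-normalized tf x idf weights."""
    return [tf * w for tf, w in zip(vec, idf)]


if __name__ == "__main__":
    # Part a: raw term weights under each similarity measure.
    for sim in (dot, cosine, dice, jaccard):
        print(sim.__name__, rank(sim, QUERY, DOCS))

    # Part b: binary weights, dot product only.
    bin_docs = {name: binarize(vec) for name, vec in DOCS.items()}
    print("binary dot", rank(dot, binarize(QUERY), bin_docs))

    # Part c: tf x idf weights, cosine only (query weighted the same way, by assumption).
    idf = idf_weights(DOCS)
    tfidf_docs = {name: tfidf(vec, idf) for name, vec in DOCS.items()}
    print("tfidf cosine", rank(cosine, tfidf(QUERY, idf), tfidf_docs))
```

As a hand check on the raw-weight dot product, DOC2 scores 5*2 + 5*1 + 4*2 + 4*3 = 35, which should place it first in that ranking. If the course defines Dice, Jaccard, or idf differently (for example a natural logarithm, or an unweighted query in part c), only the corresponding function needs to change.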

