
Question


Consider three documents (D1, D2, and D3) from a document collection:

- D1 contains the word "vector" 5 times and "model" 15 times.
- D2 contains the word "vector" 2 times and "model" 18 times.
- D3 contains the word "vector" 3 times and "model" 10 times.
- "vector" occurs in n_1 = 100 documents of the collection.
- "model" occurs in n_2 = 50 documents of the collection.
- The total number of documents in the collection is N = 1000.

(a) Use raw term frequency as the term-weighting scheme and draw the vectors for your three documents. Place "vector" on the x-axis and "model" on the y-axis.

(b) The user's query consists of two words, "vector" and "model". Add the query vector q = (1, 1) to your graph, assuming that both words are equally important. Which document do you think is most similar to the query? Use the cosine similarity measure

$$\mathrm{sim}(d_j, q) = \frac{\vec{d_j} \cdot \vec{q}}{|\vec{d_j}| \times |\vec{q}|}$$

to calculate the similarity between each document and the query (use the same weights as for part a). Show your work for the calculations.

(c) Now use the tf-idf term-weighting scheme (represent tf through the raw frequency only), and calculate the cosine similarity between each document and the query. Note that in this scheme the weight w_{i,j} of term i in document j is given by

$$w_{i,j} = \mathrm{freq}_{i,j} \times \ln\frac{N}{n_i}$$

and the weight w_{i,q} of term i in query q is given by

$$w_{i,q} = \left(0.5 + 0.5\,\mathrm{freq}_{i,q}\right) \times \ln\frac{N}{n_i}$$

Show your work and comment on how your results compare to those in part b. Use only the natural logarithm (base e).

(d) How would your answers to parts b and c change if 200 documents in the document collection contained the word "model"? (Note: I'm not looking for new numbers; I just want your explanation for how they would change.)
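The calculations in parts (b) and (c) can be checked with a short script. The following is a minimal sketch, not an official solution: it assumes the weighting formulas read as w_{i,j} = freq_{i,j} × ln(N/n_i) for documents and w_{i,q} = (0.5 + 0.5 freq_{i,q}) × ln(N/n_i) for the query, and that each query term has frequency 1. All names (`cosine`, `idf`, `raw_sims`, `tfidf_sims`, and so on) are illustrative.

```python
import math

# Collection statistics taken from the question.
N = 1000                                   # total documents in the collection
doc_freq = {"vector": 100, "model": 50}    # n_i: documents containing each term
TERMS = ("vector", "model")                # x-axis, y-axis

# Raw term frequencies per document, and for the query q = (1, 1).
term_freq = {
    "D1": {"vector": 5, "model": 15},
    "D2": {"vector": 2, "model": 18},
    "D3": {"vector": 3, "model": 10},
}
query_freq = {"vector": 1, "model": 1}     # assumption: each query word occurs once


def cosine(a, b):
    """Cosine similarity: dot product divided by the product of the norms."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def idf(term):
    """Inverse document frequency with the natural logarithm: ln(N / n_i)."""
    return math.log(N / doc_freq[term])


def raw_vector(freqs):
    """Part (b): weights are the raw term frequencies."""
    return [freqs[t] for t in TERMS]


def tfidf_doc_vector(freqs):
    """Part (c), documents: w_ij = freq_ij * ln(N / n_i)."""
    return [freqs[t] * idf(t) for t in TERMS]


def tfidf_query_vector(freqs):
    """Part (c), query: w_iq = (0.5 + 0.5 * freq_iq) * ln(N / n_i)."""
    return [(0.5 + 0.5 * freqs[t]) * idf(t) for t in TERMS]


raw_sims = {
    name: cosine(raw_vector(f), raw_vector(query_freq))
    for name, f in term_freq.items()
}
tfidf_sims = {
    name: cosine(tfidf_doc_vector(f), tfidf_query_vector(query_freq))
    for name, f in term_freq.items()
}

for name in term_freq:
    print(f"{name}: raw tf = {raw_sims[name]:.4f}, tf-idf = {tfidf_sims[name]:.4f}")
```

For part (d), changing `doc_freq["model"]` to 200 and rerunning shows how the tf-idf similarities shift, while the raw-frequency similarities do not depend on `doc_freq` at all.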
