Question
Consider three documents (D1, D2, and D3) from a document collection:
- D1 contains the word "vector" 5 times and "model" 15 times.
- D2 contains the word "vector" 2 times and "model" 18 times.
- D3 contains the word "vector" 3 times and "model" 10 times.
- "vector" occurs in $n_1 = 100$ documents of the collection.
- "model" occurs in $n_2 = 50$ documents of the collection.
- The total number of documents in the collection is $N = 1000$.

(a) Use raw term frequency as the term-weighting scheme and draw the vectors for your three documents. Place "vector" on the x-axis and "model" on the y-axis.

(b) The user's query consists of two words, "vector" and "model". Add the query vector $q = (1, 1)$ to your graph, assuming that both words are equally important. Which document do you think is most similar to the query? Use the cosine similarity measure

$$\mathrm{sim}(d_j, q) = \frac{\vec{d_j} \cdot \vec{q}}{|\vec{d_j}| \times |\vec{q}|}$$

to calculate the similarity between each document and the query (use the same weights as in part a). Show your work for the calculations.

(c) Now use the tf-idf term-weighting scheme (represent tf by raw frequency only), and calculate the cosine similarity between each document and the query. Note that in this scheme the weight $w_{i,j}$ of term $i$ in document $j$ is given by

$$w_{i,j} = \mathit{freq}_{i,j} \times \ln\frac{N}{n_i}$$

and the weight $w_{i,q}$ of term $i$ in query $q$ is given by

$$w_{i,q} = (0.5 + 0.5\,\mathit{freq}_{i,q}) \times \ln\frac{N}{n_i}$$

Show your work and comment on how your results compare to those in part b. Use only the natural logarithm (base $e$).

(d) How would your answers to parts b and c change if 200 documents in the document collection contained the word "model"? (Note: I'm not looking for new numbers; I just want your explanation of how they would change.)
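The arithmetic for parts b and c can be checked with a short script. The following is a minimal sketch, not an official solution: it assumes the query weight is `(0.5 + 0.5 * freq) * ln(N / n_i)` exactly as stated in the question, and all names (`DOCS`, `cosine`, `idf`, `raw_similarities`, `tfidf_similarities`) are hypothetical helpers introduced only for illustration.

```python
import math

# Term order in every vector: ("vector", "model").
DOCS = {"D1": (5, 15), "D2": (2, 18), "D3": (3, 10)}  # raw term frequencies
QUERY_FREQ = (1, 1)  # each query term occurs once
N = 1000             # total number of documents in the collection


def cosine(a, b):
    """Cosine similarity: dot(a, b) / (|a| * |b|)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def idf(doc_counts, n_docs=N):
    """Natural-log inverse document frequency ln(N / n_i) per term."""
    return tuple(math.log(n_docs / n) for n in doc_counts)


def raw_similarities():
    """Part b: raw term frequency weights, query q = (1, 1)."""
    return {name: cosine(tf, QUERY_FREQ) for name, tf in DOCS.items()}


def tfidf_similarities(doc_counts=(100, 50)):
    """Part c: tf-idf weights; doc_counts = (n_1, n_2) from the question."""
    weights = idf(doc_counts)
    # Query weight as given in the question (assumed form, no max normalization).
    q = tuple((0.5 + 0.5 * f) * w for f, w in zip(QUERY_FREQ, weights))
    sims = {}
    for name, tf in DOCS.items():
        d = tuple(f * w for f, w in zip(tf, weights))
        sims[name] = cosine(d, q)
    return sims


raw_sims = raw_similarities()
tfidf_sims = tfidf_similarities()

for name in DOCS:
    print(f"{name}: raw tf = {raw_sims[name]:.4f}, tf-idf = {tfidf_sims[name]:.4f}")
```

Under these assumptions both schemes rank the documents D1, D3, D2, with D1 closest to the query. For part d, note that `raw_similarities()` never uses the document counts, so part b is unaffected; calling `tfidf_similarities(doc_counts=(100, 200))` lowers the idf of "model" below that of "vector", which is one way to explore how the part c results shift.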