Question

Given the following documents and queries:

D1: You say goodbye, I say hello
D2: You say stop, I say go
D3: Hello, hello, you say goodbye
D4: I say high, you say low

Q1: say hello
Q2: you goodbye

Specify the vocabulary of tokens/terms using full-text indexing and no stemming (ignore capitalization and punctuation), and define an alphabetical token/term order.

Construct the following document-term matrices (rows correspond to the documents and columns to the terms):

1. Binary. Only consider whether a term t appears in a document D. Repeated terms in one document are counted as 1.

2. Raw term frequency. tf(t in D) is the number of times term t appears in document D.

3. Normalized term frequency. The term frequency of a term t in a document D can be normalized by the total number of terms ND in the document: normalized tf(t in D) = tf(t in D)/ND. For an example, see http://en.wikipedia.org/wiki/TFIDF

4. tf-idf weights. The inverse document frequency of term t is defined as idf(t) = ln(N/(nj+1)) + 1, where N is the total number of documents in the index and nj is the document frequency of term t (the number of documents in which t appears). Thus, for term t in document D: tf-idf(t) = tf(t in D) * idf(t) = tf(t in D) * [ln(N/(nj+1)) + 1]
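
The four matrices can be computed mechanically from the definitions in the question. The following is a minimal Python sketch, not an official solution. It assumes that a token is any run of letters after lowercasing, which is one way to "ignore capitalization and punctuation". The helper names (`tokenize`, `show`) and variable names are illustrative only.

```python
import math
import re

docs = {
    "D1": "You say goodbye, I say hello",
    "D2": "You say stop, I say go",
    "D3": "Hello, hello, you say goodbye",
    "D4": "I say high, you say low",
}


def tokenize(text):
    """Lowercase the text and keep runs of letters (assumed tokenization rule)."""
    return re.findall(r"[a-z]+", text.lower())


tokens = {name: tokenize(text) for name, text in docs.items()}

# Vocabulary in alphabetical order; no stemming is applied.
vocab = sorted({t for toks in tokens.values() for t in toks})

# Raw term frequency: number of occurrences of each term in each document.
raw_tf = {name: [toks.count(t) for t in vocab] for name, toks in tokens.items()}

# Binary: 1 if the term occurs at least once, else 0.
binary = {name: [1 if c > 0 else 0 for c in row] for name, row in raw_tf.items()}

# Normalized tf: raw count divided by ND, the number of tokens in the document.
norm_tf = {name: [c / len(tokens[name]) for c in raw_tf[name]] for name in docs}

# idf(t) = ln(N / (nj + 1)) + 1, exactly as given in the question.
n_docs = len(docs)
doc_freq = [sum(binary[name][j] for name in docs) for j in range(len(vocab))]
idf = [math.log(n_docs / (df + 1)) + 1 for df in doc_freq]

# tf-idf uses the raw (not normalized) term frequency, per the question.
tfidf = {name: [c * w for c, w in zip(raw_tf[name], idf)] for name in docs}


def show(title, matrix, fmt):
    """Print a document-term matrix with one row per document."""
    print(title)
    print("    " + " ".join(f"{t:>8}" for t in vocab))
    for name, row in matrix.items():
        print(f"{name:<4}" + " ".join(f"{format(v, fmt):>8}" for v in row))
    print()


show("Binary", binary, "d")
show("Raw term frequency", raw_tf, "d")
show("Normalized term frequency", norm_tf, ".3f")
show("tf-idf", tfidf, ".3f")
```

Under this tokenization the vocabulary has nine terms: go, goodbye, hello, high, i, low, say, stop, you. Note that with the idf formula given in the question, a term that appears in all four documents (such as "say") has idf = ln(4/5) + 1 ≈ 0.777, which is below 1 but still positive. The queries Q1 and Q2 could be turned into vectors over the same vocabulary by passing them through `tokenize` in the same way.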
