Question: Imagine that you are building an online plagiarism checker, which allows teachers in the land of Edutopia to submit papers written by their students and

Imagine that you are building an online plagiarism checker, which allows teachers in the land of Edutopia to submit papers written by their students and check if any of those students have copied whole sections from a set, D, of documents written in the Edutopian language that you have collected from the Internet. You have at your disposal a parser, P, that can take any document, d, and separate it into a sequence of its n words in their given order (with duplicates included) in O(n) time. You also have a perfect hash function, h, that maps any Edutopian word to an integer in the range from 1 to 1,000,000, with no collisions, in constant time. It is considered an act of plagiarism if any student uses a sequence of m words (in their given order) from a document in D, where m is a parameter set by parliament. Describe a system whereby you can read in an Edutopian document, d, of n words, and test if it contains an act of plagiarism. Your system should process the set of documents in D in expected time proportional to their total length, which is done just once. Then, your system should be able to process any given document, d, of n words, in expected O(n+m) time (not O(nm) time!) to detect a possible act of plagiarism.

Step by Step Solution

3.32 Rating (161 Votes )

There are 3 Steps involved in it

1 Expert Approved Answer
Step: 1 Unlock

Use a hash table of size at least 2N where N is the total length of the doc... View full answer

blur-text-image
Question Has Been Solved by an Expert!

Get step-by-step solutions from verified subject matter experts

Step: 2 Unlock
Step: 3 Unlock

Students Have Also Explored These Related Data Structures Algorithms Questions!