Question
Implement in Java.
Imagine that you are building an online plagiarism checker, which allows teachers in the land of Edutopia to submit papers written by their students and check if any of those students have copied whole sections from a set, D, of documents written in the Edutopian language that you have collected from the Internet. You have at your disposal a parser, P, that can take any document, d, and separate it into a sequence of its n words in their given order (with duplicates included) in O(n) time. You also have a perfect hash function, h, that maps any Edutopian word to an integer in the range from 1 to 1,000,000, with no collisions, in constant time. It is considered an act of plagiarism if any student uses a sequence of m words (in their given order) from a document in D, where m is a parameter set by parliament. Describe a system whereby you can read in an Edutopian document, d, of n words, and test if it contains an act of plagiarism. Your system should process the set of documents in D in expected time proportional to their total length, which is done just once. Then, your system should be able to process any given document, d, of n words, in expected O(n + m) time (not O(nm) time!) to detect a possible act of plagiarism.
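One possible way to meet these bounds is a Rabin-Karp style rolling fingerprint over word IDs: every m-word window of every document in D is fingerprinted once and stored in a hash table, and the target document is then scanned with the same rolling fingerprint. The sketch below is only an illustration under stated assumptions, not a reference solution. The class name `PlagiarismIndex`, its methods, and the constants `BASE` and `MOD` are hypothetical; the parser P is approximated by case-sensitive whitespace splitting, and the perfect hash h is approximated by a dictionary that hands out 1, 2, 3, ... as new words appear.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: names are illustrative and do not come from the exercise.
class PlagiarismIndex {
    private static final long BASE = 1_000_003L;    // assumed base, just above the range of h
    private static final long MOD = 1_000_000_007L; // assumed prime modulus for fingerprints

    private final int m;
    private final long highPow; // BASE^(m-1) mod MOD, weight of the word leaving a window
    private final Map<String, Integer> wordIds = new HashMap<>(); // stand-in for h
    private final List<int[]> docs = new ArrayList<>();
    private final List<String> docIds = new ArrayList<>();
    // fingerprint -> list of {document index, window start}
    private final Map<Long, List<int[]>> windows = new HashMap<>();

    PlagiarismIndex(int m) {
        if (m < 1) throw new IllegalArgumentException("m must be at least 1");
        this.m = m;
        long p = 1;
        for (int i = 1; i < m; i++) p = (p * BASE) % MOD;
        this.highPow = p;
    }

    // Stand-in for the perfect hash h: assigns 1, 2, 3, ... on first sight.
    private int h(String word) {
        Integer id = wordIds.get(word);
        if (id == null) {
            id = wordIds.size() + 1;
            wordIds.put(word, id);
        }
        return id;
    }

    // Stand-in for the parser P: splits on whitespace, case-sensitive.
    private int[] parse(String text) {
        String trimmed = text.trim();
        if (trimmed.isEmpty()) return new int[0];
        String[] words = trimmed.split("\\s+");
        int[] ids = new int[words.length];
        for (int i = 0; i < words.length; i++) ids[i] = h(words[i]);
        return ids;
    }

    // Fingerprint of the first m words, computed in O(m).
    private long firstFingerprint(int[] ids) {
        long fp = 0;
        for (int i = 0; i < m; i++) fp = (fp * BASE + ids[i]) % MOD;
        return fp;
    }

    // Slides the window one word to the right in O(1).
    private long roll(long fp, int out, int in) {
        long without = (fp + MOD - (out * highPow) % MOD) % MOD;
        return (without * BASE + in) % MOD;
    }

    // Preprocessing: index every m-word window of one corpus document.
    void addDocument(String docId, String text) {
        int[] ids = parse(text);
        int docIndex = docs.size();
        docs.add(ids);
        docIds.add(docId);
        if (ids.length < m) return;
        long fp = firstFingerprint(ids);
        for (int start = 0; ; start++) {
            windows.computeIfAbsent(fp, k -> new ArrayList<>()).add(new int[] {docIndex, start});
            if (start + m >= ids.length) break;
            fp = roll(fp, ids[start], ids[start + m]);
        }
    }

    // Query: returns the ID of a corpus document sharing an m-word run, or null.
    String findSource(String text) {
        int[] ids = parse(text);
        if (ids.length < m) return null;
        long fp = firstFingerprint(ids);
        for (int start = 0; ; start++) {
            for (int[] cand : windows.getOrDefault(fp, Collections.emptyList())) {
                if (sameWindow(docs.get(cand[0]), cand[1], ids, start)) return docIds.get(cand[0]);
            }
            if (start + m >= ids.length) return null;
            fp = roll(fp, ids[start], ids[start + m]);
        }
    }

    // Word-by-word check that rules out fingerprint collisions.
    private boolean sameWindow(int[] a, int aStart, int[] b, int bStart) {
        for (int i = 0; i < m; i++) {
            if (a[aStart + i] != b[bStart + i]) return false;
        }
        return true;
    }
}
```

In this sketch, each corpus word costs a constant number of expected hash-table operations, so preprocessing is expected linear in the total length of D. A query spends O(m) on the first fingerprint, O(1) per later window, and O(m) on the word-by-word check of a true match, after which it stops. Spurious fingerprint collisions add extra checks, which is why the bound is an expected one. With m = 3 and document 134 from the example added, `findSource("The quick brown fox ate its breakfast slowly")` returns `"134"`.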
Create a Java implementation, CheckPlagiarism.java, that takes three command-line arguments: the corpus filename, the target filename, and the length of the match sequence.
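The three arguments might be validated before any file is read. The following is a minimal sketch, assuming the argument order given above; the class name `CheckArgs`, its fields, and the wording of the error messages are hypothetical.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper: holds the three command-line arguments after validation.
class CheckArgs {
    final Path corpus; // corpus filename (first argument)
    final Path target; // target filename (second argument)
    final int m;       // length of the match sequence (third argument)

    CheckArgs(Path corpus, Path target, int m) {
        this.corpus = corpus;
        this.target = target;
        this.m = m;
    }

    static CheckArgs parse(String[] args) {
        if (args.length != 3) {
            throw new IllegalArgumentException(
                "usage: java CheckPlagiarism <corpus file> <target file> <match length>");
        }
        int m;
        try {
            m = Integer.parseInt(args[2]);
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException("match length must be an integer: " + args[2]);
        }
        if (m < 1) {
            throw new IllegalArgumentException("match length must be at least 1: " + m);
        }
        return new CheckArgs(Paths.get(args[0]), Paths.get(args[1]), m);
    }
}
```

A `main` method in CheckPlagiarism.java could call `CheckArgs.parse(args)` first and report the exception message if the arguments are malformed.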
The corpus file has the format shown in this example (a document ID, a colon, then the document text in double quotes):
134:"The quick brown fox jumped over the lazy dog"
145: "Lorem ipsum dolor sit amet, consectetur adipiscing elit. Nunc pellentesque turpis lorem, at convallis massa euismod quis. Cras blandit rutrum lacus tempor suscipit. Vestibulum in sagittis sem. Vestibulum id gravida felis. Morbi venenatis interdum purus a tincidunt. Aenean vel maximus magna."
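The two sample entries differ in whether a space follows the colon, so a reader of the corpus file probably needs to tolerate both. The sketch below assumes one document per line, with the ID before the first colon and the text optionally wrapped in double quotes; the class name `CorpusLine` is hypothetical.

```java
// Hypothetical helper: one parsed entry of the corpus file.
class CorpusLine {
    final String id;
    final String text;

    CorpusLine(String id, String text) {
        this.id = id;
        this.text = text;
    }

    // Splits at the first colon, so colons inside the document text are preserved.
    static CorpusLine parse(String line) {
        int colon = line.indexOf(':');
        if (colon < 0) {
            throw new IllegalArgumentException("missing ':' in corpus line: " + line);
        }
        String id = line.substring(0, colon).trim();
        String text = line.substring(colon + 1).trim();
        // Strip one pair of surrounding double quotes, if present.
        if (text.length() >= 2 && text.startsWith("\"") && text.endsWith("\"")) {
            text = text.substring(1, text.length() - 1);
        }
        return new CorpusLine(id, text);
    }
}
```

Both `134:"..."` and `145: "..."` parse to an ID and the unquoted text under these assumptions.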
The target file contains the text to be checked.
For example:
The quick brown fox ate its breakfast slowly
The length is the minimum match length required to prove plagiarism. For example: 3
With the above example, your program should say "Plagiarized from 134". If the target were "The quick black fox ate its breakfast slowly" then it should print "Not Plagiarized". You may use standard implementations for string processing and hash tables.