
Question

(10 pts) Consider the problem of detecting similar papers in a history course. If the assignment was on a particular topic, then one would expect that many of the key terms related to the topic would appear in most of the essays, but perhaps not in the same order. Let the sequence similarity of two essays be the length of the longest sequence of words such that both essays contain the sequence of words in the same order.

For example, if one paper was "four score and seven years ago our fathers brought forth upon this continent, a new nation conceived in liberty, and dedicated to the proposition that all men are created equal." and another paper was "four years and seven days ago, fathers helped created a new nation with liberty and justice for all on the continent," one sequence I found that seems to have the most words in both essays in the right order is "four and seven ago fathers a new nation and all" (10 words). Replacing "all" with "the" gives another 10 words occurring in both sequences in order. Although "continent" and "years" also occur in both papers, they cannot be added to the sequence while preserving the sequence's order of occurrence.

For this problem you are to create a dynamic programming algorithm that takes two papers P and Q as lists/arrays of n and m words respectively (possibly with repeats) and determines a longest sequence of words that occur in both papers in the same order.

(a) (4 pts) First focus on the length of the longest sequence of words occurring in both papers in order. Derive a recurrence for this length by considering what can happen with the last words in the papers. What are the boundary conditions for your recurrence? Give a brief rationale for why the recurrence gives the right values.

(b) (3 pts) Give a bottom-up dynamic programming algorithm based on your recurrence that calculates the value of the optimal solution (i.e. the length of the longest sequence of words occurring in both lists in the same order). This algorithm should fill in a table.

(c) (1 pt) What is the running time of your dynamic programming algorithm? (Use asymptotic notation.)

(d) (2 pts) Describe how to output an actual longest sequence of words occurring in both lists in the same order. What is the (asymptotic) worst-case running time of your output method (assuming the table from part (b) has already been calculated)?
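
One standard way to approach parts (a) through (d) is the longest-common-subsequence recurrence. The following is a minimal sketch under that assumption, not a verified solution for the assignment. It assumes the papers have already been split into lists of words, and the names lcs_table and lcs_words are hypothetical. Let L[i][j] be the answer for the first i words of P and the first j words of Q. The boundary conditions are L[i][0] = L[0][j] = 0. If the i-th word of P equals the j-th word of Q, then L[i][j] = L[i-1][j-1] + 1; otherwise L[i][j] = max(L[i-1][j], L[i][j-1]), since at least one of the two last words cannot be part of the common sequence.

```python
def lcs_table(p, q):
    """Fill the (n+1) x (m+1) table of common-sequence lengths bottom-up."""
    n, m = len(p), len(q)
    # Row 0 and column 0 stay 0: an empty prefix shares no words.
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if p[i - 1] == q[j - 1]:
                # Last words match: extend the best sequence of both shorter prefixes.
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                # Last words differ: drop the last word of one paper or the other.
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table


def lcs_words(p, q, table):
    """Walk back from table[n][m] to recover one longest common sequence."""
    i, j = len(p), len(q)
    words = []
    while i > 0 and j > 0:
        if p[i - 1] == q[j - 1]:
            words.append(p[i - 1])
            i -= 1
            j -= 1
        elif table[i - 1][j] >= table[i][j - 1]:
            i -= 1
        else:
            j -= 1
    words.reverse()  # words were collected from last to first
    return words


if __name__ == "__main__":
    p = "four score and seven".split()
    q = "four and seven days".split()
    t = lcs_table(p, q)
    print(t[len(p)][len(q)], lcs_words(p, q, t))
```

Under these assumptions, filling the table takes O(nm) time because each of the (n+1)(m+1) cells is computed in constant time. The walk back decreases i or j at every step, so it takes O(n + m) time once the table exists. Ties in the walk back are broken arbitrarily, which is why more than one longest sequence can exist for the same pair of papers.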


