Question
(10 pts) Consider the problem of detecting similar papers in a history course. If the assignment was on a particular topic, then one would expect that many of the key terms related to the topic would appear in most of the essays, but perhaps not in the same order. Let the sequence similarity of two essays be the length of the longest sequence of words such that both essays contain the sequence of words in the same order.

For example, if one paper was "four score and seven years ago our fathers brought forth upon this continent, a new nation conceived in liberty, and dedicated to the proposition that all men are created equal." and another paper was "four years and seven days ago, fathers helped created a new nation with liberty and justice for all on the continent," one sequence I found that seems to have the most words in both essays in the right order is: "four and seven ago fathers a new nation and all" (10 words). Replacing "all" with "the" gives another 10 words occurring in both sequences in order. Although "continent" and "years" also occur in both papers, they cannot be added to the sequence while preserving the sequence's order of occurrence.

For this problem you are to create a dynamic programming algorithm that takes two papers P and Q as lists/arrays of n and m words respectively (possibly with repeats) and determines a longest sequence of words that occur in both papers in the same order.

(a) (4 pts) First focus on the length of the longest sequence of words occurring in both papers in order. Derive a recurrence for this length by considering what can happen with the last words in the papers. What are the boundary conditions for your recurrence? Give a brief rationale for why the recurrence gives the right values.
(b) (3 pts) Give a bottom-up dynamic programming algorithm based on your recurrence that calculates the value of the optimal solution (i.e. the length of the longest sequence of words occurring in both lists in the same order). This algorithm should fill in a table.
(c) (1 pt) What is the running time of your dynamic programming algorithm? (Use asymptotic notation.)
(d) (2 pts) Describe how to output an actual longest sequence of words occurring in both lists in the same order. What is the (asymptotic) worst-case running time of your output method (assuming the table from part (b) has already been calculated)?
Step by Step Solution
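The page does not show the expert's steps, so what follows is only a minimal sketch of one standard approach: treating the task as a longest common subsequence (LCS) problem over words. It assumes the essays are normalized by lowercasing and stripping punctuation. The names `tokenize`, `lcs_table`, and `lcs_words` are hypothetical, chosen for this illustration. The recurrence compares the last words of the two prefixes: if they match, that word extends an optimal sequence for the shorter prefixes; otherwise at least one of the two last words is unused, so the larger of the two neighbouring values is taken. An empty prefix gives length 0.

```python
import re


def tokenize(text):
    """Lowercase and keep alphabetic words only (an assumed normalization)."""
    return re.findall(r"[a-z]+", text.lower())


def lcs_table(p, q):
    """Fill L bottom-up, where L[i][j] is the LCS length of p[:i] and q[:j].

    Boundary: L[i][0] = L[0][j] = 0.
    Recurrence: L[i][j] = L[i-1][j-1] + 1            if p[i-1] == q[j-1]
                          max(L[i-1][j], L[i][j-1])  otherwise
    Filling the (n+1) x (m+1) table takes O(n*m) time.
    """
    n, m = len(p), len(q)
    L = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if p[i - 1] == q[j - 1]:
                L[i][j] = L[i - 1][j - 1] + 1
            else:
                L[i][j] = max(L[i - 1][j], L[i][j - 1])
    return L


def lcs_words(p, q, L):
    """Walk back from L[n][m] to recover one longest sequence.

    Each step decreases i or j (or both), so this takes O(n + m) time.
    """
    i, j = len(p), len(q)
    out = []
    while i > 0 and j > 0:
        if p[i - 1] == q[j - 1]:
            out.append(p[i - 1])  # matched word belongs to the sequence
            i -= 1
            j -= 1
        elif L[i - 1][j] >= L[i][j - 1]:
            i -= 1  # last word of p[:i] is not needed
        else:
            j -= 1  # last word of q[:j] is not needed
    out.reverse()
    return out


P = tokenize(
    "four score and seven years ago our fathers brought forth upon this "
    "continent, a new nation conceived in liberty, and dedicated to the "
    "proposition that all men are created equal."
)
Q = tokenize(
    "four years and seven days ago, fathers helped created a new nation "
    "with liberty and justice for all on the continent,"
)
table = lcs_table(P, Q)
seq = lcs_words(P, Q, table)
print(table[len(P)][len(Q)], seq)
```

Under this normalization the sketch reports a length of 11 for the two example essays, one more than the 10-word sequence quoted in the question, because "liberty" can also be kept between "nation" and the second "and". Which of the tied longest sequences is returned (for example, ending in "all" or in "the") depends on the tie-breaking rule in the walk back.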