
Question


1. (45 pts) Recall that the string alignment problem takes as input two strings x and y, composed of symbols x_i, y_j ∈ Σ, for a fixed symbol set Σ, and returns a minimal-cost set of edit operations for transforming the string x into the string y. Let x contain nx symbols, let y contain ny symbols, and let the set of edit operations be those defined in the lecture notes (substitution, insertion, deletion, and transposition). Let the cost of indel be 1, the cost of swap be 10 (plus the cost of the two sub ops), and the cost of sub be 10, except when x_i = y_j, which is a "no-op" and has cost 0.

In this problem, we will implement and apply three functions.

(i) alignStrings(x,y) takes as input two ASCII strings x and y, and runs a dynamic programming algorithm to return the cost matrix S, which contains the optimal costs for all the subproblems for aligning these two strings.

    alignStrings(x,y):                  // x,y are ASCII strings
        S = table of length nx by ny    // for memoizing the subproblem costs
        initialize S                    // fill in the base cases
        for i = 1 to nx
            for j = 1 to ny
                S[i,j] = cost(i,j)      // optimal cost for x[0..i] and y[0..j]
        return S

(ii) extractAlignment(S,x,y) takes as input an optimal cost matrix S and strings x, y, and returns a vector a that represents an optimal sequence of edit operations to convert x into y. This optimal sequence is recovered by finding a path on the implicit DAG of decisions made by alignStrings to obtain the value S[nx, ny], starting from S[0, 0].

    extractAlignment(S,x,y):            // S is an optimal cost matrix from alignStrings
        initialize a                    // empty vector of edit operations
        [i,j] = [nx, ny]                // initialize the search for a path to S[0,0]
        while i > 0 or j > 0
            a[i] = determineOptimalOp(S,i,j,x,y)   // what was an optimal choice?
            [i,j] = updateIndices(S,i,j,a)         // move to next position
        return a

When storing the sequence of edit operations in a, use a special symbol to denote no-ops.

(iii) commonSubstrings(x,L,a) takes as input the ASCII string x, an integer 1 ≤ L ≤ nx, and an optimal sequence a of edits to x, which would transform x into y. This function returns each of the substrings of length at least L in x that aligns exactly, via a run of no-ops, to a substring in y.

(c) (15 pts extra credit) Describe an algorithm for counting the number of optimal alignments, given an optimal cost matrix S. Prove that your algorithm is correct, and give its asymptotic running time. Hint: Convert this problem into a form that allows us to apply an algorithm we've already seen.
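The three functions described in the question can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the course's reference solution: the snake_case names, the "-" no-op marker, the operation labels, and the helper candidates are all hypothetical. A swap is assumed to transpose the last two symbols of the x prefix and then pay a sub cost (possibly 0) for each against y. Ties are broken by taking the first optimal choice, and the edit vector is built by appending and reversing rather than by indexing a[i].

```python
INDEL, SUB, SWAP = 1, 10, 10   # costs as stated in the problem
NOOP = "-"                     # assumed marker for a no-op in the edit vector

def sub_cost(a, b):
    return 0 if a == b else SUB

def candidates(S, x, y, i, j):
    """Yield (cost, op, di, dj) for every edit that can end at cell (i, j)."""
    if i > 0:
        yield S[i - 1][j] + INDEL, "del", 1, 0
    if j > 0:
        yield S[i][j - 1] + INDEL, "ins", 0, 1
    if i > 0 and j > 0:
        c = sub_cost(x[i - 1], y[j - 1])
        yield S[i - 1][j - 1] + c, (NOOP if c == 0 else "sub"), 1, 1
    if i > 1 and j > 1:
        # Assumed reading of "swap": transpose x[i-2], x[i-1], then sub each.
        c = SWAP + sub_cost(x[i - 1], y[j - 2]) + sub_cost(x[i - 2], y[j - 1])
        yield S[i - 2][j - 2] + c, "swap", 2, 2

def align_strings(x, y):
    """Fill the (nx+1) by (ny+1) table of optimal prefix-alignment costs."""
    nx, ny = len(x), len(y)
    S = [[0] * (ny + 1) for _ in range(nx + 1)]
    for i in range(nx + 1):
        for j in range(ny + 1):
            if i or j:  # S[0][0] = 0 is the only cell with no predecessor
                S[i][j] = min(c for c, *_ in candidates(S, x, y, i, j))
    return S

def extract_alignment(S, x, y):
    """Walk back from S[nx][ny] to S[0][0], recording one optimal op per step."""
    a, i, j = [], len(x), len(y)
    while i > 0 or j > 0:
        # Any candidate whose cost reproduces S[i][j] is an optimal choice.
        op, di, dj = next((op, di, dj)
                          for c, op, di, dj in candidates(S, x, y, i, j)
                          if c == S[i][j])
        a.append(op)
        i, j = i - di, j - dj
    return a[::-1]  # ops were collected back to front

def common_substrings(x, L, a):
    """Return substrings of x, of length >= L, covered by a run of no-ops."""
    consumed = {NOOP: 1, "sub": 1, "del": 1, "ins": 0, "swap": 2}
    out, i, start = [], 0, None
    for op in a + ["end"]:  # sentinel closes a trailing run
        if op == NOOP:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= L:
                out.append(x[start:i])
            start = None
        i += consumed.get(op, 0)  # advance by the symbols of x this op uses
    return out
```

With these costs a substitution (10) is always more expensive than a deletion followed by an insertion (2), so the sketch will in practice choose indels and no-ops only; the sub and swap branches are kept so the recurrence matches the operation set named in the question.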

