Question
1. (45 pts) Recall that the string alignment problem takes as input two strings x and y, composed of symbols xi, yj for a fixed symbol set, and returns a minimal-cost set of edit operations for transforming the string x into the string y. Let x contain nx symbols, let y contain ny symbols, and let the set of edit operations be those defined in the lecture notes (substitution, insertion, deletion, and transposition). Let the cost of indel be 1, the cost of swap be 10 (plus the cost of the two sub ops), and the cost of sub be 10, except when xi = yj, which is a "no-op" and has cost 0. In this problem, we will implement and apply three functions.

(i) alignStrings(x,y) takes as input two ASCII strings x and y, and runs a dynamic programming algorithm to return the cost matrix S, which contains the optimal costs for all the subproblems for aligning these two strings.

    alignStrings(x,y):                   // x,y are ASCII strings
        S = table of length nx by ny     // for memoizing the subproblem costs
        initialize S                     // fill in the base cases
        for i = 1 to nx
            for j = 1 to ny
                S[i,j] = cost(i,j)       // optimal cost for x[0..i] and y[0..j]
        return S

(ii) extractAlignment(S,x,y) takes as input an optimal cost matrix S and strings x, y, and returns a vector a that represents an optimal sequence of edit operations to convert x into y. This optimal sequence is recovered by finding a path on the implicit DAG of decisions made by alignStrings to obtain the value S[nx,ny], starting from S[0,0].

    extractAlignment(S,x,y):             // S is an optimal cost matrix from alignStrings
        initialize a                     // empty vector of edit operations
        [i,j] = [nx,ny]                  // initialize the search for a path to S[0,0]
        while i > 0 or j > 0
            a[i] = determineOptimalOp(S,i,j,x,y)   // what was an optimal choice?
            [i,j] = updateIndices(S,i,j,a)         // move to next position
        return a

When storing the sequence of edit operations in a, use a special symbol to denote no-ops.

(iii) commonSubstrings(x,L,a) takes as input the ASCII string x, an integer 1 ≤ L ≤ nx, and an optimal sequence a of edits to x, which would transform x into y. This function returns each of the substrings of length at least L in x that aligns exactly, via a run of no-ops, to a substring in y.

(c) (15 pts extra credit) Describe an algorithm for counting the number of optimal alignments, given an optimal cost matrix S. Prove that your algorithm is correct, and give its asymptotic running time. Hint: Convert this problem into a form that allows us to apply an algorithm we've already seen.
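The three functions, plus the path-counting idea behind part (c), might be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the course's reference solution. The no-op cost of 0, the "-" no-op symbol, the operation names, the helper `candidates`, the function `countAlignments`, the reading of a swap as "transpose two adjacent symbols of x, then pay the two substitution costs", and the first-match tie-breaking rule are all assumptions made for illustration. The table is sized (nx+1) by (ny+1) so that row 0 and column 0 hold the base cases.

```python
from functools import lru_cache

INDEL, SUB, SWAP = 1, 10, 10   # costs from the problem statement
NOOP = "-"                     # assumed special symbol for a no-op

def sub_cost(a, b):
    return 0 if a == b else SUB  # assumption: a no-op costs 0

def candidates(S, i, j, x, y):
    """Yield (cost, op, di, dj) for every edit that could end at cell (i, j)."""
    if i > 0 and j > 0:
        op = NOOP if x[i - 1] == y[j - 1] else "sub"
        yield S[i - 1][j - 1] + sub_cost(x[i - 1], y[j - 1]), op, 1, 1
    if i > 0:
        yield S[i - 1][j] + INDEL, "del", 1, 0
    if j > 0:
        yield S[i][j - 1] + INDEL, "ins", 0, 1
    if i > 1 and j > 1:  # assumed swap model: transpose, then two sub ops
        c = SWAP + sub_cost(x[i - 1], y[j - 2]) + sub_cost(x[i - 2], y[j - 1])
        yield S[i - 2][j - 2] + c, "swap", 2, 2

def alignStrings(x, y):
    nx, ny = len(x), len(y)
    S = [[0] * (ny + 1) for _ in range(nx + 1)]
    for i in range(nx + 1):
        for j in range(ny + 1):
            if i or j:  # S[0][0] = 0 is the only true base case
                S[i][j] = min(c for c, *_ in candidates(S, i, j, x, y))
    return S

def extractAlignment(S, x, y):
    a, i, j = [], len(x), len(y)
    while i > 0 or j > 0:
        # Take the first candidate that reproduces S[i][j] (arbitrary tie-break).
        op, di, dj = next((op, di, dj)
                          for c, op, di, dj in candidates(S, i, j, x, y)
                          if c == S[i][j])
        a.append(op)
        i, j = i - di, j - dj
    a.reverse()  # ops were collected from the end of the strings backwards
    return a

def commonSubstrings(x, L, a):
    out, i, run = [], 0, 0
    for op in a + ["end"]:  # sentinel flushes a run that reaches the end of a
        if op == NOOP:
            run += 1
            i += 1
            continue
        if run >= L:
            out.append(x[i - run:i])
        run = 0
        i += {"sub": 1, "del": 1, "swap": 2}.get(op, 0)  # symbols of x consumed
    return out

def countAlignments(S, x, y):
    """Count optimal paths from (nx, ny) to (0, 0) in the decision DAG."""
    @lru_cache(maxsize=None)
    def paths(i, j):
        if i == 0 and j == 0:
            return 1
        return sum(paths(i - di, j - dj)
                   for c, op, di, dj in candidates(S, i, j, x, y)
                   if c == S[i][j])
    return paths(len(x), len(y))

x, y = "abcdef", "abxdef"
S = alignStrings(x, y)
a = extractAlignment(S, x, y)
print(S[-1][-1], a, commonSubstrings(x, 2, a), countAlignments(S, x, y))
```

Because a substitution costs 10 while a deletion followed by an insertion costs 2, the optimal alignments under these costs never use sub or swap on mismatched symbols. The sketch still evaluates all four operations so that the costs can be changed. The recursive counter is only suitable for short strings; an iterative fill over the table avoids Python's recursion limit.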