Question
1. (20%) When performing pattern matching in DNA, a useful type of imprecision is the ability to specify residue classes. As one example, the temperature at which a DNA molecule "melts" (i.e., the two strands of the double helix separate) is determined not only by its length (longer strands melt at higher temperatures) but also by the proportion of "weak" (A or T) vs. "strong" (C or G) bases that it contains. Strands of a given length with a higher proportion of strong residues melt at higher temperatures.

In this problem, we want to search a DNA database for matches to a pattern P in which each character is either a single DNA base or one of the symbols W or S, denoting the weak and strong classes respectively. A symbol W or S in the pattern matches either element of its class. For example, the pattern AWG would match either of the text strings AAG or ATG, while the pattern SSSS would match any of the 16 possible 4-mers composed entirely of C's and G's.

(a) We can easily extend the definition of sp-values in the KMP algorithm to include prefix-suffix matches in which one or both substrings contain residue classes. Show, however, that if the pattern contains a residue class at even a single position, the pattern-shifting rule used by the KMP algorithm can yield incorrect answers. Explain why the rule fails.

(b) Describe how to extend the basic KMP algorithm to work correctly using a pattern with a residue class at exactly one position. Hint: make your sp-values conditional on which character matched the S or W. Argue that your revised method is still correct. Don't worry about explaining how to compute sp-values in this revised algorithm, but do show that, once you have these values, your revised algorithm preserves KMP's property of performing at most 2|T| comparisons to the text T.

(c) How much space (asymptotically) do you need to store sp-values for a pattern with residue classes in k positions? Justify your answer.
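The matching rule described in the question can be sketched in a few lines of Python. This is only an illustration of the semantics, not part of the assignment: the names `CLASSES`, `matches`, and `find_all` are hypothetical, and the search is the naive quadratic scan rather than KMP.

```python
# Each pattern symbol maps to the set of text bases it accepts.
# W (weak) accepts A or T; S (strong) accepts C or G.
CLASSES = {
    "A": {"A"}, "C": {"C"}, "G": {"G"}, "T": {"T"},
    "W": {"A", "T"},
    "S": {"C", "G"},
}


def matches(pattern, text):
    """Return True if text has the same length as pattern and every
    text base belongs to the class of the pattern symbol at that position."""
    if len(pattern) != len(text):
        return False
    return all(base in CLASSES[sym] for sym, base in zip(pattern, text))


def find_all(pattern, text):
    """Naive search: test every alignment of pattern against text."""
    m = len(pattern)
    return [i for i in range(len(text) - m + 1)
            if matches(pattern, text[i:i + m])]


print(matches("AWG", "AAG"), matches("AWG", "ATG"), matches("AWG", "ACG"))
print(find_all("AWG", "TTAAGCATGA"))
```

With these definitions, `AWG` accepts `AAG` and `ATG` but rejects `ACG`, which agrees with the example in the question.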
Step by Step Solution
Step: 1
(a) When using the Knuth-Morris-Pratt (KMP) algorithm, the pattern-shifting rule relies on the concept of the longest proper suffix of the substring that is a...
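One way to see the kind of failure part (a) asks about is a small experiment. The sketch below is an assumption-laden illustration, not the expert's solution: it assumes sp-values are extended by treating two pattern symbols as matching whenever their classes overlap, and the names `compatible`, `failure`, `kmp_search`, and `naive_search` are hypothetical.

```python
# Pattern symbols and text bases share one table; W = {A, T}, S = {C, G}.
CLASSES = {
    "A": {"A"}, "C": {"C"}, "G": {"G"}, "T": {"T"},
    "W": {"A", "T"},
    "S": {"C", "G"},
}


def compatible(x, y):
    """Two symbols are compatible if their classes share at least one base."""
    return bool(CLASSES[x] & CLASSES[y])


def failure(pattern):
    """Standard KMP failure table, with equality replaced by compatible().
    This is the naive extension assumed for the illustration."""
    fail = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        while k > 0 and not compatible(pattern[i], pattern[k]):
            k = fail[k - 1]
        if compatible(pattern[i], pattern[k]):
            k += 1
        fail[i] = k
    return fail


def kmp_search(pattern, text):
    """Unmodified KMP scan using the naively extended failure table."""
    fail = failure(pattern)
    hits, k = [], 0
    for i, base in enumerate(text):
        while k > 0 and not compatible(pattern[k], base):
            k = fail[k - 1]
        if compatible(pattern[k], base):
            k += 1
        if k == len(pattern):
            hits.append(i - k + 1)
            k = fail[k - 1]  # shift without re-checking the text
    return hits


def naive_search(pattern, text):
    """Ground truth: check every alignment explicitly."""
    m = len(pattern)
    return [i for i in range(len(text) - m + 1)
            if all(compatible(p, t) for p, t in zip(pattern, text[i:i + m]))]


print(naive_search("TW", "TAA"))  # [0]
print(kmp_search("TW", "TAA"))    # [0, 1]
```

For the pattern `TW` and the text `TAA`, the table records that the suffix `W` is compatible with the prefix `T`. After the match at position 0, the shift assumes the text base under `W` also matches `T`, but that base was `A`. The scan therefore reports a spurious occurrence at position 1. The underlying issue is that compatibility is not transitive: `A` matches `W`, and `W` is compatible with `T`, yet `A` does not match `T`.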