1. (20%) When performing pattern matching in DNA, a useful type of imprecision is the ability...
Question:
1. (20%) When performing pattern matching in DNA, a useful type of imprecision is the ability to specify residue classes. As one example, the temperature at which a DNA molecule "melts" (i.e., the two strands of the double helix separate) is determined not only by its length (longer strands melt at higher temperatures) but also by the proportion of "weak" (A or T) vs. "strong" (C or G) bases that it contains. Strands of a given length with a higher proportion of strong residues melt at higher temperatures.

In this problem, we want to search a DNA database for matches to a pattern P in which each character is either a single DNA base or one of the symbols W or S, denoting the weak and strong classes respectively. A symbol W or S in the pattern matches either element of its class. For example, the pattern AWG would match either of the text strings AAG or ATG, while the pattern SSSS would match any of the 16 possible 4-mers composed entirely of C's and G's.

(a) We can easily extend the definition of sp_i values in the KMP algorithm to include prefix-suffix matches in which one or both substrings contain residue classes. Show, however, that if the pattern contains a residue class at even a single position, the pattern-shifting rule used by the KMP algorithm can yield incorrect answers. Explain why the rule fails.

(b) Describe how to extend the basic KMP algorithm to work correctly using a pattern with a residue class at exactly one position. Hint: make your sp-values conditional on which character matched the S or W. Argue that your revised method is still correct. Don't worry about explaining how to compute sp-values in this revised algorithm, but do show that, once you have these values, your revised algorithm preserves KMP's property of performing at most 2|T| comparisons to the text T.

(c) How much space (asymptotically) do you need to store sp-values for a pattern with residue classes in k positions? Justify your answer.
Expert Answer:
(a) When using the Knuth-Morris-Pratt (KMP) algorithm, the pattern-shifting rule relies on the concept of the longest proper suffix of the substring that is a...
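To make the failure in part (a) concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the answer's own method: all function names are hypothetical, the sp-values are computed by brute force for clarity, and two possible ways of extending the prefix-suffix test to residue classes ("loose" and "strict") are assumed because the question does not fix one.

```python
# Bases accepted by each pattern symbol (W = weak, S = strong).
CLASSES = {"A": {"A"}, "C": {"C"}, "G": {"G"}, "T": {"T"},
           "W": {"A", "T"}, "S": {"C", "G"}}

def matches(p, t):
    """True if pattern symbol p accepts text base t."""
    return t in CLASSES[p]

def naive_search(pattern, text):
    """Reference answer: test every alignment directly."""
    n = len(pattern)
    return [i for i in range(len(text) - n + 1)
            if all(matches(p, t) for p, t in zip(pattern, text[i:i + n]))]

def loose(prefix_sym, suffix_sym):
    """Assumed rule 1: the two symbols share at least one base."""
    return bool(CLASSES[prefix_sym] & CLASSES[suffix_sym])

def strict(prefix_sym, suffix_sym):
    """Assumed rule 2: the prefix symbol accepts every base the suffix symbol accepts."""
    return CLASSES[suffix_sym] <= CLASSES[prefix_sym]

def sp_values(pattern, compatible):
    """Brute force: sp[i-1] is the longest proper suffix of pattern[:i]
    that 'matches' a prefix of the pattern under the given rule."""
    sp = []
    for i in range(1, len(pattern) + 1):
        best = 0
        for k in range(i - 1, 0, -1):  # longest candidate first
            if all(compatible(pattern[x], pattern[i - k + x]) for x in range(k)):
                best = k
                break
        sp.append(best)
    return sp

def kmp_search(pattern, text, sp):
    """Unmodified KMP scan; after a shift it never rechecks the first sp characters."""
    hits, j = [], 0
    for i, t in enumerate(text):
        while j > 0 and not matches(pattern[j], t):
            j = sp[j - 1]
        if matches(pattern[j], t):
            j += 1
        if j == len(pattern):
            hits.append(i - j + 1)
            j = sp[j - 1]
    return hits
```

With the pattern AWC, the loose rule gives sp-values [0, 1, 0], and on the text ATTC the scan reports a match at position 1 (TTC) that the naive search rejects: the shift assumed the prefix A matched a text T that had only been checked against W. The strict rule gives [0, 0, 0], and on the text AAAC the scan finds nothing, although AAC at position 1 is a real match: the shift skipped an alignment that was valid for the base actually seen. In both cases the sp-value is fixed in advance, while the correct shift depends on which base matched the W.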