3.8 Exercises 1. Given three DNA sequences S1, S2, and S3 of total length n...
Question:
3.8 Exercises

1. Given three DNA sequences S1, S2, and S3 of total length n.
(a) Suppose S1 = acgatca, S2 = gattact, S3 = aagatgt. What is the longest common substring of S1, S2, and S3?
(b) Describe an efficient algorithm that computes the length of the longest common substring. What is the time complexity?
(c) It is possible that the longest common substring for S1, S2, and S3 is not unique. Can you describe an efficient algorithm to report the number of possible longest common substrings of S1, S2, and S3? What is the time complexity?

2. Please give the generalized suffix tree for S1 = ACGT$ and S2 = TGCA#.

3. Given a DNA sequence S and a pattern P, can you describe an O(|P|² + |S|) time algorithm to find all occurrences of P in S with Hamming distance ≤ 1? (A Hamming distance of ≤ 1 means that we should look for a substring M in S such that |M| = |P| and there is at most one mismatch between M and P.)

4. Consider the string S = ACGTACGT$.
(a) What is the suffix array for S?
(b) Report the values of (1) LCP(k, k + 1) for k = 1, 2, ..., 8 and (2) LCP(k, k + 4) for k = 1, 2, 3, 4.

Problem 5 (10 pts) Programming project: Write a program to compute the suffix array for a given input string (you may use any programming language). Your program should read in a FASTA file (like HW2). You can assume that the file contains just a single DNA sequence, e.g.
>seq1
ACTGGGAAATCGAAGACCCGG
Remember to add a '$' to the end of the string. The output should just be the suffix array table, e.g.
SA[1] = x
SA[2] = y
etc.

Bonus question (1 pt): Implement a binary-search-based pattern matching algorithm in the above program that searches the suffix array for a given pattern.
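For Problem 5 and the bonus question, one possible starting point is sketched below in Python. It is a minimal illustration, not the intended solution: the function names `suffix_array`, `format_table`, and `find_occurrences` are hypothetical, the construction simply sorts all suffixes (roughly O(n² log n) in the worst case, not a linear-time method), and FASTA parsing is left out. It relies on '$' sorting before the letters A, C, G, T in ASCII order.

```python
def suffix_array(text):
    """Return the suffix array of text as 0-based start positions.

    Naive construction: sort suffix start positions by the suffix itself.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def format_table(sa):
    """Format the suffix array with 1-based indices and positions."""
    return [f"SA[{rank}] = {pos + 1}" for rank, pos in enumerate(sa, start=1)]


def find_occurrences(text, sa, pattern):
    """Return sorted 0-based start positions of pattern, via binary search on sa."""
    m = len(pattern)

    # Lower bound: first suffix whose first m characters are >= pattern.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo

    # Upper bound: first suffix whose first m characters are > pattern.
    hi = len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid

    return sorted(sa[start:lo])


if __name__ == "__main__":
    sequence = "ACTGGGAAATCGAAGACCCGG" + "$"  # sentinel appended as the problem asks
    sa = suffix_array(sequence)
    print("\n".join(format_table(sa)))
    print(find_occurrences(sequence, sa, "GAA"))
```

Each binary search step compares at most |P| characters, so a lookup costs O(|P| log |S|) character comparisons once the suffix array is built.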
Expert Answer:
Let's address each of the exercises. (a) Given S1 = acgatca, S2 = gattact, and S3 = aagatgt, we need to find the longest common substring among them. In this case t...
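A small brute-force check for exercise 1(a) and 1(c) can be sketched in Python as follows. This is only a verification aid under stated assumptions: the function name `longest_common_substrings` is hypothetical, and the approach enumerates substrings of the shortest input from longest to shortest, which is far slower than the suffix-tree-based algorithm that parts (b) and (c) ask for.

```python
def longest_common_substrings(strings):
    """Return (length, sorted list of distinct longest common substrings).

    Brute force: try every substring of the shortest string, longest first,
    and keep those that occur in all input strings.
    """
    if not strings:
        return 0, []
    shortest = min(strings, key=len)
    for length in range(len(shortest), 0, -1):
        # All distinct substrings of the shortest string with this length.
        candidates = {
            shortest[i:i + length]
            for i in range(len(shortest) - length + 1)
        }
        common = sorted(c for c in candidates if all(c in s for s in strings))
        if common:
            return length, common
    return 0, []


if __name__ == "__main__":
    length, substrings = longest_common_substrings(
        ["acgatca", "gattact", "aagatgt"]
    )
    print(length, substrings)
```

The length of the returned list gives the count requested in part (c) for small inputs, which is useful for checking an efficient implementation against known cases.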