Question

DNA sequence alignment

Recall the Edit Distance (Sequence Alignment) problem: given two strings over the same alphabet, together with mismatch and gap penalties, find an alignment of minimal cost. One of the most common uses of the minimum edit distance algorithm is in computational biology. DNA sequences are composed of four nucleotides, denoted by the letters A, C, T, G. Mutation over the course of evolution changes the sequences by deleting, inserting, or substituting nucleotides. The smaller the edit distance between two sequences, the smaller the evolutionary distance between them. Thus, biological sequences are often aligned to minimize the edit distance between them.

(a) Suppose the costs of mismatches and gaps are not the same for all the letters in the DNA sequences (since some mutations are more common than others). The following table states the cost of each operation for each letter:

         A    C    T    G    -
    A    0   .1   .1   .2   .1
    C   .2    0   .2   .3   .1
    T   .2   .1    0   .1   .2
    G   .1   .2   .2    0   .1
    -   .2   .3   .1

In the substitution table, the entry in row A, column C, is the penalty for an A to C mismatch (but not vice versa), and the entry in row A, column -, is the cost of aligning A atop a gap. Find the minimum edit distance AND the optimal alignment between the following sequences (show the matrix of your calculations):

    Sequence 1 (top):    G A T T A C A
    Sequence 2 (bottom): A T T A A C

(b) Extra credit (2 points): Suppose there are occasional errors in sequencing, and some of the letters in the DNA sequence are represented by a '?' as a result. The '?' character can be matched to any letter in the alignment but not to a gap.

i. Modify the Sequence Alignment algorithm to account for the '?' character. Give the recurrence relation and the polynomial dynamic programming algorithm.

ii. Assume the mismatch and gap costs are as in the previous question. Find the minimum edit distance AND the optimal alignment between the following sequences (show the matrix of your calculations):

    Sequence 1: G A T ? T A C A
    Sequence 2: A T T A C ?
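As a rough illustration of the dynamic program the question asks for, here is a minimal Python sketch. It is not the page's solution. The function name `align` and its cost-function parameters (`sub_cost`, `top_over_gap`, `gap_over_bottom`) are hypothetical names chosen for this example. It assumes that a '?' pairs with any letter at zero cost and can never face a gap (modelled as infinite cost); the question does not state the cost of a '?' match, so that is an interpretation. The unit costs in the usage lines are placeholders, not the values from the table.

```python
import math

GAP, WILDCARD = "-", "?"

def align(top, bottom, sub_cost, top_over_gap, gap_over_bottom):
    """Return (cost, aligned_top, aligned_bottom) for a minimum-cost alignment.

    sub_cost(a, b): cost of top letter a aligned over bottom letter b.
    top_over_gap(a): cost of top letter a aligned atop a gap.
    gap_over_bottom(b): cost of a gap aligned atop bottom letter b.
    """
    def pair(a, b):
        if WILDCARD in (a, b):
            return 0.0  # assumption: '?' matches any letter at no cost
        return sub_cost(a, b)

    def gap(letter, cost_fn):
        # '?' may not be aligned with a gap, so that move costs infinity
        return math.inf if letter == WILDCARD else cost_fn(letter)

    n, m = len(top), len(bottom)
    # D[i][j] = minimum cost of aligning top[:i] with bottom[:j]
    D = [[0.0] * (m + 1) for _ in range(n + 1)]
    back = [[None] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        D[i][0] = D[i - 1][0] + gap(top[i - 1], top_over_gap)
        back[i][0] = "up"
    for j in range(1, m + 1):
        D[0][j] = D[0][j - 1] + gap(bottom[j - 1], gap_over_bottom)
        back[0][j] = "left"
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            choices = [
                (D[i - 1][j - 1] + pair(top[i - 1], bottom[j - 1]), "diag"),
                (D[i - 1][j] + gap(top[i - 1], top_over_gap), "up"),
                (D[i][j - 1] + gap(bottom[j - 1], gap_over_bottom), "left"),
            ]
            D[i][j], back[i][j] = min(choices)

    # Follow the stored moves from the bottom-right cell back to the origin.
    a_top, a_bot = [], []
    i, j = n, m
    while i > 0 or j > 0:
        move = back[i][j]
        if move == "diag":
            a_top.append(top[i - 1]); a_bot.append(bottom[j - 1])
            i, j = i - 1, j - 1
        elif move == "up":
            a_top.append(top[i - 1]); a_bot.append(GAP)
            i -= 1
        else:
            a_top.append(GAP); a_bot.append(bottom[j - 1])
            j -= 1
    return D[n][m], "".join(reversed(a_top)), "".join(reversed(a_bot))

# Hypothetical unit costs, for illustration only (not the table's values).
def unit_sub(a, b):
    return 0.0 if a == b else 1.0

def unit_gap(letter):
    return 1.0

cost, a, b = align("GATTACA", "ATTAAC", unit_sub, unit_gap, unit_gap)
print(cost)
print(a)
print(b)
```

To work part (a), one would presumably replace `unit_sub`, `unit_gap` and the two gap arguments with lookups into the asymmetric cost table, keeping the row letter as the top sequence's letter. Ties between equally cheap moves are broken arbitrarily here, so other optimal alignments may exist.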

