Answered step by step
Verified Expert Solution
Link Copied!

Question

1 Approved Answer

Python 3 Only Problem 2. (Recovering the Alignment) Now that we know how to compute the edit distance between two strings, we next want to

Python 3 Only

Problem 2. (Recovering the Alignment) Now that we know how to compute the edit distance between two strings, we next want to recover the optimal alignment itself. The key idea is to retrace the steps of the dynamic programming algorithm backwards, re-discovering the path of choices (highlighted in red in the table above) from opt[0][0] to opt[M][N]. To determine the choice that led to opt[i][j], we consider the three possibilities: 1. The optimal alignment matches x[i] up with a gap. In this case, we must have opt[i][j] = opt[i + 1][j] + 2. 2. The optimal alignment matches y[j] up with a gap. In this case, we must have opt[i][j] = opt[i][j + 1] + 2. 3. The optimal alignment matches x[i] up with y[j]. In this case, we must have opt[i][j] = opt[i + 1][j + 1] if x[i] equals y[j], or opt[i][j] = opt[i + 1][j + 1] + 1 otherwise.

Write a program alignment.py that reads from standard input, the output produced by edit_distance.py, ie, input strings x and y, and the opt matrix. The program should then recover an optimal alignment using the procedure described above, and write to standard output the edit distance between x and y and the alignment itself, using the following format:

The rst line should contain the edit distance, preceded by the text Edit distance = .

Each subsequent line should contain a character from the rst string, followed by the paired character from the second string, followed by the associated penalty. Use the character - to indicate a gap in either string. $ python3 edit_distance.py < data/example10.txt | python3 alignment.py

Edit distance = 7

A T 1

A A 0

C - 2

A A 0

G G 0

T G 1

T T 0

A - 2

C C 0

C A 1 Data The data directory contains short test data les and actual genomic data les. Be sure to test your programs thoroughly using the short test les and the longer actual data les. Here are the optimal edit distances of several of the supplied les: ecoli2500.txt 118

ecoli5000.txt 160

fli8.txt 6 fli9.txt 4

fli10.txt 2

ftsa1272.txt 758

gene57.txt 8

stx1230.txt 521

stx19.txt 10

stx26.txt 17

stx27.txt 19

and here are the test cases with unique optimal alignments:

$ python3 edit_distance.py < data/endgaps7.txt | python3 alignment.py

Edit distance = 4

a - 2

t t 0

a a 0

t t 0

t t 0

a a 0

t t 0

- a 2

$ python3 edit_distance.py < data/fli10.txt | python3 alignment.py

Edit distance = 2

T T 0

G G 0

G G 0

C T 1

G G 0

G G 0

A T 1

A A 0

C C 0

T T 0

Step by Step Solution

There are 3 Steps involved in it

Step: 1

blur-text-image

Get Instant Access to Expert-Tailored Solutions

See step-by-step solutions with expert insights and AI powered tools for academic success

Step: 2

blur-text-image

Step: 3

blur-text-image

Ace Your Homework with AI

Get the answers you need in no time with our AI-driven, step-by-step assistance

Get Started

Recommended Textbook for

Professional IPhone And IPad Database Application Programming

Authors: Patrick Alessi

1st Edition

0470636173, 978-0470636176

More Books

Students also viewed these Databases questions

Question

Have I incorporated my research into my outline effectively?

Answered: 1 week ago