Question
This Python program is supposed to look for open reading frames (ORFs): strings that start with ATG and end with TAA, TGA, or TAG.
It is also supposed to count the length of each one (how many characters are in the string). However, it doesn't work. Why?
import re
import string

with open('dna.txt', 'rb') as f:
    data=f.read (GAGTTTTATCGCTTCCATGACGCAGAAGTTAACACTTTCGGAATGATGAAAAA)

data = [x.split(' ', 1) for x in data.split('>')]
data = [(x[0], ''.join(x[1].split())) for x in data if len(x) == 2]

start, end = [re.compile(x) for x in 'ATG TAG|TGA|TAA'.split()]

revtrans = string.maketrans("ATGC","TACG")

def get_longest(starts, ends):
    '''
    Simple brute-force for now. Optimize later...
    Given a list of start locations and a list of end locations,
    return the longest valid string. Returns tuple (length, start position)

    Assume starts and ends are sorted correctly from beginning to end of string.
    '''
    results = {}
    # Use smallest end that is bigger than each start
    ends.reverse()
    for start in starts:
        for end in ends:
            if end > start and (end - start) % 3 == 0:
                results[start] = end + 3
    results = [(end - start, start) for start, end in results.iteritems()]
    return max(results) if results else (0, 0)

def get_orfs(dna):
    '''
    Returns length, header, forward/reverse indication,
    and longest match (corrected if reversed)
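The ORF definition in the question (start at ATG, end at the first TAA, TGA, or TAG in the same reading frame, then report the length) can be sketched as below. This is a minimal illustration written for Python 3, not a repaired version of the posted program: the posted code uses `string.maketrans` and `dict.iteritems`, which exist only in Python 2, and it passes an unquoted sequence to `f.read`. The function name `find_orfs` and the constant `STOP_CODONS` are hypothetical, and the sketch assumes the forward strand only and takes the sequence as a plain string rather than reading `dna.txt`.

```python
import re

# Stop codons as listed in the question.
STOP_CODONS = {"TAA", "TGA", "TAG"}


def find_orfs(dna):
    """Return a list of (start, length, sequence) tuples.

    Each ORF runs from an ATG to the first in-frame stop codon,
    inclusive. Forward strand only (an assumption of this sketch).
    """
    dna = dna.upper()
    orfs = []
    for match in re.finditer("ATG", dna):
        start = match.start()
        # Step three bases at a time so we stay in the ATG's reading frame.
        for pos in range(start + 3, len(dna) - 2, 3):
            if dna[pos:pos + 3] in STOP_CODONS:
                seq = dna[start:pos + 3]
                orfs.append((start, len(seq), seq))
                break  # stop at the first in-frame stop codon
    return orfs


# The sequence quoted in the question.
sequence = "GAGTTTTATCGCTTCCATGACGCAGAAGTTAACACTTTCGGAATGATGAAAAA"
for start, length, seq in find_orfs(sequence):
    print(start, length, seq)
```

For the sequence in the question, this reports a single ORF of length 30 starting at index 16 (`ATGACGCAGAAGTTAACACTTTCGGAATGA`); the two later ATG codons have no in-frame stop codon before the sequence ends. Walking codon by codon avoids the frame check `(end - start) % 3 == 0` over every start/end pair, at the cost of not handling the reverse strand.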