Question
I need to update my script so that it shows each amino acid's count as a percentage of the total number of amino acids, and then displays only the top 5. The script is already set up to process a FASTA file; it's just the display of the results that I want to change. Please do not give me a totally different script. Others have tried this and given me bad scripts that don't even work.
My Python script:
#!/usr/bin/env python3
def FASTA(filename):
    try:
        f = open(filename)
    except IOError:
        print("The file, %s, does not exist" % filename)
        return
    order = []
    sequences = {}
    counts = {}
    for line in f:
        if line.startswith('>'):
            # Header line: drop the '>' and the trailing newline
            name = line[1:].rstrip()
            #name = name.replace('_', ' ')
            order.append(name)
            sequences[name] = ''
        else:
            # Sequence line: drop the trailing newline and any '*' stop marker
            sequences[name] += line.rstrip().rstrip('*')
    f.close()
    # Count each residue once, across all sequences
    for name in order:
        for aa in sequences[name]:
            if aa in counts:
                counts[aa] = counts[aa] + 1
            else:
                counts[aa] = 1
    print("%d sequences found" % len(order))
    print(counts)
    return (order, sequences)

x, y = FASTA("/home/jorvis1/e_coli_k12_dh10b.faa")
I need the output to look like this instead of the raw counts it prints now. Each line should show the amino acid, its count, and its percentage of all amino acids in the file, and only the five amino acids with the highest percentages should be shown:
L: 139002 (10.7%)
A: 123885 (9.6%)
G: 95475 (7.4%)
V: 91683 (7.1%)
I: 77836 (6.0%)
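One way to get that display without restructuring the script is to format the existing `counts` dictionary just before it is printed. The following is a minimal sketch, not a definitive answer: `format_top_residues` and its `top_n` parameter are hypothetical names introduced here for illustration, and it assumes `counts` maps each one-letter amino acid code to an integer count, as in the script above.

```python
def format_top_residues(counts, top_n=5):
    """Return display lines such as 'L: 139002 (10.7%)' for the most common residues.

    counts -- dict mapping a one-letter amino acid code to its count
    top_n  -- how many of the most frequent residues to include (hypothetical parameter)
    """
    total = sum(counts.values())
    if total == 0:
        # Nothing was counted, so there is nothing to rank or divide by
        return []
    # Rank residues by count, highest first
    ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    # Each percentage is relative to all residues, not just the top_n shown
    return ["%s: %d (%.1f%%)" % (aa, n, 100.0 * n / total)
            for aa, n in ranked[:top_n]]
```

Under these assumptions, the only change inside `FASTA` would be to replace `print(counts)` with a loop such as `for row in format_top_residues(counts): print(row)`. The rest of the script stays as it is.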