Question
I have followed the answer for the previous question but am getting the below error.
Code executed:
    # Load necessary modules
    from Bio import SeqIO
    import gzip

    # Read in human genome file
    genome_file = 'hg38.fa.gz'
    with gzip.open(genome_file, 'rt') as f:
        genome = list(SeqIO.parse(f, 'fasta'))

    # Read in RefSeq table
    refseq_file = '/users/xxxx/data2'
    with open(refseq_file, 'r') as f:
        refseq = list(SeqIO.parse(f, 'tab'))

    # Create dictionary of gene sequences
    gene_dict = {}
    for record in genome:
        gene_name = record.id.split()[0]
        gene_dict[gene_name] = record.seq

    # Create dictionary of protein sequences
    protein_dict = {}
    for record in refseq:
        if record.features:
            for feature in record.features:
                if feature.type == 'CDS':
                    gene_name = feature.qualifiers['gene'][0]
                    gene_seq = gene_dict.get(gene_name, None)
                    if gene_seq is not None:
                        protein_seq = gene_seq[feature.location.start.position:feature.location.end.position].translate()
                        protein_name = f">{record.id}:{record.name}:{gene_name}:{feature.qualifiers['protein_id'][0]}"
                        protein_dict[protein_name] = protein_seq

    # Write output file
    output_file = 'protein_sequence.fa'
    with open(output_file, 'w') as f:
        for protein_name, protein_seq in protein_dict.items():
            f.write(f"{protein_name}\n{protein_seq}\n")
Error I am getting:
    ---------------------------------------------------------------------------
    ValueError                                Traceback (most recent call last)
    Input In [2], in()
         11 refseq_file = '/users/vijay/data2'
         12 with open(refseq_file, 'r') as f:
    ---> 13     refseq = list(SeqIO.parse(f, 'tab'))
         15 # Create dictionary of gene sequences
         16 gene_dict = {}

    File /opt/anaconda3/lib/python3.9/site-packages/Bio/SeqIO/Interfaces.py:72, in SequenceIterator.__next__(self)
         70 """Return the next entry."""
         71 try:
    ---> 72     return next(self.records)
         73 except Exception:
         74     if self.should_close_stream:

    File /opt/anaconda3/lib/python3.9/site-packages/Bio/SeqIO/TabIO.py:93, in TabIterator.iterate(self, handle)
         90 if line.strip() == "":
         91     # It's a blank line, ignore it
         92     continue
    ---> 93 raise ValueError(
         94     "Each line should have one tab separating the"
         95     + " title and sequence, this line has %i tabs: %r"
         96     % (line.count("\t"), line)
         97 ) from None
         98 title = title.strip()
         99 seq = seq.strip()  # removes the trailing new line

    ValueError: Each line should have one tab separating the title and sequence, this line has 11 tabs: 'chr1\t67092164\t67109072\tXM_011541469.2\t0\t-\t67093004\t67103382\t0\t5\t1440,187,70,145,44,\t0,3070,4087,11073,16864,\n'
Please assist.
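For reference, the error message itself points at the cause: the `'tab'` format in `SeqIO` expects exactly two columns per line (a title and a sequence), while the line quoted in the error has twelve tab-separated columns of coordinates and no sequence at all. The layout looks like the UCSC BED12 convention, though that is an assumption based only on the one line shown. A minimal sketch of reading such a line with the standard library follows; the function name `parse_bed12` and the field names in `BED12_FIELDS` are illustrative choices, not part of Biopython.

```python
import csv
import io

# Field names follow the UCSC BED12 convention. That the RefSeq table
# is BED12 is an assumption based on the single line in the traceback.
BED12_FIELDS = [
    "chrom", "chrom_start", "chrom_end", "name", "score", "strand",
    "thick_start", "thick_end", "item_rgb", "block_count",
    "block_sizes", "block_starts",
]
INT_FIELDS = ("chrom_start", "chrom_end", "thick_start", "thick_end", "block_count")


def parse_bed12(handle):
    """Yield one dict per 12-column, tab-separated line (hypothetical helper)."""
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        # Skip blank, comment, or short lines instead of raising.
        if len(row) < len(BED12_FIELDS) or row[0].startswith("#"):
            continue
        record = dict(zip(BED12_FIELDS, row))
        for key in INT_FIELDS:
            record[key] = int(record[key])
        # Block columns are comma-separated lists with a trailing comma.
        for key in ("block_sizes", "block_starts"):
            record[key] = [int(x) for x in record[key].split(",") if x]
        yield record


# The line reported in the ValueError, used here as sample input.
sample = (
    "chr1\t67092164\t67109072\tXM_011541469.2\t0\t-\t67093004\t67103382"
    "\t0\t5\t1440,187,70,145,44,\t0,3070,4087,11073,16864,\n"
)
records = list(parse_bed12(io.StringIO(sample)))
print(records[0]["name"], records[0]["strand"], records[0]["block_sizes"])
```

With a real file, the same function could be called as `parse_bed12(open(refseq_file, newline=""))`. Note that records parsed this way carry only coordinates, so the later loop over `record.features` would not apply to them; the sequence would have to be sliced from the matching chromosome record (here `chr1`) using these coordinates.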