Question
The human gene GALK1 encodes the enzyme galactokinase 1. This enzyme enables the body to process a simple sugar called galactose, which is present in small amounts in many foods. Galactose is primarily part of a larger sugar called lactose, which is found in all dairy products and many baby formulas.
Suppose you already have the file GALK1.txt, which contains the DNA sequence of the gene GALK1, in the current working directory of your computer.
1) Define a function that transcribes the GALK1 gene into its corresponding mRNA, then print the resulting mRNA sequence. The function should be `transcribe(seq,pos)`, where `pos` is the list of (start,end) tuples.
The coding sequence (CDS), the sequence that codes for the corresponding protein product, of the gene GALK1 runs from position 57 to 1235 of the mRNA. You already have the RNA codon table.
2) Write a Python function that translates the mRNA of the gene GALK1 into its corresponding amino acid sequence, given the start and end of the coding sequence in the mRNA, then print the resulting amino acid sequence. The function should be `translate(seq,start,end)`.
~~~~~~End of questions~~~~~~~~~
The GALK1.txt:
>NG_008079.2:5008-12306 Homo sapiens galactokinase 1 (GALK1), RefSeqGene on chromosome 17 ATCCCGCGCCGACGGGGCTGTGCCGGAGCAGCTGTGCAGAGCTGCAGGCGCGCGTCATGGCTGCTTTGAG ACAGCCCCAGGTCGCGGAGCTGCTGGCCGAGGCCCGGCGAGCCTTCCGGGAGGAGTTCGGGGCCGAGCCC GAGCTGGCCGTGTCAGCGCCGGGCCGCGTCAACCTCATCGGGGAACACACGGACTACAACCAGGGCCTGG TGCTGCCTATGGTGAGGGGCTGCACGGGGAGCCCCTAGCCCGCCGCCGCCTGTCCCGGCCGCCGAGGAGG GCGGGCCTCGGGGACGCTGGGGGCGAGTTCTTCCCGCGGGAGATGTGGGGCGGGCAGCTGCGCCTGGAGC ACCGGTGCACGGAAGAGTCCCCGGGACAGGCTGTTCCCCACGTTGGAAGGGAGGAAGCGAGGAAGTGGCC GGGAGAGGGTGCGCGGCCGCCTCTTGGCTCAAGCCCGCCCTCTGGGGGCTGGGGCTCCTCGCCTTCAACC TGGGAGCATGTTCCCCTTAAACTGTGAGGCCCTGTGTGCCACGCAGAAGGGGACACTCCGCGCCTCCGGC CACCGTGGGGCCCCAACCGCAGACCTGGGCGAACGTAGCCTTCTGGCCCAGCCCGTTCAATTTACAGAGG AGGAAACTGAGGCCTAGAGAGGCCCAGTGAACTGCTGGAGGTCACACAGCAGGTTCTTGGCGGGGCTGCG ACTTGGGAGTGAGGACTCCCAGCTTTCAGCGGGGGGCGCTTTCCGCCCCATCTGCAGCTTGGGGAGTGCA CAGGTACAGGATGTCCAGAGCCACCCCAAAATGTAAAGGCTTTGGAGCTCCAGTGATCTGTTTTCCCTTT GGGCTAGCTCTCCCCCTTGCCCCACAGCTCAGGGCAGAGTCCAGGTCTGTGCTCCAGCTGCAGCCGCCCC GCCCCTGAAGACCTAAGGGGGCAGGGCTCAAGCCCCCAAGGTCAGCTGGCCCTCAGGATCTTCCCTGCGA CGCTGAACCTGGAGGTTCAGAACCTGATGACTGTGGAGGCATCAGAACCTCGGCTGGAGGCAGTGTCATT GGAGAGGCTTACTCCAGCTGGCGGAAGCCTCACGTACTGCTTGTCTCTCCTGCCAGGCTCTGGAGCTCAT GACGGTGCTGGTGGGCAGCCCCCGCAAGGATGGGCTGGTGTCTCTCCTCACCACCTCTGAGGGTGCCGAT GAGCCCCAGCGGCTGCAGTTTCCACTGCCCACAGCCCAGCGCTCGCTGGAGCCTGGGACTCCTCGGTGGG CCAACTATGTCAAGGGAGTGATTCAGTACTACCCAGGTATGGGGCCCAGGCCTGAGCCAAGTCCTCACTG ATACTAGGAGTGCCACCTCACAGCCACAGAGCCCATTCATTTGTCTGATACACTGTGGGGAAGGCTTGTA GAGTGGAGCATCCCATTGTACAGATGAGGAAACTGATGCCCCCAGAAGGTCGGGAACTTGCCCTGGGTTT CCCGTGACCTGATTGGAGGAGCCAGGATTTGAACCCCAGCCTTTTTTCCCTCCAGAGCCCTAAACCAGGA GGACAATTAGAAGTGTCCCAGCAACCTCAGAGGGTGGGAAAATGGAGGGCAGTGGGTCCCTTGGCCCAGC AGGTTGGTGGCTTCTGACAATTGAGACACACACCCTAGAAACAGCTGCTAGGCCGTTGCTGCCCTTCCCG CCAGGACACCTGCCCTTCCTGTGCCATCCTCCCAGGCAGCCCCTCTTACCATCACCTGTTCTTTCCCCTG CAGCTGCCCCCCTCCCTGGCTTCAGTGCAGTGGTGGTCAGCTCAGTGCCCCTGGGGGGTGGCCTGTCCAG 
CTCAGCATCCTTGGAAGTGGCCACGTACACCTTCCTCCAGCAGCTCTGTCCAGGTACCAGCTAGGCCCCA GCCCTGACCCAGCCCTCCTTCCCTGAGGTCTCCAGGTGGTCCCAGCTTCTACTATGCCTTATGGAGGGGG TGGCAGGGACTCTCCCTGGAGTGTCATTGAAGCCACTGCTGCTTCCACCAGCCCTAGCCTCCCCACCTCA CCCTGTACTGCAGACTCGGGCACAATAGCTGCCCGCGCCCAGGTGTGTCAGCAGGCCGAGCACAGCTTCG CAGGGATGCCCTGTGGCATCATGGACCAGTTCATCTCACTTATGGGACAGAAAGGCCACGCGCTGCTCAT TGACTGCAGGTTGGGCTCGCTCCCCTCGTCCCCTCCCGCCCTGCACTCAGCAGCTCCTGGGTGGGAGTGT GCCCACTGCCTGGCGCAGCAAGCACACGCTTGGCCTCGTCATCTCCCCCATTGTAACTCCACCCCAGGTC CTTGGAGACCAGCCTGGTGCCACTCTCGGACCCCAAGCTGGCCGTGCTCATCACCAACTCTAATGTCCGC CACTCCCTGGCCTCCAGCGAGTACCCTGTGCGGCGGCGCCAATGTGAAGAAGTGGCCCGGGCGCTGGGCA AGGAAAGCCTCCGGGAGGTACAACTGGAAGAGCTAGAGGGTGAGAACTGCCAGGGTGCTCTATCCTGGAG GCGGCTGTGCTCCCTGCTGGCGCCTCAGTGTGGCCTTGACCCTGCCTGGGACCCCGATCTCCAGGGCCTT CTGCCATGCTCTCCCCAGTCCCTTCAAACACTGCGCACCCAGGGTTCCAATCTCAGCAGGGCTGCTTGAA ATCCTAAAATGGTCTTATCTAATCAGAAAAATCATGTTTCCATTGTGGAAAATGTAGAAAAGTACAAAGT AGAAAATAATAAGCTATAAGGCCACTACCCAGAGATAGCCACTGCTGACATTTTCACGTTTCCTTTCAGT ATTTTTCCACATCTGTCTTCAAAGCTGAGTATATGTAATATATCATCACTTTCCCCCCCCACCCCCTTTT TTTTAAGAGGCAGGGTCTCATTCTGTTGCCCAAGCTGGAGTGTAGTGGTGTGATCATAGCTTACTGCAAA CTTGAACTCTTGAGCTCAAGGGATCCTCCCAGCTCAGCCTTCCAAGTAGCTGAGATTACAGGTGTGCCAC CATGCCCGGCTAATTTTTATCTTTGTAAAGACGGTCTTGCAGTGTTGCCCAGGCTGATCCTGAACTCTGG CCTCAAGTGGTCCTCCTGCCTTGGCCTCCCAAAGTGTTGGGATTATAGGCATGAGCCACTGCGCCCAGCC CATTTGCCGTGTTTTTTTTTTGGACACAGAGTTTCGGTCTTGTCACCCATGCTGGAGTGCAATGGTGCGA TCTCAGCTCACTGTAACCTCTGCCTCCCGGGTTCAAGTGATTCTCCTGCCTCAGCCTCCCGAGTAGCTGG GACTACAGGCGCCCGCCACTACGCCTGGCACATTTTTTATAGTTCTAGTAGAGACTGGGGTTTCACCATG TTGGCCAGGCTGGTCTCAAACGCCTGACCTCAGGTGATCCTCCCGCCTCAGCCTTCCAAAGTGCTGGGAT TACAGGCGTGAGCCATAGTGCCGGTCTCTTTTTTTTTTTTTTTTAAACTAAACATAATCTCAGAACCCAG AACCCTATCTTATCTTATGCCATGAAAGGCATATCTCGGTGTGGCTCTTTTTTTTTTTTTTTCTTTTTTT TTTGGTGAGGTGGAGGCTTGCCCTGTTGCCCAGGCTGGAGTGCAGTGGCGCAATCTCGGCTCACTGCATC CTCCACCTCCTGGGTTCAAATGATTCTCCTGCCTTAGCTTCCTGAGTAGCTGGGATTACTGGCACCCACC ACCACGCCCAGCCAATTTTTATATTTTTAGTAGAGACGGGGTTTCATGTTGGCCAGGCTGGTCTCGAACT 
CCTGATCTCGTGATCTGCCCGCCTCAGCCTCCCAATGTGCTAGGATTACATGTGTGAGCCACTGCACCTG GCCTCCGTGTGGCTCTTTAAAGCTCCACAATATTTTAGCATTCAGGTGCTCTGTCATTTACTTAACTATT TTCTGATACACCTCACACTGTGATTAACTTTTTTTATTTATCTTTTTTATTATTTATTTATTTATTTATT TGAGACAGAGTCTTGCTCTGTCACCCAGGCTGCAGTGCAGTGGCACGATCTCGGCTCACTGCAACCTCTG CCTCCCAGGTTCAAGTGATTCTCCTGCCTCAGCCTCCTGAGTAGCTAGGATTAGAGGCATGTGCCACCAC ACCTGGCTAATTTTTGTATTTTTAGTAGAGATGAGGTTTTACCATGTTGGTCGGGCTGGTCGTGAACTCC TGACCTGGTGATCTGCCCACCTCAGCCTCCCAAAGTACTGGGATGACAGGCATGAACCACTGTGCCTGGC CATCTTTTTTATTTTTTAAAGAGATGGGTTCTGCTAAGTTGCCCAGGCTGGACCTGAACTCTTGGGCTCA AGTAATCTTCTCACCTAGTCTCCTGGGTAGCTGCAACCAAAGGCACCCGGTTTATCTGCATTCTCTTTTT TTTCTTTGAGACTGAGTCTTGCTCTGTAGCCCAGGCTGGAGCGCAGTGGCGTGATCTCGGCTCACTGCAA CCTCCGTCTTCAGGGTTCAAGCAATTCTCCTGCCTCAGCCTCTGGAGTGGCTGGGACTACAGGCGTGTGC CACCAGAGCGAGTTAATTTTTTTTTTTTTTTGTATTTTTAGTGGACACTGGGTTTCACTATATTGGCCAG GCTGGTCTTGGACTCCTGACCTCAAGTGATCCGCCTGCCTTGGCCTCCCAAAGTGCTGGGATTACAGGCA CAGGCGTGAGCCACTACACCTGGCCTATCTGCATTCTCTTAATAGTTTCTTAGAAATGGATTCTTAGGAG TAGGATTACAGAGTCAAGAGACACAAGTATTGTAGGCTGGGTGCGGTGGCTCACGTCTGTGCCTGTAATC CCAGCACTTTAGGAGGCCAAGGTGGGCAGATTCATTGAGCTCAGGAATTCGAGACCAGCCTGGGCAACAT GGCAAAACCCCATCTCTAAAGAAATACAAAAATTAGCCAGGTGTGGTGGTGTGTGCCTGTAGTCCTAGCT ACTTAGGAGGCTGGGGTGGGAGGATCAATTGAGCCCAGGAGGTTGAGACTGCAGTGAGCTGTGATTGCAC CATGGCACTCCAGCCTGGGCCTCAAAGTGAGATCCTGTCTCCAAAACAAAAAAGATACAAGTATCCTTAA GGCTCCTGCTACACATGGCCAGGAAGGTAGTCTATTGGACAGTTTTAAGGTCATTATCAATATTAGCTCA TTTAATTCCCTCCAAAACTCTGTAAAGCACATTCTGCTACCATAGTTGTCATATTTTTGATGGGGGAATC TACAGTGAGAGGCAGTGCTGGGATCTGAACCCCATCTGGACAGATTAGCTCCAGGGCCCATGCTCTTGAC TGGCTGGCCGTGCTGCCCACACTGAGTTGTTCCTTCCTGGCAGTGTAGGTGTGCCTATCTCAGGGACACT AGACAGCTCCGAGGGACCTCCCTGTCCTTTTCCTTTGTGAACTGTGTCACGTTCTCCAGAGCAGTGCTCA GACCTGCCCTGCCTGCTCTGTGCAGATGCCCTTGGCCAAGGTTTTCACACTGGAACAAGTTGGTCCCTCC TCCCCACCCCAGCCTGTCCTTGCCCCTCCTCCAGGTCTCCTTCTGCATAGGAGCAGCTCACCCTGCCTCC TCCAGAGTCCTGCCCTAGAAGCGCAATCCCTCTCCTTCCATCCCCTGCCTGGCTGCCTGGCTCCTTCCCT CAGCCTCCAAGACATGCTCAGTTTTCTTCCCTCCTAAAACACCACCCACTGTCTCATTTCCATTCATTTC 
TTTCTTTCTTTCTTTCTTTTTTTTTGAGAGGGAGCCTCACTCTGTCACCCAGGCTGAAGTGCAGTGGCAT GATCTCCACTCACTGCAACCTCCGCCTCCCAGGTTCAAGCAATTCTCCTGCCTCAGCCTCCTGAGTAGCT GCGATTACAGGCGCCTGCCACGATGCCCGGCTAACTTTTGTATTTTTAGTAGAGACGGGGTTTCGCCATG TTGGCCAGGCTGGTCTCGAGCTCCTGACCTCAGGCAATCTGCCTGCCTCAGCTTCCCAAAGTGCTGGGAT TACAGGTGTGAGCCACCGCGCCCACCCATTCATTTCTCAGTCCTTTGAATCTACTTGCCCCTCCATCCCG CCATGCCACCTACCCTAACAACCTTCCCCCTTAAACCTGCGGGTTTGGCCGGGCGCAGTACACTGAGTCA GTACTGGTACTGACCCAGGTACCCCTCCAGCCTCAGCTCCAGTCAGATGGGACAGCCTGCTGGTCCCTGG CTGCTTCTGCCCCCTCTTCTGGAGCCCCAGCCCTGGAGGCTCCATGTGGCTCAGCAGAACTTCTTCTCCT CCTGCTCTGTGGTGGCCTCTTGAGGGCAGCACTCACCTTGGAAAGCATGGAGTGTTTCAACCCTCACTGC TCCCTGAAGGACCAAGGTGTCCCATTTTACAGTCGGGGGAGGAGGCACTGTGATAAAGGGGCTCTTCAGA CCCACGTCTGAGAGAGCCAGGCTGCCCTGCCCCCGCGGCCTTCCACCCTTCACCGTCCAGCCAGGGCCAC TGCCATCACCGCCTGCTGGTCCTCACAGGCGTCGGGGCCCCAGGCAGTGAGAAGGCGGCTGCTGACTCCT CTTTCCTCCCCAGCTGCCAGGGACCTGGTGAGCAAAGAGGGCTTCCGGCGGGCCCGGCACGTGGTGGGGG AGATTCGGCGCACGGCCCAGGCAGCGGCCGCCCTGAGACGTGGCGACTACAGAGCCTTTGGCCGCCTCAT GGTGGAGAGCCACCGCTCACTCAGGTGAGGCCCTCTGGGCGCCCCGCTCCTGCCGGGCACAGGCCGGCCC AGGCCCACCCCTTCAATATCCTCTCTGCAGAGACGACTATGAGGTGAGCTGCCCAGAGCTGGACCAGCTG GTGGAGGCTGCGCTTGCTGTGCCTGGGGTTTATGGCAGCCGCATGACGGGCGGTGGCTTCGGTGGCTGCA CGGTGACACTGCTGGAGGCCTCCGCTGCTCCCCACGCCATGCGGCACATCCAGGTGGGCGGGCACCAGGG CCTGGGCGGGCAGGAGCGGCAGCTTCCCGGGGCCCTGCCACTCACCCCCAGCCCGCCTCTTACAGGAGCA CTACGGCGGGACTGCCACCTTCTACCTCTCTCAAGCAGCCGATGGAGCCAAGGTGCTGTGCTTGTGAGGC ACCCCCAGGACAGCACACGGTGAGGGTGCGGGGCCTGCAGGCCAGTCCCACGGCTCTGTGCCCGGTGCCA TCTTCCATATCCGGGTGCTCAATAAACTTGTGCCTCCAATGTGGTACCTGCCTCCTCTAGAGGTGGGTGT ATGCTTGGGTGTCAGAGAA
The GALK1-exon.txt:
1 221
1107 1296
1754 1873
2044 2179
2308 2489
6594 6744
6821 6983
7066 7299
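One possible sketch for question 1 follows. It assumes that the (start,end) tuples are the exon coordinates listed in GALK1-exon.txt, that they are 1-based and inclusive, that GALK1.txt is an ordinary FASTA file with the `>` header on its own line, and that the file holds the coding (sense) strand, so transcription reduces to joining the exons and replacing T with U. The helper names `read_fasta_sequence`, `parse_exon_positions`, and `print_galk1_mrna` are hypothetical and not part of the question.

```python
def read_fasta_sequence(path):
    """Read a single-record FASTA file and return its sequence as one string."""
    chunks = []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            # Skip blank lines and the '>' header line.
            if line and not line.startswith(">"):
                chunks.append(line.replace(" ", ""))
    return "".join(chunks).upper()


def parse_exon_positions(text):
    """Turn whitespace-separated numbers into a list of (start, end) tuples."""
    numbers = [int(token) for token in text.split()]
    return list(zip(numbers[0::2], numbers[1::2]))


def transcribe(seq, pos):
    """Join the exons in pos and replace T with U.

    Assumption: each (start, end) pair is 1-based and inclusive, and seq is
    the sense strand, so no reverse complement is taken.
    """
    exons = [seq[start - 1:end] for start, end in pos]
    return "".join(exons).upper().replace("T", "U")


def print_galk1_mrna(gene_path="GALK1.txt", exon_path="GALK1-exon.txt"):
    """Hypothetical driver: load both files, transcribe, print, return the mRNA."""
    seq = read_fasta_sequence(gene_path)
    with open(exon_path) as handle:
        pos = parse_exon_positions(handle.read())
    mrna = transcribe(seq, pos)
    print(mrna)
    return mrna
```

Calling `print_galk1_mrna()` in the directory that holds both files would print the spliced mRNA. As a sanity check on the coordinates, the eight listed exons add up to 1397 nucleotides, which is long enough to contain a CDS ending at position 1235.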
The RNA codon table:
RNA_codon_table = {
    #                        Second Base
    #     U            C            A            G
    # U
    'UUU': 'F', 'UCU': 'S', 'UAU': 'Y', 'UGU': 'C',  # UxU
    'UUC': 'F', 'UCC': 'S', 'UAC': 'Y', 'UGC': 'C',  # UxC
    'UUA': 'L', 'UCA': 'S', 'UAA': 'Stop', 'UGA': 'Stop',  # UxA
    'UUG': 'L', 'UCG': 'S', 'UAG': 'Stop', 'UGG': 'W',  # UxG
    # C
    'CUU': 'L', 'CCU': 'P', 'CAU': 'H', 'CGU': 'R',  # CxU
    'CUC': 'L', 'CCC': 'P', 'CAC': 'H', 'CGC': 'R',  # CxC
    'CUA': 'L', 'CCA': 'P', 'CAA': 'Q', 'CGA': 'R',  # CxA
    'CUG': 'L', 'CCG': 'P', 'CAG': 'Q', 'CGG': 'R',  # CxG
    # A
    'AUU': 'I', 'ACU': 'T', 'AAU': 'N', 'AGU': 'S',  # AxU
    'AUC': 'I', 'ACC': 'T', 'AAC': 'N', 'AGC': 'S',  # AxC
    'AUA': 'I', 'ACA': 'T', 'AAA': 'K', 'AGA': 'R',  # AxA
    'AUG': 'M', 'ACG': 'T', 'AAG': 'K', 'AGG': 'R',  # AxG
    # G
    'GUU': 'V', 'GCU': 'A', 'GAU': 'D', 'GGU': 'G',  # GxU
    'GUC': 'V', 'GCC': 'A', 'GAC': 'D', 'GGC': 'G',  # GxC
    'GUA': 'V', 'GCA': 'A', 'GAA': 'E', 'GGA': 'G',  # GxA
    'GUG': 'V', 'GCG': 'A', 'GAG': 'E', 'GGG': 'G'   # GxG
}
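A minimal sketch for question 2 follows. It assumes `start` and `end` are 1-based, inclusive positions in the mRNA (57 and 1235 in the question) and that translation should stop at the first stop codon without adding it to the output. So that the block runs on its own, it rebuilds a codon table with the same entries as the one above from a compact string; with `RNA_codon_table` already defined, those lines can be dropped. The names `_BASES` and `_AMINO` are hypothetical helpers.

```python
from itertools import product

# Rebuild the codon table compactly. Codons are enumerated with the bases in
# U, C, A, G order (third base varying fastest); '*' marks a stop codon and is
# stored as 'Stop' to match the table given with the question.
_BASES = "UCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
RNA_codon_table = {
    "".join(codon): ("Stop" if amino == "*" else amino)
    for codon, amino in zip(product(_BASES, repeat=3), _AMINO)
}


def translate(seq, start, end):
    """Translate the CDS seq[start..end] (1-based, inclusive) into amino acids.

    Assumption: translation ends at the first stop codon, which is not
    included in the returned string. A trailing partial codon is ignored.
    """
    cds = seq[start - 1:end].upper()
    protein = []
    for i in range(0, len(cds) - len(cds) % 3, 3):
        amino = RNA_codon_table[cds[i:i + 3]]
        if amino == "Stop":
            break
        protein.append(amino)
    return "".join(protein)


# Toy example: the CDS occupies positions 3 to 11 of this short RNA.
print(translate("GGAUGGCUUAAGG", 3, 11))  # prints "MA"
```

For the GALK1 mRNA obtained in question 1 (here called `mrna`, a hypothetical variable name), the call would be `print(translate(mrna, 57, 1235))`. The interval from 57 to 1235 spans 1179 nucleotides, that is, 393 codons including the final stop codon.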