Question
Exercise 3

The gene GALK1 in humans encodes the enzyme galactokinase 1. This enzyme enables the body to process a simple sugar called galactose, which is present in small amounts in many foods. Galactose is primarily part of a larger sugar called lactose, which is found in all dairy products and many baby formulas.

Suppose you already have the file GALK1.txt, which contains the DNA sequence of the gene GALK1, in the current working directory of your computer.

1) First, open the file and create an object that contains just the DNA sequence of the GALK1 gene (without spaces or line breaks in the middle of the sequence).
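One possible way to approach task 1 is sketched below. It assumes GALK1.txt is a single-record FASTA file whose `>` header sits on its own first line; the function names `parse_fasta_text` and `read_fasta_sequence` are made up for this illustration and are not part of the exercise.

```python
from pathlib import Path


def parse_fasta_text(text):
    """Return the sequence of a single-record FASTA text as one uppercase string."""
    sequence_parts = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the '>' header line (assumed to be on its own line).
        if not line or line.startswith(">"):
            continue
        # Remove any spaces inside a sequence line.
        sequence_parts.append("".join(line.split()))
    return "".join(sequence_parts).upper()


def read_fasta_sequence(path="GALK1.txt"):
    """Read a FASTA file from disk and return only its sequence."""
    return parse_fasta_text(Path(path).read_text())


# Small made-up record, used only to show the behaviour.
example = ">demo header\nATCC CGCG\nccga\n"
print(parse_fasta_text(example))  # ATCCCGCGCCGA
```

With the real file in the working directory, `galk1_dna = read_fasta_sequence("GALK1.txt")` would give the sequence as a single string. Keeping the parsing separate from the file access makes the logic easy to check on small strings.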
We have another file (`GALK1-exon.txt`) that contains the positions of all exons of the gene GALK1 in text format. Line breaks (`\n`) separate the positions of different exons, while tabs (`\t`) separate the start position from the end position of an exon.

2) From this file, create a list of `(start, end)` tuples representing the positions of all exons of the gene GALK1.
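A minimal sketch for task 2 follows, assuming every non-empty line of the file holds exactly two integers separated by a tab. The helper names `parse_exon_text` and `read_exons` are hypothetical.

```python
def parse_exon_text(text):
    """Turn 'start<TAB>end' lines into a list of (start, end) integer tuples."""
    exons = []
    for line in text.splitlines():
        fields = line.split()  # splits on tabs (and any other whitespace)
        if not fields:
            continue  # ignore blank lines, e.g. a trailing newline
        start, end = int(fields[0]), int(fields[1])
        exons.append((start, end))
    return exons


def read_exons(path="GALK1-exon.txt"):
    """Read the exon position file from disk."""
    with open(path) as handle:
        return parse_exon_text(handle.read())


# First three exon lines from the exercise, written out with explicit tabs.
example = "1\t221\n1107\t1296\n1754\t1873\n"
print(parse_exon_text(example))  # [(1, 221), (1107, 1296), (1754, 1873)]
```

Converting the fields to `int` at this stage means the tuples can be used directly for slicing later.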
The exons of a gene contain the sequence that makes up the mature mRNA; the other parts of the gene (the introns) are not represented in it. The DNA sequence given here is the coding strand for the mRNA.
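The relationship between exons, the coding strand, and the mRNA can be sketched as below. Two assumptions are made here that the exercise does not state explicitly: exon positions are 1-based and inclusive (the first exon starts at 1), and the mRNA is obtained by joining the exons in order and replacing T with U. The function name `spliced_mrna` and the toy data are invented for illustration.

```python
def spliced_mrna(dna, exons):
    """Join the exon segments of a coding-strand DNA string and return mRNA."""
    # Assumption: positions are 1-based and inclusive, so exon (start, end)
    # corresponds to the Python slice dna[start - 1:end].
    exon_dna = "".join(dna[start - 1:end] for start, end in exons)
    # The coding strand matches the mRNA except that T is replaced by U.
    return exon_dna.upper().replace("T", "U")


# Toy gene: two exons (ATG and CCCTAA) separated by a six-base intron.
toy_dna = "ATGGTAAGTCCCTAA"
toy_exons = [(1, 3), (10, 15)]
print(spliced_mrna(toy_dna, toy_exons))  # AUGCCCUAA
```

If the positions turned out to be 0-based or end-exclusive, only the slice expression would need to change.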
~~~~~~~~~~End of questions~~~~~~~~~~~~~~~~
The GALK1.txt:
>NG_008079.2:5008-12306 Homo sapiens galactokinase 1 (GALK1), RefSeqGene on chromosome 17 ATCCCGCGCCGACGGGGCTGTGCCGGAGCAGCTGTGCAGAGCTGCAGGCGCGCGTCATGGCTGCTTTGAG ACAGCCCCAGGTCGCGGAGCTGCTGGCCGAGGCCCGGCGAGCCTTCCGGGAGGAGTTCGGGGCCGAGCCC GAGCTGGCCGTGTCAGCGCCGGGCCGCGTCAACCTCATCGGGGAACACACGGACTACAACCAGGGCCTGG TGCTGCCTATGGTGAGGGGCTGCACGGGGAGCCCCTAGCCCGCCGCCGCCTGTCCCGGCCGCCGAGGAGG GCGGGCCTCGGGGACGCTGGGGGCGAGTTCTTCCCGCGGGAGATGTGGGGCGGGCAGCTGCGCCTGGAGC ACCGGTGCACGGAAGAGTCCCCGGGACAGGCTGTTCCCCACGTTGGAAGGGAGGAAGCGAGGAAGTGGCC GGGAGAGGGTGCGCGGCCGCCTCTTGGCTCAAGCCCGCCCTCTGGGGGCTGGGGCTCCTCGCCTTCAACC TGGGAGCATGTTCCCCTTAAACTGTGAGGCCCTGTGTGCCACGCAGAAGGGGACACTCCGCGCCTCCGGC CACCGTGGGGCCCCAACCGCAGACCTGGGCGAACGTAGCCTTCTGGCCCAGCCCGTTCAATTTACAGAGG AGGAAACTGAGGCCTAGAGAGGCCCAGTGAACTGCTGGAGGTCACACAGCAGGTTCTTGGCGGGGCTGCG ACTTGGGAGTGAGGACTCCCAGCTTTCAGCGGGGGGCGCTTTCCGCCCCATCTGCAGCTTGGGGAGTGCA CAGGTACAGGATGTCCAGAGCCACCCCAAAATGTAAAGGCTTTGGAGCTCCAGTGATCTGTTTTCCCTTT GGGCTAGCTCTCCCCCTTGCCCCACAGCTCAGGGCAGAGTCCAGGTCTGTGCTCCAGCTGCAGCCGCCCC GCCCCTGAAGACCTAAGGGGGCAGGGCTCAAGCCCCCAAGGTCAGCTGGCCCTCAGGATCTTCCCTGCGA CGCTGAACCTGGAGGTTCAGAACCTGATGACTGTGGAGGCATCAGAACCTCGGCTGGAGGCAGTGTCATT GGAGAGGCTTACTCCAGCTGGCGGAAGCCTCACGTACTGCTTGTCTCTCCTGCCAGGCTCTGGAGCTCAT GACGGTGCTGGTGGGCAGCCCCCGCAAGGATGGGCTGGTGTCTCTCCTCACCACCTCTGAGGGTGCCGAT GAGCCCCAGCGGCTGCAGTTTCCACTGCCCACAGCCCAGCGCTCGCTGGAGCCTGGGACTCCTCGGTGGG CCAACTATGTCAAGGGAGTGATTCAGTACTACCCAGGTATGGGGCCCAGGCCTGAGCCAAGTCCTCACTG ATACTAGGAGTGCCACCTCACAGCCACAGAGCCCATTCATTTGTCTGATACACTGTGGGGAAGGCTTGTA GAGTGGAGCATCCCATTGTACAGATGAGGAAACTGATGCCCCCAGAAGGTCGGGAACTTGCCCTGGGTTT CCCGTGACCTGATTGGAGGAGCCAGGATTTGAACCCCAGCCTTTTTTCCCTCCAGAGCCCTAAACCAGGA GGACAATTAGAAGTGTCCCAGCAACCTCAGAGGGTGGGAAAATGGAGGGCAGTGGGTCCCTTGGCCCAGC AGGTTGGTGGCTTCTGACAATTGAGACACACACCCTAGAAACAGCTGCTAGGCCGTTGCTGCCCTTCCCG CCAGGACACCTGCCCTTCCTGTGCCATCCTCCCAGGCAGCCCCTCTTACCATCACCTGTTCTTTCCCCTG CAGCTGCCCCCCTCCCTGGCTTCAGTGCAGTGGTGGTCAGCTCAGTGCCCCTGGGGGGTGGCCTGTCCAG 
CTCAGCATCCTTGGAAGTGGCCACGTACACCTTCCTCCAGCAGCTCTGTCCAGGTACCAGCTAGGCCCCA GCCCTGACCCAGCCCTCCTTCCCTGAGGTCTCCAGGTGGTCCCAGCTTCTACTATGCCTTATGGAGGGGG TGGCAGGGACTCTCCCTGGAGTGTCATTGAAGCCACTGCTGCTTCCACCAGCCCTAGCCTCCCCACCTCA CCCTGTACTGCAGACTCGGGCACAATAGCTGCCCGCGCCCAGGTGTGTCAGCAGGCCGAGCACAGCTTCG CAGGGATGCCCTGTGGCATCATGGACCAGTTCATCTCACTTATGGGACAGAAAGGCCACGCGCTGCTCAT TGACTGCAGGTTGGGCTCGCTCCCCTCGTCCCCTCCCGCCCTGCACTCAGCAGCTCCTGGGTGGGAGTGT GCCCACTGCCTGGCGCAGCAAGCACACGCTTGGCCTCGTCATCTCCCCCATTGTAACTCCACCCCAGGTC CTTGGAGACCAGCCTGGTGCCACTCTCGGACCCCAAGCTGGCCGTGCTCATCACCAACTCTAATGTCCGC CACTCCCTGGCCTCCAGCGAGTACCCTGTGCGGCGGCGCCAATGTGAAGAAGTGGCCCGGGCGCTGGGCA AGGAAAGCCTCCGGGAGGTACAACTGGAAGAGCTAGAGGGTGAGAACTGCCAGGGTGCTCTATCCTGGAG GCGGCTGTGCTCCCTGCTGGCGCCTCAGTGTGGCCTTGACCCTGCCTGGGACCCCGATCTCCAGGGCCTT CTGCCATGCTCTCCCCAGTCCCTTCAAACACTGCGCACCCAGGGTTCCAATCTCAGCAGGGCTGCTTGAA ATCCTAAAATGGTCTTATCTAATCAGAAAAATCATGTTTCCATTGTGGAAAATGTAGAAAAGTACAAAGT AGAAAATAATAAGCTATAAGGCCACTACCCAGAGATAGCCACTGCTGACATTTTCACGTTTCCTTTCAGT ATTTTTCCACATCTGTCTTCAAAGCTGAGTATATGTAATATATCATCACTTTCCCCCCCCACCCCCTTTT TTTTAAGAGGCAGGGTCTCATTCTGTTGCCCAAGCTGGAGTGTAGTGGTGTGATCATAGCTTACTGCAAA CTTGAACTCTTGAGCTCAAGGGATCCTCCCAGCTCAGCCTTCCAAGTAGCTGAGATTACAGGTGTGCCAC CATGCCCGGCTAATTTTTATCTTTGTAAAGACGGTCTTGCAGTGTTGCCCAGGCTGATCCTGAACTCTGG CCTCAAGTGGTCCTCCTGCCTTGGCCTCCCAAAGTGTTGGGATTATAGGCATGAGCCACTGCGCCCAGCC CATTTGCCGTGTTTTTTTTTTGGACACAGAGTTTCGGTCTTGTCACCCATGCTGGAGTGCAATGGTGCGA TCTCAGCTCACTGTAACCTCTGCCTCCCGGGTTCAAGTGATTCTCCTGCCTCAGCCTCCCGAGTAGCTGG GACTACAGGCGCCCGCCACTACGCCTGGCACATTTTTTATAGTTCTAGTAGAGACTGGGGTTTCACCATG TTGGCCAGGCTGGTCTCAAACGCCTGACCTCAGGTGATCCTCCCGCCTCAGCCTTCCAAAGTGCTGGGAT TACAGGCGTGAGCCATAGTGCCGGTCTCTTTTTTTTTTTTTTTTAAACTAAACATAATCTCAGAACCCAG AACCCTATCTTATCTTATGCCATGAAAGGCATATCTCGGTGTGGCTCTTTTTTTTTTTTTTTCTTTTTTT TTTGGTGAGGTGGAGGCTTGCCCTGTTGCCCAGGCTGGAGTGCAGTGGCGCAATCTCGGCTCACTGCATC CTCCACCTCCTGGGTTCAAATGATTCTCCTGCCTTAGCTTCCTGAGTAGCTGGGATTACTGGCACCCACC ACCACGCCCAGCCAATTTTTATATTTTTAGTAGAGACGGGGTTTCATGTTGGCCAGGCTGGTCTCGAACT 
CCTGATCTCGTGATCTGCCCGCCTCAGCCTCCCAATGTGCTAGGATTACATGTGTGAGCCACTGCACCTG GCCTCCGTGTGGCTCTTTAAAGCTCCACAATATTTTAGCATTCAGGTGCTCTGTCATTTACTTAACTATT TTCTGATACACCTCACACTGTGATTAACTTTTTTTATTTATCTTTTTTATTATTTATTTATTTATTTATT TGAGACAGAGTCTTGCTCTGTCACCCAGGCTGCAGTGCAGTGGCACGATCTCGGCTCACTGCAACCTCTG CCTCCCAGGTTCAAGTGATTCTCCTGCCTCAGCCTCCTGAGTAGCTAGGATTAGAGGCATGTGCCACCAC ACCTGGCTAATTTTTGTATTTTTAGTAGAGATGAGGTTTTACCATGTTGGTCGGGCTGGTCGTGAACTCC TGACCTGGTGATCTGCCCACCTCAGCCTCCCAAAGTACTGGGATGACAGGCATGAACCACTGTGCCTGGC CATCTTTTTTATTTTTTAAAGAGATGGGTTCTGCTAAGTTGCCCAGGCTGGACCTGAACTCTTGGGCTCA AGTAATCTTCTCACCTAGTCTCCTGGGTAGCTGCAACCAAAGGCACCCGGTTTATCTGCATTCTCTTTTT TTTCTTTGAGACTGAGTCTTGCTCTGTAGCCCAGGCTGGAGCGCAGTGGCGTGATCTCGGCTCACTGCAA CCTCCGTCTTCAGGGTTCAAGCAATTCTCCTGCCTCAGCCTCTGGAGTGGCTGGGACTACAGGCGTGTGC CACCAGAGCGAGTTAATTTTTTTTTTTTTTTGTATTTTTAGTGGACACTGGGTTTCACTATATTGGCCAG GCTGGTCTTGGACTCCTGACCTCAAGTGATCCGCCTGCCTTGGCCTCCCAAAGTGCTGGGATTACAGGCA CAGGCGTGAGCCACTACACCTGGCCTATCTGCATTCTCTTAATAGTTTCTTAGAAATGGATTCTTAGGAG TAGGATTACAGAGTCAAGAGACACAAGTATTGTAGGCTGGGTGCGGTGGCTCACGTCTGTGCCTGTAATC CCAGCACTTTAGGAGGCCAAGGTGGGCAGATTCATTGAGCTCAGGAATTCGAGACCAGCCTGGGCAACAT GGCAAAACCCCATCTCTAAAGAAATACAAAAATTAGCCAGGTGTGGTGGTGTGTGCCTGTAGTCCTAGCT ACTTAGGAGGCTGGGGTGGGAGGATCAATTGAGCCCAGGAGGTTGAGACTGCAGTGAGCTGTGATTGCAC CATGGCACTCCAGCCTGGGCCTCAAAGTGAGATCCTGTCTCCAAAACAAAAAAGATACAAGTATCCTTAA GGCTCCTGCTACACATGGCCAGGAAGGTAGTCTATTGGACAGTTTTAAGGTCATTATCAATATTAGCTCA TTTAATTCCCTCCAAAACTCTGTAAAGCACATTCTGCTACCATAGTTGTCATATTTTTGATGGGGGAATC TACAGTGAGAGGCAGTGCTGGGATCTGAACCCCATCTGGACAGATTAGCTCCAGGGCCCATGCTCTTGAC TGGCTGGCCGTGCTGCCCACACTGAGTTGTTCCTTCCTGGCAGTGTAGGTGTGCCTATCTCAGGGACACT AGACAGCTCCGAGGGACCTCCCTGTCCTTTTCCTTTGTGAACTGTGTCACGTTCTCCAGAGCAGTGCTCA GACCTGCCCTGCCTGCTCTGTGCAGATGCCCTTGGCCAAGGTTTTCACACTGGAACAAGTTGGTCCCTCC TCCCCACCCCAGCCTGTCCTTGCCCCTCCTCCAGGTCTCCTTCTGCATAGGAGCAGCTCACCCTGCCTCC TCCAGAGTCCTGCCCTAGAAGCGCAATCCCTCTCCTTCCATCCCCTGCCTGGCTGCCTGGCTCCTTCCCT CAGCCTCCAAGACATGCTCAGTTTTCTTCCCTCCTAAAACACCACCCACTGTCTCATTTCCATTCATTTC 
TTTCTTTCTTTCTTTCTTTTTTTTTGAGAGGGAGCCTCACTCTGTCACCCAGGCTGAAGTGCAGTGGCAT GATCTCCACTCACTGCAACCTCCGCCTCCCAGGTTCAAGCAATTCTCCTGCCTCAGCCTCCTGAGTAGCT GCGATTACAGGCGCCTGCCACGATGCCCGGCTAACTTTTGTATTTTTAGTAGAGACGGGGTTTCGCCATG TTGGCCAGGCTGGTCTCGAGCTCCTGACCTCAGGCAATCTGCCTGCCTCAGCTTCCCAAAGTGCTGGGAT TACAGGTGTGAGCCACCGCGCCCACCCATTCATTTCTCAGTCCTTTGAATCTACTTGCCCCTCCATCCCG CCATGCCACCTACCCTAACAACCTTCCCCCTTAAACCTGCGGGTTTGGCCGGGCGCAGTACACTGAGTCA GTACTGGTACTGACCCAGGTACCCCTCCAGCCTCAGCTCCAGTCAGATGGGACAGCCTGCTGGTCCCTGG CTGCTTCTGCCCCCTCTTCTGGAGCCCCAGCCCTGGAGGCTCCATGTGGCTCAGCAGAACTTCTTCTCCT CCTGCTCTGTGGTGGCCTCTTGAGGGCAGCACTCACCTTGGAAAGCATGGAGTGTTTCAACCCTCACTGC TCCCTGAAGGACCAAGGTGTCCCATTTTACAGTCGGGGGAGGAGGCACTGTGATAAAGGGGCTCTTCAGA CCCACGTCTGAGAGAGCCAGGCTGCCCTGCCCCCGCGGCCTTCCACCCTTCACCGTCCAGCCAGGGCCAC TGCCATCACCGCCTGCTGGTCCTCACAGGCGTCGGGGCCCCAGGCAGTGAGAAGGCGGCTGCTGACTCCT CTTTCCTCCCCAGCTGCCAGGGACCTGGTGAGCAAAGAGGGCTTCCGGCGGGCCCGGCACGTGGTGGGGG AGATTCGGCGCACGGCCCAGGCAGCGGCCGCCCTGAGACGTGGCGACTACAGAGCCTTTGGCCGCCTCAT GGTGGAGAGCCACCGCTCACTCAGGTGAGGCCCTCTGGGCGCCCCGCTCCTGCCGGGCACAGGCCGGCCC AGGCCCACCCCTTCAATATCCTCTCTGCAGAGACGACTATGAGGTGAGCTGCCCAGAGCTGGACCAGCTG GTGGAGGCTGCGCTTGCTGTGCCTGGGGTTTATGGCAGCCGCATGACGGGCGGTGGCTTCGGTGGCTGCA CGGTGACACTGCTGGAGGCCTCCGCTGCTCCCCACGCCATGCGGCACATCCAGGTGGGCGGGCACCAGGG CCTGGGCGGGCAGGAGCGGCAGCTTCCCGGGGCCCTGCCACTCACCCCCAGCCCGCCTCTTACAGGAGCA CTACGGCGGGACTGCCACCTTCTACCTCTCTCAAGCAGCCGATGGAGCCAAGGTGCTGTGCTTGTGAGGC ACCCCCAGGACAGCACACGGTGAGGGTGCGGGGCCTGCAGGCCAGTCCCACGGCTCTGTGCCCGGTGCCA TCTTCCATATCCGGGTGCTCAATAAACTTGTGCCTCCAATGTGGTACCTGCCTCCTCTAGAGGTGGGTGT ATGCTTGGGTGTCAGAGAA
The GALK1-exon.txt:
1	221
1107	1296
1754	1873
2044	2179
2308	2489
6594	6744
6821	6983
7066	7299
The RNA codon table:
RNA_codon_table = {
    #                  Second Base
    #     U            C            A               G
    # U
    'UUU': 'F', 'UCU': 'S', 'UAU': 'Y',    'UGU': 'C',     # UxU
    'UUC': 'F', 'UCC': 'S', 'UAC': 'Y',    'UGC': 'C',     # UxC
    'UUA': 'L', 'UCA': 'S', 'UAA': 'Stop', 'UGA': 'Stop',  # UxA
    'UUG': 'L', 'UCG': 'S', 'UAG': 'Stop', 'UGG': 'W',     # UxG
    # C
    'CUU': 'L', 'CCU': 'P', 'CAU': 'H',    'CGU': 'R',     # CxU
    'CUC': 'L', 'CCC': 'P', 'CAC': 'H',    'CGC': 'R',     # CxC
    'CUA': 'L', 'CCA': 'P', 'CAA': 'Q',    'CGA': 'R',     # CxA
    'CUG': 'L', 'CCG': 'P', 'CAG': 'Q',    'CGG': 'R',     # CxG
    # A
    'AUU': 'I', 'ACU': 'T', 'AAU': 'N',    'AGU': 'S',     # AxU
    'AUC': 'I', 'ACC': 'T', 'AAC': 'N',    'AGC': 'S',     # AxC
    'AUA': 'I', 'ACA': 'T', 'AAA': 'K',    'AGA': 'R',     # AxA
    'AUG': 'M', 'ACG': 'T', 'AAG': 'K',    'AGG': 'R',     # AxG
    # G
    'GUU': 'V', 'GCU': 'A', 'GAU': 'D',    'GGU': 'G',     # GxU
    'GUC': 'V', 'GCC': 'A', 'GAC': 'D',    'GGC': 'G',     # GxC
    'GUA': 'V', 'GCA': 'A', 'GAA': 'E',    'GGA': 'G',     # GxA
    'GUG': 'V', 'GCG': 'A', 'GAG': 'E',    'GGG': 'G'      # GxG
}
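The question text shown here does not spell out what the codon table is for. Assuming it is meant for translating the spliced mRNA into a protein, a rough sketch could look like the following. It rebuilds the same standard table compactly so the block is self-contained, and it assumes translation starts at the first `AUG` and ends at the first in-frame stop codon. The names `CODON_TABLE` and `translate` are hypothetical.

```python
from itertools import product

BASES = "UCAG"
# Amino acids in the order UUU, UUC, UUA, UUG, UCU, ... ; '*' marks stop codons.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): ("Stop" if aa == "*" else aa)
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna):
    """Translate from the first AUG up to (not including) the first stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon found
    protein = []
    # Step through complete codons only; a trailing partial codon is ignored.
    for i in range(start, len(mrna) - 2, 3):
        residue = CODON_TABLE[mrna[i:i + 3]]
        if residue == "Stop":
            break
        protein.append(residue)
    return "".join(protein)


# Short toy mRNA with two leading bases before the start codon.
print(translate("GGAUGGCUGCUUUGAGAUAAGG"))  # MAALR
```

Searching for the start codon matters because the first exon may begin with untranslated sequence, so translating from position 0 would generally use the wrong reading frame.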