
Question


Code file for M269 20J TMA01 Question 3. Student version 4: 24/03/20

"""
Code file for M269 20J TMA01 Question 3.
Student version 4: 24/03/20
"""

"""
The following table determines which amino acid is produced by a
particular 3-base DNA codon. It uses a Python structure called a
dictionary, which we will meet later in the module, so you do not
need to understand at this stage how it works.
Do NOT change the table.
"""
dnaCode = {
    'TTT':'Phe', 'TTC':'Phe','TTA':'Phe','TTG':'Phe',
    'TCT':'Ser', 'TCC':'Ser','TCA':'Ser','TCG':'Ser',
    'TAT':'Tyr', 'TAC':'Tyr','TAA':'Stop','TAG':'Stop',
    'TGT':'Cys', 'TGC':'Cys','TGA':'Stop','TGG':'Trp',
    'CTT':'Leu', 'CTC':'Leu','CTA':'Leu','CTG':'Leu',
    'CCT':'Pro', 'CCC':'Pro','CCA':'Pro','CCG':'Pro',
    'CAT':'His', 'CAC':'His','CAA':'Gln','CAG':'Gln',
    'CGT':'Arg', 'CGC':'Arg','CGA':'Arg','CGG':'Arg',
    'ATT':'Ile', 'ATC':'Ile','ATA':'Ile','ATG':'Met',
    'ACT':'Thr', 'ACC':'Thr','ACA':'Thr','ACG':'Thr',
    'AAT':'Asn', 'AAC':'Asn','AAA':'Lys','AAG':'Lys',
    'AGT':'Ser', 'AGC':'Ser','AGA':'Arg','AGG':'Arg',
    'GTT':'Val', 'GTC':'Val','GTA':'Val','GTG':'Val',
    'GCT':'Ala', 'GCC':'Ala','GCA':'Ala','GCG':'Ala',
    'GAT':'Asp', 'GAC':'Asp','GAA':'Glu','GAG':'Glu',
    'GGT':'Gly', 'GGC':'Gly','GGA':'Gly','GGG':'Gly'
}

# Question 3(b)
# -------------

def percentBases(dnaStrand):
    """
    Return a 4-tuple with the percentage of each base C, G, A and T
    in a DNA strand. You can assume dnaStrand is a string with only
    those four characters.

    Use the python round() function to round percentage answers to 2 d.p.
    e.g. this would round a raw percentage of 33.3333.... to 33.33 exactly.
        rawPerCent = 100/3
        percentC = round(rawPerCent, 2)
    The return statement has been done for you.
    """
    pass # replace this by your code and then uncomment the next line
    # return (percentC, percentG, percentA, percentT)

# Question 3(e)
# -------------

def aminoAcid(dnaCodon):
    """
    Return the abbreviated amino acid name corresponding to a 3-base
    DNA codon. For example, if dnaCodon is 'GTT', the function will
    return 'Val'.
    Do NOT change this function.
    Call it from function 'translateGene'.
    """
    return(dnaCode[dnaCodon])

def translateGene(dnaStrand, start, stop):
    """
    Return the protein (sequence of amino acids) obtained by translating
    the gene from the start to the stop indices in dnaStrand.
    Note that the start codon generates a corresponding amino acid
    (as well as indicating the start of the gene) but the stop codon
    is just a marker and does not generate anything.
    """
    protein = []
    pass # complete this function by replacing this with your code
    return protein

def findCodon(dnaStrand, startPos, dnaCodon):
    """
    Return the position in dnaStrand of the dnaCodon consisting of
    3 bases, searching from a given start position (a string index).
    Return -1 if the codon is not found.
    Do NOT change this function.
    It's used to test Question 3(e) and to implement Question 3(f)
    """
    if startPos >= 0: # check for invalid start positions
        for i in range(startPos, len(dnaStrand)-2):
            if dnaStrand[i:i+3] == dnaCodon: # compare a 3-base DNA slice to the codon
                return i
    return -1

# Question 3(f)
# -------------

def translateStrand(dnaStrand):
    """
    Translate all genes in a DNA strand, returning a list of proteins,
    each protein being a list of amino acid names for one gene.
    You may wish to use the provided function 'findCodon' and the
    function 'translateGene' that you wrote for Question 3(e).
    """
    proteinList = []
    pass # complete this function by replacing this with your code
    return proteinList

DNA molecules

The DNA molecule is made up of two linked strands which together form a helix. The links between the strands are provided by bonds between pairs of chemicals called bases. There are four such bases, named cytosine, guanine, adenine and thymine, which we will refer to by their initial letters C, G, A and T, respectively. Each base bonds to a base in the opposite strand, where C and G bond with each other, and A and T also bond with each other. No other bonds between bases occur. See https://www.yourgenome.org/facts/what-is-dna for more detail, if interested.
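The pairing rule can be illustrated with a short sketch. It is not part of the assignment: the names `PAIRS` and `complementStrand` are hypothetical, and the sketch pairs bases position by position while ignoring strand direction.

```python
# Pairing rule from the text: C bonds with G, and A bonds with T.
# PAIRS and complementStrand are illustrative names, not part of the code file.
PAIRS = {'C': 'G', 'G': 'C', 'A': 'T', 'T': 'A'}

def complementStrand(dnaStrand):
    """Return the bases that would bond opposite each base of dnaStrand."""
    # Look up the partner of every base, in order, and join them into a string.
    return ''.join(PAIRS[base] for base in dnaStrand)

print(complementStrand('CGGTACAATCGATTTAGAG'))
```

Applying the function twice returns the original strand, because each base has exactly one partner.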

  • a. Given a single strand of DNA represented by a string of characters, each corresponding to one of the four bases, e.g. 'CGGTACAATCGATTTAGAG', write an initial insight for how you would calculate the percentage of each of the bases (C, G, A and T) in a DNA strand. For example, cytosine makes up 3/19 ≈ 15.79% of the strand above. These percentages can be of interest to biologists investigating the DNA of organisms. (4 marks)
  • b. Write a Python program to implement your insight from part (a), by completing the function percentBases in file TMA03_Q1.py. You can assume that only characters corresponding to the standard DNA bases (C, G, A and T) will occur in the input strings. Round the results to two decimal places, following the example in the code file. (4 marks)
  • c. What is the complexity of your function percentBases from part (b), in terms of T(n) and Big-O notation, where n is the number of bases in the strand? Take the assignment statement as the basic unit of computation. Ignore the rounding operations for the purposes of this analysis. (4 marks)
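One possible way to approach parts (a) and (b) is a single pass over the strand that keeps one counter per base. The sketch below is only one of several reasonable implementations. Returning zeros for an empty strand is an assumption, since the question does not say what should happen in that case.

```python
def percentBases(dnaStrand):
    """Return (percentC, percentG, percentA, percentT), each rounded to 2 d.p."""
    n = len(dnaStrand)
    if n == 0:
        # Assumption: an empty strand yields zeros instead of dividing by zero.
        return (0.0, 0.0, 0.0, 0.0)
    countC = countG = countA = countT = 0
    for base in dnaStrand:
        # Exactly one counter is incremented per base.
        if base == 'C':
            countC += 1
        elif base == 'G':
            countG += 1
        elif base == 'A':
            countA += 1
        else:  # only C, G, A and T occur, so this must be 'T'
            countT += 1
    percentC = round(100 * countC / n, 2)
    percentG = round(100 * countG / n, 2)
    percentA = round(100 * countA / n, 2)
    percentT = round(100 * countT / n, 2)
    return (percentC, percentG, percentA, percentT)

print(percentBases('CGGTACAATCGATTTAGAG'))
```

In this sketch the loop performs one counter assignment per base, and the remaining assignments are a fixed number that does not depend on n, so T(n) is n plus a constant, which is O(n). A different implementation, such as four separate counting passes, would give a different T(n) but the same Big-O class.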

The Genetic Code

The pattern of bases in DNA molecules forms a code (the Genetic Code) for making the proteins that are the basis of life on Earth. A simplified version of how this works is that groups of three bases (known as a codon) in the genes that form part of the DNA sequence each specify the production of one amino acid, and all the amino acids specified by a gene link together to form a protein molecule. There are 20 amino acids that normally occur in living organisms, and so the Genetic Code is said to have redundancy - there are many more possible codon patterns than there are amino acids, so typically several different codons will correspond to the same amino acid. Amino acids have names like Leucine, Tyrosine, etc., and we will use the standard abbreviations for these, such as 'Leu', 'Tyr', etc.
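To make the idea of codons and redundancy concrete, the following sketch splits a strand into groups of three bases and counts the possible codon patterns. The helper name `splitCodons` is hypothetical and is not part of the code file. Dropping an incomplete group at the end of the strand is an assumption.

```python
def splitCodons(dnaStrand):
    """Split a strand into consecutive 3-base codons.

    Assumption: any incomplete group of 1 or 2 bases at the end is dropped.
    """
    return [dnaStrand[i:i+3] for i in range(0, len(dnaStrand) - 2, 3)]

print(splitCodons('ATGCTTTAG'))   # ['ATG', 'CTT', 'TAG']

# Each of the 3 positions in a codon can hold any of 4 bases.
possibleCodons = 4 ** 3
print(possibleCodons)             # 64
```

With 64 possible codons and only 20 amino acids (plus stop markers), several codons must map to the same amino acid, which is the redundancy described above.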

The start of a gene in a DNA strand is indicated by a special start codon, 'ATG', and the end of the gene is marked by a subsequent stop codon, for which there are several possible patterns, but our examples will only use 'TAG' for stop codons. Start codons generate an amino acid for the protein as well as marking the start of the gene. Stop codons do not add an amino acid to the protein and act only as a marker.
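As a rough illustration of how the start and stop markers delimit a gene, the sketch below locates the next gene in a strand. The function `locateGene` and its parameter `searchFrom` are hypothetical names. The sketch also assumes that the stop codon must lie a whole number of codons after the start codon, which the text does not state explicitly.

```python
START_CODON = 'ATG'
STOP_CODON = 'TAG'   # the examples in the text use only 'TAG'

def locateGene(dnaStrand, searchFrom=0):
    """Return (start, stop) indices of the next gene, or None if there is none."""
    start = dnaStrand.find(START_CODON, searchFrom)
    if start == -1:
        return None
    # Assumption: step through whole codons after the start codon,
    # so that the stop codon found is aligned with the start codon.
    for i in range(start + 3, len(dnaStrand) - 2, 3):
        if dnaStrand[i:i+3] == STOP_CODON:
            return (start, i)
    return None

print(locateGene('GGGATGCTTTAG'))   # (3, 9)
```

In the strand 'GGGATGCTTTAG' the leading 'GGG' is skipped, the start codon begins at index 3 and the stop codon begins at index 9.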

Actual human genes typically have 27 000 bases in each strand, and in some cases many more, but we will use examples with far fewer bases to demonstrate the principles. A DNA strand will typically contain many genes. A strand may have codons before the first start codon, in between its genes, and after the last stop codon, which do not give rise to any amino acids - these are called non-coding sequences.

  • d. Given a single strand of DNA, represented by a string of characters, plus the locations of a start codon and the first subsequent stop codon, write an initial insight to translate the DNA codons of that gene into the sequence of amino acids in the corresponding protein. For example, for the DNA strand 'GGGATGCTTTAG', with a start codon beginning at location 3 and a stop codon beginning at location 9, the algorithm would produce a sequence with the two amino acids corresponding to codons 'ATG' and 'CTT'. Assume there is a table where the algorithm can look up the amino acid for a given codon. Remember that stop codons are just a marker and do not generate an amino acid. (4 marks)
  • e. Complete the Python function translateGene in file TMA03_Q1.py to implement your insight from part (d). We have provided you with an easy way to use the Genetic Code in the form of a Python function called aminoAcid, that gives the abbreviated name of the amino acid corresponding to any given codon. To help in testing this function, we have provided you with a complete function called findCodon which can search a DNA strand for the location of a codon. See how it's used in the provided test file. (3 marks)
  • f. Using your function from part (e), complete the function translateStrand, so that it can process a DNA strand containing any number of genes (marked by start and stop codons) to produce the sequence of amino acids corresponding to each gene. In implementing this function, you may also wish to make use of the function findCodon mentioned in part (e). (6 marks)
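Parts (e) and (f) could be approached along the following lines. This is a minimal sketch under stated assumptions, not a model answer. To stay self-contained it uses a small hypothetical table `dnaCodeSubset` in place of the full `dnaCode` dictionary, and it repeats `findCodon` from the code file. It also assumes that a stop codon only counts if it is a whole number of codons after the start codon, and that a start codon with no matching stop codon is ignored.

```python
# Hypothetical subset of the dnaCode table, enough for the examples below.
dnaCodeSubset = {'ATG': 'Met', 'CTT': 'Leu', 'GGG': 'Gly', 'AAA': 'Lys'}

def aminoAcid(dnaCodon):
    """Stand-in for the provided aminoAcid, using the subset table."""
    return dnaCodeSubset[dnaCodon]

def findCodon(dnaStrand, startPos, dnaCodon):
    """Same logic as the provided findCodon: index of the codon, or -1."""
    if startPos >= 0:
        for i in range(startPos, len(dnaStrand) - 2):
            if dnaStrand[i:i+3] == dnaCodon:
                return i
    return -1

def translateGene(dnaStrand, start, stop):
    """Translate codons from index start up to, but not including, stop."""
    protein = []
    for i in range(start, stop, 3):
        protein.append(aminoAcid(dnaStrand[i:i+3]))
    return protein

def translateStrand(dnaStrand):
    """Return a list of proteins, one list of amino acid names per gene."""
    proteinList = []
    start = findCodon(dnaStrand, 0, 'ATG')
    while start != -1:
        stop = findCodon(dnaStrand, start + 3, 'TAG')
        # Assumption: skip any 'TAG' that is not aligned with the start codon.
        while stop != -1 and (stop - start) % 3 != 0:
            stop = findCodon(dnaStrand, stop + 1, 'TAG')
        if stop == -1:
            break  # assumption: a gene without a stop codon is ignored
        proteinList.append(translateGene(dnaStrand, start, stop))
        # Continue searching after the stop codon of the gene just translated.
        start = findCodon(dnaStrand, stop + 3, 'ATG')
    return proteinList

print(translateGene('GGGATGCTTTAG', 3, 9))   # ['Met', 'Leu']
print(translateStrand('GGGATGCTTTAG' + 'AAA' + 'ATGGGGAAATAG' + 'CC'))
```

The second call translates a strand with two genes separated by the non-coding codon 'AAA' and followed by two trailing bases, giving one list of amino acids per gene.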