Question

1 Approved Answer

Posted on Sep 10, 2024


In Jupyter Notebook


In [ ]:
    my_dna = "GGGGCCCCAAAAAAAATTTATATAT"
    replacement1 = my_dna.replace('A', 't')
    print(replacement1)
    replacement2 = replacement1.replace('T', 'a')
    print(replacement2)
    replacement3 = replacement2.replace('C', 'g')
    print(replacement3)
    replacement4 = replacement3.replace('G', 'c')
    print(replacement4)
    Complementary_strand = replacement4.upper()
    print(Complementary_strand)

Exercise

Use at least two additional methods to find the reverse-complementary strand of GGGGCCCCAAAAAAAATTTATATAT:
1. functions: "".join(reversed())
2. indexing: [::-1]

In [ ]:
    # reverse-complementary
    # print
    # RC strand (RC)
    # print(RC)

Calculate the GC-content

GC content is usually calculated as a percentage value and is sometimes called the G+C ratio or GC-ratio. GC-content percentage is calculated as Count(G + C) / Count(A + T + G + C) * 100%.

In [ ]:
    my_dna = "ACTGATCGATTACGTATAGTATTTGCTATCATACATATATATCGATGCGTTCAT"
    length = len(my_dna)
    C_count = my_dna.count('C')
    G_count = my_dna.count('G')
    print("length: " + str(length))
    print("C count: " + str(C_count))
    print("G count: " + str(G_count))
    print("GC content: " + str((G_count + C_count) / length * 100) + "%")

FASTA files to record the DNA sequences (NBPF1.fasta)

    >hg38_knownGene_uc001ayw.5 range=chr1:16563943-16592021 5'pad=0 3'pad=0 strand=- repeatMasking=none
    ATGGTGGTATCAGCTGGCCCTTGGTCCAGCGAGAAGGCAGAGACGAACAT
    TTTAGAAATCAACGAGAAATTGCGCCCCCAGCTGGCAGAGAACAAACAGC
    AGTTCAGAAACCTCAAAGAGAAATGTTTTGTAACTCAACTGGCCGGCTTC
    CTGGCCAACCGACAGAAGAAATACAgtaagatctataggctcaccatcat
    gaaagtgatgaatgatgtcctgtcttctctctgagacactaaatgctctc
    tccatcaaaaataattcatccttcctgtacttctaggaaaacagaaatg
    ggtatttaacatttgttaaagttggaagacagaggtaccaaagtattt
    agcaactttccatgtttgcaatcaggtgggggtgggactagagttaaact

Read FASTA files

In [ ]:
    dna_file = open("NBPF1.fasta")
    my_dna = dna_file.read()
    print(my_dna[0:10])

Exercise

Extract multiple exons from the FASTA file (NBPF1.fasta)
Exon1:
Exon2:
Exon3:

In [ ]:

Join all the exons into a single sequence

In [ ]:
    joined_exons = "".join(c for c in my_dna if c.isupper())
    print(joined_exons[1:300])

Transcribe DNA Genes into mRNA

[Figure: DNA (GTGCATCTGACTCCTGAGGAGAAG / CACGTAGACTGAGGACTCCTCTTC) -> (transcription) -> RNA (GUGCAUCUGACUCCUGAGGAGAAG) -> (translation) -> protein]

Method 1: rely on a loop, replace every "T" with "U"
Method 2: use the replace() function
Method 3: define a transcribe function, so you can transcribe the DNA easily

Exercise: using the loop, replace every "T" with "U"

In [ ]:
    rna = joined_exons
    for i in joined_exons:
        # Replace all occurrences of T with U
    # Print the RNA string
    print("RNA: ", rna)

In [ ]:
    # Method 2: use replace() function
    rna = joined_exons.replace('T', 'U')
    print('RNA:', rna)

In [ ]:
    # Method 3: define a transcribe function, so you can transcribe the DNA easily
    def transcribe(sequence):
        rna_seq = sequence.replace('T', 'U')
        return rna_seq

    rna = transcribe(joined_exons)
    rna

Translating DNA Genes into Proteins

Use dictionaries: codon -> amino acid

Create a dictionary to contain the genetic code: codon to amino acid ("_" marks a stop codon)

In [ ]:
    codon_table = {
        "AAA": "K", "AAC": "N", "AAG": "K", "AAU": "N",
        "ACA": "T", "ACC": "T", "ACG": "T", "ACU": "T",
        "AGA": "R", "AGC": "S", "AGG": "R", "AGU": "S",
        "AUA": "I", "AUC": "I", "AUG": "M", "AUU": "I",
        "CAA": "Q", "CAC": "H", "CAG": "Q", "CAU": "H",
        "CCA": "P", "CCC": "P", "CCG": "P", "CCU": "P",
        "CGA": "R", "CGC": "R", "CGG": "R", "CGU": "R",
        "CUA": "L", "CUC": "L", "CUG": "L", "CUU": "L",
        "GAA": "E", "GAC": "D", "GAG": "E", "GAU": "D",
        "GCA": "A", "GCC": "A", "GCG": "A", "GCU": "A",
        "GGA": "G", "GGC": "G", "GGG": "G", "GGU": "G",
        "GUA": "V", "GUC": "V", "GUG": "V", "GUU": "V",
        "UAA": "_", "UAC": "Y", "UAG": "_", "UAU": "Y",
        "UCA": "S", "UCC": "S", "UCG": "S", "UCU": "S",
        "UGA": "_", "UGC": "C", "UGG": "W", "UGU": "C",
        "UUA": "L", "UUC": "F", "UUG": "L", "UUU": "F",
    }

In [ ]:
    # print the first codon and find out which amino acid it represents
    codon1 = rna[0:3]
    codon2 = rna[3:6]
    codon3 = rna[6:9]
    print(codon1)
    print(codon2)
    print(codon3)
    print(rna[0:9])

In [ ]:
    # get the amino acid
    aa = codon_table.get(codon1)

In [ ]:
    # write a loop to translate every codon to amino acid
    protein_seq = ""
    for n in range(0, len(rna), 3):
        protein_seq += codon_table[rna[n:n+3]]
    protein_seq

Exercise: Convert the above to a translate_rna function

In [ ]:
    def translate_rna(sequence):

        return protein_seq

    print(rna)
    # call the function
    protein = translate_rna(rna)
    protein