Question
Problem 2 (25 points): Open reading frame finder.
An Open Reading Frame (ORF) is a continuous stretch of codons (nucleotide triplets) that contains a start codon (i.e., ATG) at the beginning and a stop codon (i.e., TAA, TAG or TGA) at the end only, i.e., with no stop codon in the middle (https://en.wikipedia.org/wiki/Open_reading_frame). Note that there are three different ways (frames) in which you can split a DNA sequence into triplets, each shifted by one nucleotide from the previous one.
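To make the three frames concrete, here is a minimal sketch that splits a short sequence into codons for each frame. The variable names (seq, frames, usable) are illustrative and not part of the assignment, and dropping a trailing incomplete codon is an assumption.

```matlab
% Split a sequence into codons for each of the three reading frames.
seq = 'AATGTATGA';
frames = cell(1, 3);
for frame = 1:3
    % number of bases that form complete codons from this offset
    % (assumption: a trailing incomplete codon is simply dropped)
    usable = floor((length(seq) - frame + 1) / 3) * 3;
    % each column of the 3-by-n matrix is one codon; transpose to get one per row
    codons = cellstr(reshape(seq(frame:frame + usable - 1), 3, [])');
    frames{frame} = strjoin(codons', ' ');
end
disp(frames');
```

For this input the frames come out as 'AAT GTA TGA', 'ATG TAT' and 'TGT ATG', so the same ATG can only start an ORF in the frame where it lines up as a codon.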
Problem 2.1 (5 points): Reverse complementary strand.
Write a MATLAB function getReverseComp that takes a string as a DNA sequence and returns its reverse complementary strand. Save as file getReverseComp.m.
Test it on the command line by typing in: getReverseComp('ACGTGCA') or run the corresponding cell in hw1q2script.m.
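One possible way to write getReverseComp is sketched below, using logical indexing to swap bases and fliplr to reverse. It assumes the input is a character row vector; converting to upper case and leaving characters other than A, C, G, T unchanged are assumptions, since the assignment does not say how to handle them.

```matlab
function rc = getReverseComp(dna)
% getReverseComp  Reverse complement of a DNA sequence (a sketch).
    dna = upper(dna);          % assumption: treat lower-case input as upper case
    comp = dna;                % assumption: non-ACGT characters are left unchanged
    comp(dna == 'A') = 'T';
    comp(dna == 'T') = 'A';
    comp(dna == 'C') = 'G';
    comp(dna == 'G') = 'C';
    rc = fliplr(comp);         % reverse the complemented row vector
end
```

With this sketch, getReverseComp('ACGTGCA') returns 'TGCACGT'.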
Problem 2.2 (2 points): Identify possible start codons.
Write a MATLAB function findStartCodon that takes a string as a DNA sequence, and returns the indices of all possible start codons. You may need the function strfind (type help strfind on the MATLAB command line for usage). Save as file findStartCodon.m.
Test it on the command line by typing in: findStartCodon('AATGTATGA') or run the corresponding cell in hw1q2script.m.
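A minimal sketch of findStartCodon, built on strfind as the problem suggests. It returns the index of the first base of every ATG in any frame; the upper-case conversion is an assumption.

```matlab
function idx = findStartCodon(dna)
% findStartCodon  Indices of the first base of every ATG (a sketch).
    % strfind returns the starting index of each occurrence, already in ascending order
    idx = strfind(upper(dna), 'ATG');   % assumption: case-insensitive matching
end
```

With this sketch, findStartCodon('AATGTATGA') returns [2 6].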
Problem 2.3 (3 points): Identify possible stop codons.
Write a MATLAB function findStopCodon that takes a string as a DNA sequence, and returns the indices of all possible stop codons. The indices should be sorted. Save as file findStopCodon.m.
Test it on the command line by typing in: findStopCodon('ATAAGTAGGA') or run the corresponding cell in hw1q2script.m.
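A minimal sketch of findStopCodon: search for each of the three stop codons with strfind, concatenate the results, and sort them. As before, the upper-case conversion is an assumption.

```matlab
function idx = findStopCodon(dna)
% findStopCodon  Sorted indices of the first base of every TAA, TAG or TGA (a sketch).
    dna = upper(dna);   % assumption: case-insensitive matching
    % each strfind call returns a row vector (or empty), so they concatenate cleanly
    idx = sort([strfind(dna, 'TAA'), strfind(dna, 'TAG'), strfind(dna, 'TGA')]);
end
```

With this sketch, findStopCodon('ATAAGTAGGA') returns [2 6] (TAA at 2, TAG at 6).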
Problem 2.4 (15 points): Identify the longest open reading frame (ORF).
Write a MATLAB function that takes as input a DNA sequence, and returns the longest ORF, which is described as two numbers (the start index of the start codon, and the END index of the stop codon). Save as file findLongestORF.m.
To test your function on the command line, type in: findLongestORF('GGAGGCGTAAAATGCGTACTGGTAATGCAAACTAATGG') or run the corresponding cell in hw1q2script.m.
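One way to approach findLongestORF is sketched below: for every start codon, take the nearest downstream stop codon in the same frame, and keep the longest such pair. The start and stop searches are inlined so the block is self-contained; in the assignment you would presumably call your own findStartCodon and findStopCodon instead. Returning [] when no ORF exists, keeping the first ORF on ties, and searching only the given strand are assumptions, not requirements stated in the problem.

```matlab
function orf = findLongestORF(dna)
% findLongestORF  [startIndex, endIndex] of the longest ORF on the given strand (a sketch).
    dna = upper(dna);
    starts = strfind(dna, 'ATG');
    stops = sort([strfind(dna, 'TAA'), strfind(dna, 'TAG'), strfind(dna, 'TGA')]);
    orf = [];        % assumption: return empty if no ORF is found
    bestLen = 0;
    for s = starts
        % stop codons downstream of s whose offset is a multiple of 3 (same frame)
        inFrame = stops(stops > s & mod(stops - s, 3) == 0);
        if isempty(inFrame)
            continue;                 % no in-frame stop: not a complete ORF
        end
        e = inFrame(1) + 2;           % stops is sorted, so this is the nearest stop; +2 = its last base
        if e - s + 1 > bestLen        % strict '>' keeps the first ORF on ties (assumption)
            bestLen = e - s + 1;
            orf = [s, e];
        end
    end
end
```

For the test sequence above, this sketch returns [12 35]: the ATG at 12 reaches the in-frame TAA at 33:35, while the ATGs at 25 and 35 have no in-frame stop codon.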
Test
To test the functions, use the attached hw1q2script.m script. It reads a sequence from the file sequence.fa, in FASTA format (one of the simplest and most popular sequence formats), tests each of the functions above, and outputs some statistics.
HERE IS THE GIVEN CODE. PLEASE COMPLETE IT USING MATLAB. THANK YOU!
% This script tests all functions required in HW1.q2
%%
getReverseComp('ACGTGCA')
%%
findStartCodon('AATGTATGA')
%%
findStopCodon('ATAAGTAGGA')
%%
findLongestORF('GGAGGCGTAAAATGCGTACTGGTAATGCAAACTAATGG')
% the correct ORF starts at index 12 (for ATG at 12:14) and ends at 35 (for
% TAA at 33:35).
%%
seq=fastaread('sequence.fa');
dna=seq.Sequence;
disp(['seq header: ', seq.Header])
str=sprintf('Base frequency on + strand: A %d C %d G %d T %d ', baseFreq(dna));
disp(str);
dna2=getReverseComp(dna);
str=sprintf('Base frequency on - strand: A %d C %d G %d T %d ', baseFreq(dna2));
disp(str);
%%
disp(sprintf('Number of possible start codons on + strand: %d ', length(findStartCodon(dna))))
disp(sprintf('Number of possible stop codons on + strand: %d ', length(findStopCodon(dna))))
disp(sprintf('Number of possible start codons on - strand: %d ', length(findStartCodon(dna2))))
disp(sprintf('Number of possible stop codons on - strand: %d ', length(findStopCodon(dna2))))
%%
orf_pos = findLongestORF(dna);
orf_neg = findLongestORF(dna2);
disp('Longest ORF on + strand:')
disp(orf_pos);
disp('Longest ORF on - strand:')
disp(orf_neg);
%% in-script function to calc the frequency of ACGT.
function freq = baseFreq(dna)
    bases = 'ACGT';
    freq = zeros(1, length(bases));   % preallocate one count per base
    for i = 1:length(bases)
        freq(i) = length(strfind(dna, bases(i)));
    end
    % converts to fraction
    %freq = freq / sum(freq);
end