Question
Suppose a given string sequence AGCGTNCGTNGTACGTNACGTNAACGTN the string consists of only one line and all uppercase alphabets A,G,T,C,N Here A,G,T,C are considered valid characters and
Suppose a given string sequence "AGCGTNCGTNGTACGTNACGTNAACGTN"
the string consists of only one line and all uppercase alphabets A,G,T,C,N
Here A,G,T,C are considered valid characters and N is like a break point.
I have to make k-mer sequences. (k >= 2 && k <=10).
suppose the sequence "ACGNACGCTN"
suppose k = 3
I have to output the unique kmer formed in the sequence and the number of times it appears in the sequence.
like
ACG 2
CGC 1
GCT 1
If you encounter an N in the sequence, you simply check for the kmer from position (N+1) and so on.
I have to implement this using arrays or vectors.
The string sequence file can be over 100mb in size so please keep in check with algorithm runtimes (like dont use any nested for loops)
No hashmaps or any use of dictionaries like maps.
Step by Step Solution
There are 3 Steps involved in it
Step: 1
Get Instant Access to Expert-Tailored Solutions
See step-by-step solutions with expert insights and AI powered tools for academic success
Step: 2
Step: 3
Ace Your Homework with AI
Get the answers you need in no time with our AI-driven, step-by-step assistance
Get Started