Question
Write efficient code in C++ (without using a hash table, map, etc.) that does the following:
find all the k-mers in a given string and print each k-mer with the number of times it appears. (The code should work with a string of any size and any value of k. Assume that the string only contains the letters a, c, g, t.)
Suppose the length of the k-mer is k=3 and the given string is ACTACT. The k-mers are: ACT, CTA, TAC, ACT.
The output should be ACT 2, CTA 1, TAC 1
I am able to print out all the k-mers, but I'm not sure how to count them and print only the unique ones with the number of times each is repeated. The hint my teacher gave is the following:
To implement your k-mer counter, consider the following observation. Let A=0, C=1, G=2 and T=3 (i.e. think of DNA letters as digits). We can represent each k-mer as a number in the base-4 system, which can then be converted to a regular base-10 index. For example, the 3-mer CGA can be represented as 120 in base 4, which is 24 in the decimal system (1*4^2 + 2*4^1 + 0*4^0 = 24). We can use this simple mechanism to assign an index to each k-mer and use an array to store the counts of the different k-mers. What should be the size of such a count array? Notice that for a given k there are 4^k possible k-mers. As long as k is small (and this is the case for this assignment) we can easily store the counts of all k-mers in main memory.
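One way the hint could be turned into code is sketched below. This is only a minimal sketch under some assumptions, not the official solution: the names digitOf, decodeKmer, countKmers, printKmerCounts and the cap kMaxK are all made up for illustration, lower-case and upper-case letters are treated the same, and any character other than a/c/g/t simply restarts the current window. Instead of re-encoding each k-mer from scratch, it keeps a rolling base-4 index: shift in the new digit and mask off the digit that left the window.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <iostream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical cap on k (an assumption, not part of the assignment): the
// table has 4^k entries, so k = 12 already means about 16.8 million counters.
constexpr std::size_t kMaxK = 12;

// Map a DNA letter to its base-4 digit (A=0, C=1, G=2, T=3).
// Returns -1 for any other character.
int digitOf(char c) {
    switch (c) {
        case 'A': case 'a': return 0;
        case 'C': case 'c': return 1;
        case 'G': case 'g': return 2;
        case 'T': case 't': return 3;
        default: return -1;
    }
}

// Turn a base-4 index back into its k-letter string (upper case).
std::string decodeKmer(std::uint32_t index, std::size_t k) {
    static const char letters[] = "ACGT";
    std::string kmer(k, 'A');
    for (std::size_t i = k; i-- > 0;) {
        kmer[i] = letters[index & 3u];  // lowest base-4 digit is the last letter
        index >>= 2;
    }
    return kmer;
}

// Count every k-mer of s; counts[i] is the number of occurrences of the
// k-mer whose base-4 index is i.
std::vector<std::uint32_t> countKmers(const std::string& s, std::size_t k) {
    if (k == 0 || k > kMaxK) {
        throw std::invalid_argument("k must be between 1 and kMaxK");
    }
    const std::uint32_t tableSize = std::uint32_t{1} << (2 * k);  // 4^k
    const std::uint32_t mask = tableSize - 1;
    std::vector<std::uint32_t> counts(tableSize, 0);

    std::uint32_t index = 0;  // base-4 index of the current window
    std::size_t valid = 0;    // consecutive valid letters seen so far
    for (char c : s) {
        const int d = digitOf(c);
        if (d < 0) {          // unexpected character: restart the window
            index = 0;
            valid = 0;
            continue;
        }
        // Shift in the new digit and drop the digit that left the window.
        index = ((index << 2) | static_cast<std::uint32_t>(d)) & mask;
        if (++valid >= k) {
            assert(index < counts.size());
            ++counts[index];
        }
    }
    return counts;
}

// Print each k-mer that occurs at least once, in index (alphabetical) order.
void printKmerCounts(const std::string& s, std::size_t k, std::ostream& out) {
    const std::vector<std::uint32_t> counts = countKmers(s, k);
    for (std::size_t i = 0; i < counts.size(); ++i) {
        if (counts[i] > 0) {
            out << decodeKmer(static_cast<std::uint32_t>(i), k)
                << ' ' << counts[i] << '\n';
        }
    }
}
```

With this sketch, calling printKmerCounts("ACTACT", 3, std::cout) prints ACT 2, CTA 1 and TAC 1, one per line. Counting takes one pass over the string, and printing scans all 4^k table entries, so the approach only makes sense while k is small, as the hint says.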