
Question



Write efficient C++ code (without using a hash table, map, etc.) that does the following:

Find all the k-mers in a given string and print each k-mer with the number of times it appears. (The code should work with a string of any size and any value of k. Assume that the string contains only the letters A, C, G, and T.)

Suppose the length of the k-mer is k = 3 and the given string is ACTACT. The k-mers are: ACT, CTA, TAC, ACT.

The output should be: ACT 2, CTA 1, TAC 1

I am able to print out all the k-mers, but I'm not sure how to count them and print only the unique ones with the number of times each is repeated. The hint my teacher gave is the following:


To implement your k-mer counter, consider the following observation. Let A = 0, C = 1, G = 2, and T = 3 (i.e., think of DNA letters as digits). We can represent each k-mer as a number in the base-4 system, which can then be converted to a regular base-10 index. For example, the 3-mer CGA can be represented as $120_4$, which is 24 in the decimal system (i.e., $24_{10}$). We can use this simple mechanism to assign an index to each k-mer and use an array to store the counts of the different k-mers. What should be the size of such a count array? Notice that for a given k there are $4^k$ possible valid k-mers. As long as k is small (and this is the case for this assignment), we can easily store the counts of all k-mers in main memory.
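One way the hint might be turned into code is sketched below. It is not the teacher's reference solution. The function names (`baseCode`, `decodeKmer`, `countKmers`, `printKmerCounts`) are made up for illustration, the cap of k <= 12 is an arbitrary assumption to keep the count array small, and restarting the window at any character other than A, C, G, T is also an assumption. The index is updated in a rolling fashion: multiply by 4, add the new digit, and take the result modulo $4^k$ to drop the oldest letter.

```cpp
#include <cstddef>
#include <iostream>
#include <string>
#include <utility>
#include <vector>

// Map a DNA letter to its base-4 digit (A=0, C=1, G=2, T=3); -1 for anything else.
int baseCode(char c) {
    switch (c) {
        case 'A': case 'a': return 0;
        case 'C': case 'c': return 1;
        case 'G': case 'g': return 2;
        case 'T': case 't': return 3;
        default: return -1;
    }
}

// Convert an array index back into the k-letter string it encodes.
std::string decodeKmer(std::size_t index, int k) {
    static const char letters[] = "ACGT";
    std::string kmer(static_cast<std::size_t>(k), 'A');
    for (int i = k - 1; i >= 0; --i) {
        kmer[static_cast<std::size_t>(i)] = letters[index % 4];
        index /= 4;
    }
    return kmer;
}

// Count every k-mer of s in an array of size 4^k and return the k-mers that
// occur at least once, ordered by index. The k <= 12 limit is an assumed cap.
std::vector<std::pair<std::string, int>> countKmers(const std::string& s, int k) {
    std::vector<std::pair<std::string, int>> result;
    if (k <= 0 || k > 12 || s.size() < static_cast<std::size_t>(k)) return result;

    std::size_t tableSize = 1;
    for (int i = 0; i < k; ++i) tableSize *= 4;   // 4^k possible k-mers
    std::vector<int> counts(tableSize, 0);

    std::size_t index = 0;  // base-4 value of the current window
    int valid = 0;          // consecutive valid letters seen so far
    for (char c : s) {
        int code = baseCode(c);
        if (code < 0) {     // assumption: an invalid letter restarts the window
            index = 0;
            valid = 0;
            continue;
        }
        // Shift in the new digit; the modulo discards the oldest digit.
        index = (index * 4 + static_cast<std::size_t>(code)) % tableSize;
        if (++valid >= k) ++counts[index];
    }

    for (std::size_t i = 0; i < tableSize; ++i) {
        if (counts[i] > 0) result.emplace_back(decodeKmer(i, k), counts[i]);
    }
    return result;
}

// Print each distinct k-mer followed by its count, one per line.
void printKmerCounts(const std::string& s, int k, std::ostream& out) {
    for (const auto& entry : countKmers(s, k)) {
        out << entry.first << ' ' << entry.second << '\n';
    }
}
```

Calling `printKmerCounts("ACTACT", 3, std::cout)` from `main` should print ACT 2, CTA 1, and TAC 1 on separate lines. The scan over the string takes time proportional to its length, and the final pass over the count array takes time proportional to $4^k$, which is why this approach only suits small k.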
