Question
All code must be written in C++ (C++ was also used in CS1, CS2, and CS3). You should submit the source code and a document showing the output for the test cases.
Put all the individual files in one single folder and compress the folder. Upload the compressed folder. Files should include your first and last names.
1. Implement a function to compute the Hamming distance between two DNA sequences. The function takes two DNA sequences as inputs. The output should be the Hamming distance between the two input DNA sequences. [10 points]
Some test cases are:
i. Output = 2, when inputs are AAAAAA and ACAAAC
ii. Output = 1, when inputs are AAAAAA and ACAAAA
iii. Output = 6, when inputs are AAAAAA and TCGCTG

2. Implement a function to compute the TotalDistance(v, DNA) method in the context of the median string finding problem. [30 points]
The function requires two inputs:
v: a given k-mer.
DNA: a set of DNA sequences. You can use an array to store the DNA sequences.
The function should have two outputs:
The best match of the given k-mer (v) in each DNA sequence in DNA, along with its position (0-based, meaning 0 indicates the first position).
The distance.

Test Case 1: For the 10 DNA sequences in file Sequences.fa given in BB, and v = TAGATCCGAA, a sample output is:
Bestmatch    Pos   Hamming Distance
TAGATCTGAA   17    1
TGGATCCGAA   47    1
TAGACCCGAA   18    1
TAAATCCGAA   33    1
TAGGTCCAAA   21    2
TAGATTCGAA   0     1
CAGATCCGAA   46    1
TAGATCCGTA   70    1
TAGATCCAAA   16    1
TCGATCCGAA   65    1
Total distance = 11

Test Case 2: For the 10 DNA sequences in file Sequences.fa given in BB, and v = AAAAAA, a sample output is:
Bestmatch   Pos   Hamming Distance
AAAGTA      30    2
ACCAAA      34    2
GAAATA      25    2
AAACAA      17    1
CCAAAA      26    2
AATCGA      8     3
GAAATG      0     3
ACAAGA      54    2
ATCCAA      19    3
AAAACA      10    1
Total distance = 21

3. Implement the Brute Force Median String Search algorithm (no need to use a tree). Test the algorithm for the cases of 4-mer, 5-mer, and 6-mer with the given Sequences.fa file. [60 points]
Test Case 1: For sequences in Sequences.fa, the motif of length = 4 is GATC
Test Case 2: For sequences in Sequences.fa, the motif of length = 5 is CCGAA
Test Case 3: For sequences in Sequences.fa, the motif of length = 6 is TCCGAA
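Parts 1 and 2 fit naturally together, because TotalDistance is built directly on the Hamming distance. What follows is a minimal C++ sketch and not an official solution. The names hammingDistance, BestMatch, bestMatch, and totalDistance are hypothetical. Two choices are assumptions that the assignment does not state: ties between equally good windows go to the leftmost position, and sequences of unequal length are rejected with an exception.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Part 1: number of positions at which two equal-length sequences differ.
std::size_t hammingDistance(const std::string& a, const std::string& b) {
    if (a.size() != b.size()) {
        // Assumption: unequal lengths are treated as an error.
        throw std::invalid_argument("sequences must have equal length");
    }
    std::size_t d = 0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        if (a[i] != b[i]) ++d;
    }
    return d;
}

// Hypothetical record holding one row of the sample output table.
struct BestMatch {
    std::string kmer;      // best-matching window taken from the sequence
    std::size_t pos;       // 0-based start position of that window
    std::size_t distance;  // Hamming distance between v and the window
};

// Slide v along seq and keep the window with the smallest distance.
// Assumption: on ties the leftmost window wins (strict '<' below).
BestMatch bestMatch(const std::string& v, const std::string& seq) {
    assert(!v.empty() && seq.size() >= v.size());
    const std::size_t k = v.size();
    BestMatch best{seq.substr(0, k), 0, hammingDistance(v, seq.substr(0, k))};
    for (std::size_t i = 1; i + k <= seq.size(); ++i) {
        const std::string window = seq.substr(i, k);
        const std::size_t d = hammingDistance(v, window);
        if (d < best.distance) best = BestMatch{window, i, d};
    }
    return best;
}

// Part 2: sum of the per-sequence best distances.
// 'matches' receives one BestMatch per sequence, in input order.
std::size_t totalDistance(const std::string& v,
                          const std::vector<std::string>& dna,
                          std::vector<BestMatch>& matches) {
    matches.clear();
    std::size_t total = 0;
    for (const std::string& seq : dna) {
        matches.push_back(bestMatch(v, seq));
        total += matches.back().distance;
    }
    return total;
}
```

To print a table like the sample output, call totalDistance with v and the sequences, then loop over matches and print kmer, pos, and distance for each row, followed by the returned total. If the sample output was produced with a different tie-breaking rule, reported positions could differ while the distances stay the same.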
--------------------------------------------------------------------------
Sequences (contents of Sequences.fa)
>1
TAGTGGTCTTTTGAGTGTAGATCTGAAGGGAAAGTATTTCCACCAGTTCGGGGTCACCCAGCAGGGCAGGGTGACTTAAT
>2
CGCGACTCGGCGCTCACAGTTATCGCACGTTTAGACCAAAACGGAGTTGGATCCGAAACTGGAGTTTAATCGGAGTCCTT
>3
GTTACTTGTGAGCCTGGTTAGACCCGAAATATAATTGTTGGCTGCATAGCGGAGCTGACATACGAGTAGGGGAAATGCGT
>4
AACATCAGGCTTTGATTAAACAATTTAAGCACGTAAATCCGAATTGACCTGATGACAATACGGAACATGCCGGCTCCGGG
>5
ACCACCGGATAGGCTGCTTATTAGGTCCAAAAGGTAGTATCGTAATAATGGCTCAGCCATGTCAATGTGCGGCATTCCAC
>6
TAGATTCGAATCGATCGTGTTTCTCCCTCTGTGGGTTAACGAGGGGTCCGACCTTGCTCGCATGTGCCGAACTTGTACCC
>7
GAAATGGTTCGGTGCGATATCAGGCCGTTCTCTTAACTTGGCGGTGCAGATCCGAACGTCTCTGGAGGGGTCGTGCGCTA
>8
ATGTATACTAGACATTCTAACGCTCGCTTATTGGCGGAGACCATTTGCTCCACTACAAGAGGCTACTGTGTAGATCCGTA
>9
TTCTTACACCCTTCTTTAGATCCAAACCTGTTGGCGCCATCTTCTTTTCGAGTCCTTGTACCTCCATTTGCTCTGATGAC
>10
CTACCTATGTAAAACAACATCTACTAACGTAGTCCGGTCTTTCCTGATCTGCCCTAACCTACAGGTCGATCCGAAATTCG
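For Part 3, one possible approach is to read the sequences and then try every one of the 4^k candidate k-mers, keeping the one with the smallest total distance. The C++ sketch below illustrates that idea under stated assumptions and is not an official solution. The names readFasta, minDistance, totalDistance, and medianString are hypothetical. It assumes the usual FASTA layout (each '>' header on its own line), that every sequence is at least k characters long, and that ties between equally good k-mers go to the first one in lexicographic order over A, C, G, T.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <limits>
#include <sstream>
#include <string>
#include <vector>

// Read FASTA records: a line starting with '>' begins a new record,
// every other non-empty line is appended to the current sequence.
std::vector<std::string> readFasta(std::istream& in) {
    std::vector<std::string> seqs;
    std::string line;
    while (std::getline(in, line)) {
        if (!line.empty() && line.back() == '\r') line.pop_back();  // CRLF files
        if (line.empty()) continue;
        if (line[0] == '>') seqs.emplace_back();
        else if (!seqs.empty()) seqs.back() += line;
    }
    return seqs;
}

// Smallest Hamming distance between v and any |v|-length window of seq.
// Assumption: seq.size() >= v.size(), so at least one window exists.
std::size_t minDistance(const std::string& v, const std::string& seq) {
    std::size_t best = std::numeric_limits<std::size_t>::max();
    for (std::size_t i = 0; i + v.size() <= seq.size(); ++i) {
        std::size_t d = 0;
        for (std::size_t j = 0; j < v.size(); ++j) {
            if (v[j] != seq[i + j]) ++d;
        }
        if (d < best) best = d;
    }
    return best;
}

// Sum of the per-sequence minimum distances for candidate v.
std::size_t totalDistance(const std::string& v,
                          const std::vector<std::string>& dna) {
    std::size_t total = 0;
    for (const std::string& seq : dna) total += minDistance(v, seq);
    return total;
}

// Brute force: enumerate all 4^k k-mers in lexicographic order and keep
// the first one with the smallest total distance (no search tree).
std::string medianString(const std::vector<std::string>& dna, std::size_t k,
                         std::size_t* bestDistance = nullptr) {
    assert(k > 0);
    const std::string alphabet = "ACGT";
    std::size_t count = 1;
    for (std::size_t i = 0; i < k; ++i) count *= 4;

    std::string best;
    std::size_t bestD = std::numeric_limits<std::size_t>::max();
    std::string v(k, 'A');
    for (std::size_t code = 0; code < count; ++code) {
        // Decode 'code' as a base-4 number, most significant digit first.
        std::size_t c = code;
        for (std::size_t j = k; j-- > 0;) {
            v[j] = alphabet[c % 4];
            c /= 4;
        }
        const std::size_t d = totalDistance(v, dna);
        if (d < bestD) {
            bestD = d;
            best = v;
        }
    }
    if (bestDistance != nullptr) *bestDistance = bestD;
    return best;
}
```

To run it on the assignment data, open Sequences.fa with std::ifstream, pass the stream to readFasta, and call medianString with k = 4, 5, and 6. The assignment lists GATC, CCGAA, and TCCGAA as the expected motifs; if several k-mers share the minimum total distance, which one is reported depends on the enumeration order chosen here. With 10 sequences of 80 bases, even k = 6 means only 4096 candidates, so no pruning is needed.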