Question

Posted on Oct 16, 2024

Implement a single-source file program called DNA Processor, which will perform text operations on a String representing a DNA sequence.

1. A DNA sequence consists of a restricted alphabet of 'A', 'T', 'G' and 'C', which are known as bases.

2. Because DNA sequencing technology is imperfect, the sequence given to your program may include invalid characters (which your program will replace with a special character meaning 'unknown').

3. DNA strands are actually made up of two complementary sequences, with strict rules about which bases appear with which other bases (A only ever pairs with T, G only ever pairs with C). Your program will be able to switch between the original sequence and its complement (more details are given later).

4. A common operation on a DNA sequence is transcription, where a part of the sequence is converted into messenger RNA (mRNA), which is an intermediate step to translating the code into proteins (this second step is beyond your program). RNA, which is an older form of genetic material than DNA, uses the same alphabet as DNA except that T is replaced by U.

The program should implement the following high-level algorithm:

Program: DNA Processor

Steps:

1 Display the program's name

2 Prompt the user to enter a DNA sequence (any non-whitespace characters are allowed)

3 Convert that String to upper case, replace all non-DNA characters with a dot (.), and display the error rate (details below)

4 Do

4-1 Display a menu of operations for working with that DNA sequence

5 While the user has not selected quit
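
The steps above can be sketched as a do-while loop. This is only a minimal outline under stated assumptions: the class name `DnaProcessorSketch`, the `run` helper, the menu keys `s` and `q`, and the prompt wording are all hypothetical, only one menu operation is shown, and the error-rate display from step 3 is left out.

```java
import java.io.PrintStream;
import java.util.Locale;
import java.util.Scanner;

class DnaProcessorSketch {
    // Runs the high-level algorithm against any input source and output sink,
    // which keeps the loop easy to try out without typing at a console.
    static void run(Scanner in, PrintStream out) {
        out.println("DNA Processor");                 // step 1: program name
        out.print("Enter a DNA sequence: ");          // step 2: prompt
        String sequence = in.next()                   // step 3: upper-case and clean
                .toUpperCase(Locale.ROOT)
                .replaceAll("[^ATGC]", ".");
        // (step 3 also displays the error rate; that part is omitted here)

        String choice;
        do {                                          // step 4: show the menu
            out.println("[s] Show sequence  [q] Quit"); // hypothetical menu keys
            out.print("Choice: ");
            choice = in.next();
            if (choice.equals("s")) {
                out.println("Sequence: " + sequence);
            }
        } while (!choice.equals("q"));                // step 5: repeat until quit
    }

    public static void main(String[] args) {
        run(new Scanner(System.in), System.out);
    }
}
```

A do-while fits here because the menu must be shown at least once before the quit condition can be checked.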

After step 3, only the 'cleaned' sequence of characters will be used. For example, if the user typed 'gaxTT/aca' then it will be displayed later as 'GA.TT.ACA'. For each operation that displays some information, the program should present that information with a suitable prefix, such as 'Sequence: '.

The available operations are:

Display the current value of the sequence.

Display the error rate.

Transcribe the entire sequence, which should display the mRNA equivalent of the entire sequence (i.e., the sequence with all Ts converted to Us).

Transcribe a section of the sequence, between two points. The user should be asked to give the start and end of the section to transcribe, which will be values between 1 and the length of the sequence, inclusive. Both the start and end should be included in the transcribed section so, given the example above, if the user asked to transcribe between positions 2 and 5 the program will produce 'A.UU'.

Switch to the sequence's complement, which should change the current value of the sequence to its complementary sequence and then display the new value. The rules to apply are: A becomes T, T becomes A, G becomes C, C becomes G, . remains unchanged. For example, the sequence above would change to CT.AA.TGT (until the user chooses to switch to the complement again, sometime later).
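
The transcription and complement operations can be sketched as three small methods. The class name `SequenceOps` and the method names are hypothetical, and rejecting a start position greater than the end position is an assumption, since the task only says both values lie between 1 and the sequence length.

```java
class SequenceOps {
    // Transcribes a whole cleaned sequence: every T becomes U, while
    // A, G, C and the unknown marker '.' are left as they are.
    static String transcribe(String sequence) {
        return sequence.replace('T', 'U');
    }

    // Transcribes positions start..end, both 1-based and inclusive.
    // Rejecting start > end is an assumption, not part of the task text.
    static String transcribeSection(String sequence, int start, int end) {
        if (start < 1 || end > sequence.length() || start > end) {
            throw new IllegalArgumentException(
                    "Positions must satisfy 1 <= start <= end <= " + sequence.length());
        }
        // substring takes a 0-based inclusive start and an exclusive end,
        // so only the start needs shifting by one.
        return transcribe(sequence.substring(start - 1, end));
    }

    // Builds the complementary sequence base by base.
    static String complement(String sequence) {
        StringBuilder result = new StringBuilder(sequence.length());
        for (int i = 0; i < sequence.length(); i++) {
            char base = sequence.charAt(i);
            switch (base) {
                case 'A': result.append('T'); break;
                case 'T': result.append('A'); break;
                case 'G': result.append('C'); break;
                case 'C': result.append('G'); break;
                default:  result.append(base); // '.' remains unchanged
            }
        }
        return result.toString();
    }

    public static void main(String[] args) {
        String sequence = "GA.TT.ACA";
        System.out.println("mRNA: " + transcribe(sequence));                      // mRNA: GA.UU.ACA
        System.out.println("mRNA section: " + transcribeSection(sequence, 2, 5)); // mRNA section: A.UU
        System.out.println("Sequence: " + complement(sequence));                  // Sequence: CT.AA.TGT
    }
}
```

Applying `complement` twice returns the original sequence, which matches the idea of switching back and forth between the two strands.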

Cleaning the input

Strings have a replaceAll(String regex, String replacement) method that replaces every substring matching the given regular expression (a pattern that can match a variety of String values) with the given replacement. For example:

1. "[XYZ]" matches any single character from the set 'X', 'Y' or 'Z'

2. "[^XYZ]" matches any single character that is not in the set 'X', 'Y' or 'Z'

a loop to clean the input text.
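
Both approaches to cleaning can be sketched as follows: one uses replaceAll with a negated character set, the other uses a plain loop. The class name `DnaCleaner` and both method names are hypothetical; the two variants are expected to agree for ordinary keyboard input.

```java
import java.util.Locale;

class DnaCleaner {
    // Upper-cases the raw input, then replaces every character that is not
    // one of the four bases with a dot, using a negated character set.
    static String clean(String raw) {
        return raw.toUpperCase(Locale.ROOT).replaceAll("[^ATGC]", ".");
    }

    // Loop-based variant: checks each character against the DNA alphabet
    // and appends either the base itself or the 'unknown' marker.
    static String cleanWithLoop(String raw) {
        StringBuilder cleaned = new StringBuilder(raw.length());
        for (char c : raw.toUpperCase(Locale.ROOT).toCharArray()) {
            cleaned.append("ATGC".indexOf(c) >= 0 ? c : '.');
        }
        return cleaned.toString();
    }

    public static void main(String[] args) {
        System.out.println("Sequence: " + clean("gaxTT/aca")); // Sequence: GA.TT.ACA
    }
}
```

Converting to upper case before matching means the pattern only has to list the four upper-case bases.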

Calculating the error rate

Transcribing the sequence

Note: Use commenting, coding style for layout, variable names (use of case),