
Question


Create a Java program based on the following information.

In a FASTA format DNA sequence file, each sequence record starts with a header line that begins with a ">" sign, followed by a sequence identifier (such as a GenBank accession number) and a description of the sequence. Develop a Java program to read in a sequence file. This is the sequence file:

>by21f03.y1|BF727444
CACCAGCTCAGCACCGCCGTGCGCCCAGCCAGCCATGGGGAATTCACCCC
TCTACGAGGACCGGGGCTTCCAGGGCCGCCACTACGAATGCAGCAGCGAC
CACCCCAACCTGCAGCCCTACTTGAGCCGCTGCAACTCGGCGCGCGTGGA
CAGCGGCTGCTGGATGCTCTGGAATTCCAGCCCAACTACTCGGGCCTCCA
ACTTCCTGCGCCGCGGCGACTATGCCGACCACCAGCAGTGGATGGGCCTC
AGCGACTCGGTCCGCTCCTGCCGCCTCATCCCCCACTCTGGCTCTCACAG
GATCAGACTCTATGAGAGGGAGGACTACAGAGGCCAGATGATAGAGTTCA
CTGAGGACTGCTCCTGGAATTCAGGACCGCT
>by05e12.y1|BF726365
CCGCCGTGCGCCCAGCCAGCCATGGGGAAGATCACCCTCTACGAGGACCG
GGGCTTCCAGGGCCGCCACTACGAATGCAGCAGCGACCACCCCAACCTGC
AGCCCTACTTGAGGAATTCGAACTCGGCGCGCGTGGACAGCGGCTGCTGG
ATGCTCTATGAGCAGCCCAACTACTCGGGCCTCCAGTACTTCCTGCGCCG
CGGCGACTATGCCGACCACCAGCAGTGGATGGGCCTCAGCGACTCGGTCC
GCTCCTGCCGCCTC
>by09f05.y1|BF726635
CACCAGCTCAGCACCGCCGTGCGCCCAGCCAGCCATGGGGAAGATCACCC
TCTACGAGGACCGGGGCTTCCAGGGCCGCCACTACGAATGCAGCAGCGAC
CACCCCAACCTGCGGAATTCCTTGAGCCGCTGCAACTCGGCGCGCGTGGA
CAGCGGCTGCTGGATGCTCTATGAGCAGCCCAACTACTCGGGCCTCCAGT
ACTTCCTGCGCCGCGGCGACTATGCCGACCACCAGCAGTGGATGGGCCTC
AGCGACTCGGTCCGCTCCTGCCGCCTCATCCCCCACTCTGGCTCTCACAG
GATCAGACTCTATGAGGAATTCCCCTACAGAGGCCAGATGATAGAGTTCA
CTGAGGACTGCTCCTGTCTTCAGGACCGCTTCCGCTTCAATGAAATCCAC
TCCCTCAACGTGCTGGAGGGCTCCTGGGTCCTCTACGAGCTGTCCAACTA
CCGAGGACGGCAGTACCTG
>by14f12.y1|BF726960
CAGCTCAGCACCGCCGTGCGCCCAGCCAGCCATGGGGAAGATCACCCTCT
ACGAGGACCGGGGCTTCCAGGGCCGCCACTACGAATGCAGCAGCGACCAC
CCCAACCTGCAGCCCTACTTGAGCCGCTGCAACTCGGCGCGCGTGGACAG
CGGCTGCTGGATGCTCTATGAGCAGCCCAACTACTCGGGCCTCCAGTACT
TCCTGCGCCGCGGCGACTATGGAATTCGGCAGCAGTGGATGGGCCTCAGC
GACTCGGTCCGCTCCTGCCGCCTCATCCCCCACTCTGGCTCTCACAGGAT
CAGACTCTATGAGAGGGAGGACTACAGAGGCCAGATGATAGAGTTCACTG
AGGAC
>by20g06.y1|BF727389
CAGCTCAGCACCGCCGTGCGCCCAGCCAGCCATGGGGAAGATCACCCTCT
ACGAGGACCGGGGCTTCCAGGGCCGCCACTACGAATGCAGCAGCGACCAC
CCCAACCTGCAGCCCTACTTGAGCCGCTGCAACTCGGCGCGCGTGGACAG
CGGCTGCTGGATGCTCTATGAGCAGCCCAACTACTCGGGCCTCCAGTACT
TCCTGCGCCGCGGCGACTATGCCGACCACCAGCAGTGGATGGGCCTCAGC
GACTCGGTCCGCTCCTGCCGCCTCATCCCCCACTCTGGCTCTCACAGGAT
CAGACTCTATGAGAGGGAGGACTACAGAGGCCAGATGATAGAGTTCACTG
AGGACTGCTCCTGTC
>by18g06.y1|BF727241
CGCGAGCCTCTACGAGGACCGGGGCTTCCAGGGCCGCCACTACGAATGCA
GCAGCGACCACCCCAACCTGCAGCCCTACTTGAGCCGCTGCAACTCGGCG
CGCGTGGACAGCGGCTGC

The program should find out how many sequences are in the file by counting the header lines. It should prompt the user for the sequence file name and then print a message stating how many sequences the file contains, such as:

Enter the name of the sequence file: seq.fasta
File seq.fasta contains 6 sequences
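
The counting step described above can be sketched as follows. This is a minimal sketch, not the original answer: the class name `FastaCounter` and the helper `countHeaders` are hypothetical names chosen for illustration, and the sketch assumes the file is small enough to read into memory and that every line starting with ">" is a header.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;
import java.util.Scanner;

// Hypothetical class name; not taken from the original answer.
class FastaCounter {

    // Counts the lines that start with '>', i.e. the FASTA header lines.
    static int countHeaders(List<String> lines) {
        int count = 0;
        for (String line : lines) {
            if (line.startsWith(">")) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        Scanner console = new Scanner(System.in);
        System.out.print("Enter the name of the sequence file: ");
        if (!console.hasNextLine()) {
            return; // no input supplied, nothing to count
        }
        String fileName = console.nextLine().trim();

        // Assumption: the whole file fits comfortably in memory.
        List<String> lines = Files.readAllLines(Paths.get(fileName));
        System.out.println("File " + fileName + " contains "
                + countHeaders(lines) + " sequences");
    }
}
```

Keeping the counting logic in its own method, separate from the prompt and file access, makes it easy to reuse when the program is later extended to process the sequence lines as well.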

 

In the above in-class exercise, you read through the whole file to count the header lines, so you can separate the actual sequence from the header line of each sequence record. Modify the above program to search the sequence of each record for a given restriction site. Underline each restriction site with "*" characters. See the sample output below:

Enter the name of the sequence file: seq.fasta
Enter the sequence of a restriction site: GAATTC

>by21f03.y1|BF727444
CACCAGCTCAGCACCGCCGTGCGCCCAGCCAGCCATGGGGAATTCACCCC
                                       ******  
TCTACGAGGACCGGGGCTTCCAGGGCCGCCACTACGAATGCAGCAGCGAC
                                                 
CACCCCAACCTGCAGCCCTACTTGAGCCGCTGCAACTCGGCGCGCGTGGA
                                                 
CAGCGGCTGCTGGATGCTCTGGAATTCCAGCCCAACTACTCGGGCCTCCA
                     ******                      
ACTTCCTGCGCCGCGGCGACTATGCCGACCACCAGCAGTGGATGGGCCTC
                                                 
AGCGACTCGGTCCGCTCCTGCCGCCTCATCCCCCACTCTGGCTCTCACAG
                                                 
GATCAGACTCTATGAGAGGGAGGACTACAGAGGCCAGATGATAGAGTTCA
                                                 
CTGAGGACTGCTCCTGGAATTCAGGACCGCT
                ******        
 

>by05e12.y1|BF726365
CCGCCGTGCGCCCAGCCAGCCATGGGGAAGATCACCCTCTACGAGGACCG
                                                 
GGGCTTCCAGGGCCGCCACTACGAATGCAGCAGCGACCACCCCAACCTGC
                                                 
AGCCCTACTTGAGGAATTCGAACTCGGCGCGCGTGGACAGCGGCTGCTGG
             ******                              
ATGCTCTATGAGCAGCCCAACTACTCGGGCCTCCAGTACTTCCTGCGCCG
                                                 
CGGCGACTATGCCGACCACCAGCAGTGGATGGGCCTCAGCGACTCGGTCC
                                                 
GCTCCTGCCGCCTC
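
One way to produce the underlined output is sketched below. It is a hedged illustration rather than the original answer: `RestrictionSiteMarker`, `underline`, and `annotate` are hypothetical names, and the sketch makes two simplifying assumptions. First, it searches each sequence line on its own, so a site that is split across a line break is not marked. Second, each marker line is padded with spaces to the full length of its sequence line, so trailing whitespace may differ from the sample output.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical class name; not taken from the original answer.
class RestrictionSiteMarker {

    // Returns a marker line with '*' under every base that belongs to an
    // occurrence of the site, and ' ' everywhere else.
    static String underline(String sequenceLine, String site) {
        char[] marks = new char[sequenceLine.length()];
        Arrays.fill(marks, ' ');
        if (site.isEmpty()) {
            return new String(marks);
        }
        // Compare case-insensitively (an assumption; the sample data is upper case).
        String haystack = sequenceLine.toUpperCase();
        String needle = site.toUpperCase();
        int from = haystack.indexOf(needle);
        while (from >= 0) {
            Arrays.fill(marks, from, from + needle.length(), '*');
            // Advance by one so overlapping occurrences are also found.
            from = haystack.indexOf(needle, from + 1);
        }
        return new String(marks);
    }

    // Copies header lines unchanged and adds a marker line after each
    // non-empty sequence line.
    static List<String> annotate(List<String> lines, String site) {
        List<String> out = new ArrayList<>();
        for (String line : lines) {
            out.add(line);
            if (!line.startsWith(">") && !line.isEmpty()) {
                out.add(underline(line, site));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Demo on the first two lines of the first record in the sample file.
        List<String> record = Arrays.asList(
                ">by21f03.y1|BF727444",
                "CACCAGCTCAGCACCGCCGTGCGCCCAGCCAGCCATGGGGAATTCACCCC",
                "TCTACGAGGACCGGGGCTTCCAGGGCCGCCACTACGAATGCAGCAGCGAC");
        for (String line : annotate(record, "GAATTC")) {
            System.out.println(line);
        }
    }
}
```

To catch sites that straddle two lines, the sequence lines of a record could instead be joined into one string, searched once, and the marker string re-split at the original line lengths.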

 

Step by Step Solution

Sure, here is a Java program to read in a FASTA format DNA sequence file, find out how many sequences are in the file, and search through the sequence of ...
