
Question

report_countGB.txt https://blackboard.wichita.edu/bbcswebdav/pid-1705165-dt-content-rid-14313459_1/xid-14313459_1

report_longGB.txt https://blackboard.wichita.edu/bbcswebdav/pid-1705165-dt-content-rid-14313460_1/xid-14313460_1

report_shortGB.txt https://blackboard.wichita.edu/bbcswebdav/pid-1705165-dt-content-rid-14313461_1/xid-14313461_1

sequence_long.txt https://blackboard.wichita.edu/bbcswebdav/pid-1705165-dt-content-rid-14313462_1/xid-14313462_1

sequence_short.txt https://blackboard.wichita.edu/bbcswebdav/pid-1705165-dt-content-rid-14313463_1/xid-14313463_1

Background: The information used in this project comes from an article that appeared in the magazine IEEE Spectrum in July 2013 (http://spectrum.ieee.org/biomedical/devices/the-dna-data-deluge) and from the Human Genome Project (http://www.genome.gov/10001772).

The Human Genome Project, a project to map all of the human genes (about 24,000), started in October 1990 and was completed in April 2003. There are four types of DNA nucleotides: adenine (A), thymine (T), cytosine (C), and guanine (G), and human DNA has roughly 3 billion nucleotides. These are grouped into triplets called codons, each a sequence of three nucleotides. Codons allow the body to code for the 20 amino acids that are used to build all of the proteins in the body. One specific codon, TAC, indicates the start of a gene. Any of the three codons ACT, ATT, and ATC can indicate the end of a gene.

Sequencing a human genome is a computationally intensive process. The first steps begin in a test tube: the DNA strand is split down the middle, copied many times, and then split into much shorter segments. A machine called a sequencer then identifies the string of nucleotides in each fragment. A computer or a set of computers then takes all of the strands and tries to reorder them using the existing human genome as a reference. The difficulty is that there are hundreds of thousands of strands, completely out of order, with many overlaps and much repetition. Developing algorithms that sort the sequencer data accurately and efficiently while keeping the cost down is an active area of research.

Section A: Genome Indexing

In this section, you will look at one of the approaches for sorting the sequencer data, which is based on indexing the genome. In genome indexing, we look for key sequences of nucleotides and record their locations in the genome. For example, the sequence 'GATTACA' occurs roughly 697,000 times in the human genome. The picture on the following page (taken from the IEEE Spectrum article) illustrates the idea. As illustrated in the picture, we scan through the genome looking for some key triplets, in this case 'AAA', 'ATC', and 'CGG'. When one of these codons is found, its location in the genome is recorded.
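The indexing scan described in Section A can be sketched in Python as follows. This is only a minimal illustration, not the assignment's required solution: the function name `index_genome`, the use of 0-based positions, and the choice to test every position (so overlapping matches are all recorded) are assumptions, since the text does not specify them. The sample string is made up for the demonstration and is not taken from sequence_short.txt or sequence_long.txt.

```python
def index_genome(genome, keys=("AAA", "ATC", "CGG")):
    """Return a dict mapping each key triplet to the positions where it starts.

    Assumptions (not stated in the assignment): positions are 0-based and
    the window slides one nucleotide at a time, so overlapping hits count.
    """
    width = 3                      # a codon is three nucleotides long
    wanted = set(keys)             # set lookup keeps the scan linear
    index = {key: [] for key in keys}
    for pos in range(len(genome) - width + 1):
        triplet = genome[pos:pos + width]
        if triplet in wanted:
            index[triplet].append(pos)
    return index


if __name__ == "__main__":
    sample = "AAACGGATCAAAA"       # made-up sequence for illustration
    for key, positions in index_genome(sample).items():
        print(key, positions)
```

If the assignment expects 1-based locations, or expects the scan to advance three nucleotides at a time, only the `range` step and the recorded value need to change.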
