Answered step by step
Verified Expert Solution
Link Copied!

Question

1 Approved Answer

In this exercise we work with next generation sequencing (NGS) data. Unix is excellent at manipulating the huge FASTA files that are generated in NGS

In this exercise we work with next generation sequencing (NGS) data. Unix is excellent at manipulating the huge FASTA files that are generated in NGS experiments. FASTA files contain sequence data in text format. Each sequence segment is preceded by a single-line description. The first character of the description line is a greater than sign (>).15 The NGS data set we will be working with was published by Marra and DeWoody (2014), who investigated the immunogenetic repertoire of rodents. You will find the sequence file Marra2014_data.fasta in the directory CSB/unix/data. The file contains sequence segments (contigs) of variable size. The description of each contig provides its length, the number of reads that contributed to the contig, its isogroup (representing the collection of alternative splice products of a possible gene), and the isotig status.

1. Change directory to CSB/unix/sandbox.

2. What is the size of the file Marra2014_data.fasta?

3. Create a copy of Marra2014_data.fasta in the sandbox and name it my_file.fasta.

4. How many contigs are classified as isogroup00036?

5. Replace the original two-spaces delimiter with a comma.

6. How many unique isogroups are in the file?

7. Which contig has the highest number of reads (numreads)? How many reads does it have?

Step by Step Solution

There are 3 Steps involved in it

Step: 1

blur-text-image

Get Instant Access to Expert-Tailored Solutions

See step-by-step solutions with expert insights and AI powered tools for academic success

Step: 2

blur-text-image_2

Step: 3

blur-text-image_3

Ace Your Homework with AI

Get the answers you need in no time with our AI-driven, step-by-step assistance

Get Started

Recommended Textbook for

Conceptual Database Design An Entity Relationship Approach

Authors: Carol Batini, Stefano Ceri, Shamkant B. Navathe

1st Edition

0805302441, 978-0805302448

More Books

Students also viewed these Databases questions

Question

How do Excel Pivot Tables handle data from non OLAP databases?

Answered: 1 week ago