
Question


This question is based on a real-life problem and a common situation in data analysis. You have a folder (directory) of fasta (DNA sequence) data files here. (Note that the folder is a compressed zip file that you need to uncompress.) You also have a master spreadsheet (header_changes.csv) here. Each fasta file has a header line, but these need to be changed. Each row in header_changes.csv pertains to one file: the first item in the row is the file name, the second item (after a comma) is what the current header should be, and the third item is the new header name. You might want to check that the current header is correct, and even deliberately insert a file with the wrong header to make sure your code can detect that. Your job is to replace the headers in all files correctly. Instead of writing over the old files, you might want to save the new files in a new directory. You can solve this in Python, bash, or any combination of the two. You must include some documentation so that anyone else can run your code easily and understand what is going on.
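One possible approach in Python is sketched below. It is a minimal sketch under several assumptions that the question does not confirm: header_changes.csv has no title row and exactly three items per row; each fasta file has a single header line, which is its first line and starts with ">"; and the headers in the spreadsheet may or may not include the leading ">". The function names `normalize` and `rewrite_headers` and the problem messages are made up for illustration.

```python
import csv
from pathlib import Path


def normalize(header):
    """Strip whitespace and a leading '>' so headers compare consistently."""
    return header.strip().lstrip(">").strip()


def rewrite_headers(csv_path, in_dir, out_dir):
    """Copy each listed FASTA file to out_dir with its header replaced.

    Returns a list of (file_name, problem) tuples for files that were skipped.
    The original files in in_dir are never modified.
    """
    in_dir, out_dir = Path(in_dir), Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    problems = []

    with open(csv_path, newline="") as handle:
        for row in csv.reader(handle):
            if len(row) < 3:
                continue  # skip blank or malformed rows
            # Assumed column order: file name, current header, new header.
            file_name, old_header, new_header = (item.strip() for item in row[:3])

            source = in_dir / file_name
            if not source.is_file():
                problems.append((file_name, "file not found"))
                continue

            lines = source.read_text().splitlines()
            if not lines or not lines[0].startswith(">"):
                problems.append((file_name, "no header line"))
                continue

            # Only rewrite when the existing header matches the spreadsheet.
            if normalize(lines[0]) != normalize(old_header):
                problems.append((file_name, "header mismatch"))
                continue

            lines[0] = ">" + normalize(new_header)
            (out_dir / file_name).write_text("\n".join(lines) + "\n")

    return problems
```

To run it, call something like `rewrite_headers("header_changes.csv", "fasta_files", "fasta_renamed")`, where the two directory names are placeholders for wherever the zip file was uncompressed and wherever the new files should go, then print the returned list. Files with a wrong or missing header are reported and skipped rather than rewritten, which is how a deliberately mislabeled test file would show up.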
