Question
BIOINFORMATICS
I need to use Galaxy to set up an experiment on some given files, but I am unsure how to begin.
- I am distributing a Galaxy History called: CG19FinalProjectDNADatasets containing:
- Sacce1_AssemblyScaffolds.fasta <- Saccharomyces cerevisiae Wild-Type Genome
- Sacce1.ExternalModels.gff3 <- Saccharomyces cerevisiae Wild-Type Genome Annotations
- Scerevisiae_RAMutant_Genome.v02.fa <- Saccharomyces cerevisiae Mutant Genome
- Sacce1_chrVI-VII-VIII.fasta <- Saccharomyces cerevisiae Chromosomes VI/VII/VIII Wild-Type Genome
- All Datasets are from the Wild-Type Genome
- Dataset01:
- SRR352492.sra_1.fastq
- SRR352492.sra_2.fastq
- Dataset02:
- SRR352384.sra_1.fastq
- SRR352384.sra_2.fastq
- Dataset03:
- SRR452441.csra_1.fastq
- SRR452441.csra_2.fastq
- The Datasets have already been Quality Controlled
- Saccharomyces cerevisiae Wild-Type Chromosomes VI/VII/VIII is being provided for you to test/optimize your mapping conditions
- Your task is:
- To use these NGS Datasets to identify any polymorphisms that might be present in the Saccharomyces cerevisiae Mutant Genome
- You are welcome to validate your findings using other methods (e.g., alignments, BLAST, etc.), but your primary conclusions must be based on NGS read mapping
- Your findings must also be validated by IGV figures displaying the presence/absence of the polymorphisms you have identified. In other words, each polymorphism claim must be supported by an IGV figure
- All your findings must be properly documented
- A Final Project Report and Final Project Presentation must be prepared according to the instructions outlined below
- Recommendations:
- Start by looking at your data and studying it
- Design a logical and simple Experimental Strategy. Remember, there is no single way to address this problem
- Learn how to use and try to understand the following tools in Galaxy:
- Bowtie - map reads against reference genome (output = unsorted SAM file)
- Bowtie2 - map reads against reference genome (output = sorted BAM file)
- After you settle on Bowtie and/or Bowtie2, do not change this parameter
- Pay special attention to the options provided to generate files containing aligned and/or unaligned reads. Remember, you might need to use those parameters to demonstrate your assumptions or test your hypothesis
- Sort a BAM file <= Sorting is imperative after the BAM file is generated and before other processing can occur
- Flagstat - tabulate descriptive stats for BAM dataset
- BAM mapping statistics - samtools idxstats
- Filter BAM datasets on a variety of attributes
- BAM-to-SAM - convert BAM to SAM, and SAM-to-BAM - convert SAM to BAM (when needed)
- I suggest using the test chromosomes; you must decide which one of these tools is best for your task
- Display the data on IGV
- When using BAM files, remember to download, in addition to the BAM file, its accompanying BAM_INDEX file
- If you need to assemble reads, let me know, and I will show you how to do it
- Include Controls
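To make the recommendations about aligned/unaligned reads, Flagstat, and BAM filtering more concrete, here is a minimal Python sketch of how such counts can be derived from the FLAG and MAPQ columns of SAM text (for example, the output of BAM-to-SAM). The flag bit values come from the SAM format; everything else is an assumption for illustration only: the function name `summarize_sam`, the example records in `SAM_EXAMPLE`, the reference names `chrVI` and `chrVII`, and the MAPQ cutoff of 20. It is not the Galaxy Flagstat tool, and its numbers will not necessarily match that tool's report, since this sketch counts primary records only.

```python
# Hypothetical sketch: flagstat-like tallies from SAM text.
# Flag bits are defined by the SAM format; all names and data below are illustrative.

FLAG_PAIRED = 0x1          # read is part of a pair
FLAG_PROPER_PAIR = 0x2     # both mates aligned as a proper pair
FLAG_UNMAPPED = 0x4        # read itself is unmapped
FLAG_SECONDARY = 0x100     # secondary alignment
FLAG_SUPPLEMENTARY = 0x800 # supplementary alignment


def summarize_sam(sam_text, min_mapq=0):
    """Count primary records that are mapped, unmapped, properly paired,
    and mapped with MAPQ >= min_mapq."""
    counts = {
        "total": 0,
        "mapped": 0,
        "unmapped": 0,
        "properly_paired": 0,
        "passing_mapq": 0,
    }
    for line in sam_text.splitlines():
        if not line or line.startswith("@"):
            continue  # skip blank lines and header lines
        fields = line.split("\t")
        flag = int(fields[1])  # column 2: FLAG
        mapq = int(fields[4])  # column 5: MAPQ
        if flag & (FLAG_SECONDARY | FLAG_SUPPLEMENTARY):
            continue  # this sketch counts primary records only
        counts["total"] += 1
        if flag & FLAG_UNMAPPED:
            counts["unmapped"] += 1
            continue
        counts["mapped"] += 1
        if (flag & FLAG_PAIRED) and (flag & FLAG_PROPER_PAIR):
            counts["properly_paired"] += 1
        if mapq >= min_mapq:
            counts["passing_mapq"] += 1
    return counts


# Made-up records: (QNAME, FLAG, RNAME, POS, MAPQ, CIGAR, RNEXT, PNEXT, TLEN)
_records = [
    ("read1", 99, "chrVI", 100, 42, "50M", "=", 300, 250),
    ("read1", 147, "chrVI", 300, 42, "50M", "=", 100, -250),
    ("read2", 77, "*", 0, 0, "*", "*", 0, 0),    # unmapped pair, first mate
    ("read2", 141, "*", 0, 0, "*", "*", 0, 0),   # unmapped pair, second mate
    ("read3", 99, "chrVII", 500, 3, "50M", "=", 700, 250),  # low MAPQ
]
SAM_EXAMPLE = "@HD\tVN:1.6\tSO:coordinate\n" + "\n".join(
    "\t".join(str(value) for value in record) + "\t*\t*" for record in _records
)

if __name__ == "__main__":
    print(summarize_sam(SAM_EXAMPLE, min_mapq=20))
```

The same bit tests are what a BAM filter applies when it keeps only mapped reads or only properly paired reads, so writing them out can help when deciding which filter settings to use on the test chromosomes before running the full genome.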