Question

BIOINFORMATICS

I need to use Galaxy to set up an experiment on some given files, but I am unsure how to begin.

  1. I am distributing a Galaxy History called: CG19FinalProjectDNADatasets containing:
    • Sacce1_AssemblyScaffolds.fasta <- Saccharomyces cerevisiae Wild-Type Genome
    • Sacce1.ExternalModels.gff3 <- Saccharomyces cerevisiae Wild-Type Genome Annotations
    • Scerevisiae_RAMutant_Genome.v02.fa <- Saccharomyces cerevisiae Mutant Genome
    • Sacce1_chrVI-VII-VIII.fasta <- Saccharomyces cerevisiae Chromosomes VI/VII/VIII Wild-Type Genome
    • All Datasets are from the Wild-Type Genome
    • Dataset01:
      • SRR352492.sra_1.fastq
      • SRR352492.sra_2.fastq
    • Dataset02:
      • SRR352384.sra_1.fastq
      • SRR352384.sra_2.fastq
    • Dataset03:
      • SRR452441.csra_1.fastq
      • SRR452441.csra_2.fastq
    • The Datasets have already been Quality Controlled
    • Saccharomyces cerevisiae Wild-Type Chromosomes VI/VII/VIII is being provided for you to test/optimize your mapping conditions
  2. Your task is:
    • To use these NGS Datasets to identify any polymorphisms that might be present in the Saccharomyces cerevisiae Mutant Genome
    • You are welcome to validate your findings using other methods (e.g., alignments, BLAST, etc.), but your primary conclusions must be based on NGS read mapping
    • Your findings must also be validated by IGV figures displaying the presence/absence of the polymorphisms you have identified. In other words, each polymorphism claim must be supported by an IGV figure
    • All your findings must be properly documented
    • A Final Project Report and Final Project Presentation must be prepared according to the instructions outlined below
  3. Recommendations:
    • Start by looking at your data, study the data
    • Design a logical and simple Experimental Strategy. Remember, there is no single way to address this problem
    • Learn how to use and try to understand the following tools in Galaxy:
      • Bowtie - map reads against reference genome (output = unsorted SAM file)
      • Bowtie2 - map reads against reference genome (output = sorted BAM file)
      • Once you settle on Bowtie and/or Bowtie2, do not change that choice
      • Pay special attention to the options provided to generate files containing aligned and/or unaligned reads. Remember, you might need to use those parameters to demonstrate your assumptions or test your hypothesis
      • Sort a BAM file (sorting is imperative after the BAM file is generated and before any other processing can occur)
      • Flagstat - tabulate descriptive stats for BAM dataset
      • BAM mapping statistics (samtools idxstats)
      • Filter BAM datasets on a variety of attributes
      • BAM-to-SAM (convert BAM to SAM) and SAM-to-BAM (convert SAM to BAM), when needed
    • I suggest using the test chromosomes; you must decide which of these tools is best for your task
    • Display the data on IGV
      • When using BAM files, remember to download, in addition to the BAM file, its accompanying BAM_INDEX file
    • If you need to assemble reads, let me know, and I will show you how to do it
    • Include Controls
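
For orientation, the tool chain recommended above (map, sort, index, Flagstat, idxstats, filter) can be sketched as the command-line steps that the Galaxy tools roughly correspond to. This is only a sketch under assumptions: Galaxy runs these steps through its own tool forms, the output file names and the `plan_mapping` helper are hypothetical, and nothing is executed. The script only builds and prints the commands, using the test chromosomes and Dataset01 from the history.

```python
import shlex


def plan_mapping(reference_fasta, reads_1, reads_2, prefix, threads=4):
    """Return the ordered commands (as argument lists) for one paired-end dataset.

    All output names derived from `prefix` are illustrative, not Galaxy defaults.
    """
    index = f"{prefix}_index"
    sam = f"{prefix}.sam"
    bam = f"{prefix}.sorted.bam"
    return [
        # Build a Bowtie2 index from the reference FASTA.
        ["bowtie2-build", reference_fasta, index],
        # Map the read pair. --un-conc keeps the pairs that did not align
        # concordantly (Bowtie2 writes one file per mate).
        ["bowtie2", "-p", str(threads), "-x", index,
         "-1", reads_1, "-2", reads_2,
         "--un-conc", f"{prefix}.unaligned.fastq", "-S", sam],
        # Sort by coordinate; this must happen before indexing.
        ["samtools", "sort", "-o", bam, sam],
        # Create the BAM index that IGV needs next to the BAM file.
        ["samtools", "index", bam],
        # Descriptive statistics for the whole BAM dataset.
        ["samtools", "flagstat", bam],
        # Mapped and unmapped read counts per reference sequence.
        ["samtools", "idxstats", bam],
        # Filter: drop reads whose SAM flag has the "unmapped" bit (0x4) set.
        ["samtools", "view", "-b", "-F", "4",
         "-o", f"{prefix}.mapped.bam", bam],
    ]


# Test run on the chromosome VI/VII/VIII subset with Dataset01.
commands = plan_mapping(
    "Sacce1_chrVI-VII-VIII.fasta",
    "SRR352492.sra_1.fastq",
    "SRR352492.sra_2.fastq",
    prefix="dataset01_test",
)
for argv in commands:
    print(shlex.join(argv))
```

Once the mapping conditions look reasonable on the test chromosomes, the same plan could be generated against Scerevisiae_RAMutant_Genome.v02.fa, with the Wild-Type Genome mapped in parallel as a control. Whether that is the right experimental strategy is a design decision this sketch does not settle.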
