Question
A 2003 paper in Genome Research studied nucleotide frequencies in human genes.
I have summarized the distribution of nucleotides in the first exon, internal exon, and last exon below:
Nucleotide | First | Internal | Last |
A | 18 | 25 | 26 |
C | 28 | 28 | 22 |
G | 33 | 26 | 23 |
T | 21 | 21 | 29 |
Part 1: Chi-Square Testing
Is there any reason to believe that there are pairwise differences in the exons? To explore this, please do the following:
Construct a barplot that shows the distribution of the nucleotides across the exons
Conduct a chi-square test between each of the pairs
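One possible way to run the pairwise tests is sketched below. It assumes the table values are percentages that can be treated as counts out of 100 per exon (each column sums to 100, but the true sample sizes are not given in the question). The names `exons`, `pairs`, and `pairwise_p` are illustrative only.

```r
# Exon values from the table above, treated as counts out of 100
# (an assumption; the real sample sizes are not stated).
exons <- matrix(
  c(18, 28, 33, 21,   # First exon:    A, C, G, T
    25, 28, 26, 21,   # Internal exon: A, C, G, T
    26, 22, 23, 29),  # Last exon:     A, C, G, T
  nrow = 4,
  dimnames = list(nucleotide = c("A", "C", "G", "T"),
                  exon = c("First", "Internal", "Last"))
)

# All pairs of exon columns: First/Internal, First/Last, Internal/Last
pairs <- combn(colnames(exons), 2, simplify = FALSE)

# Chi-square test of homogeneity on each 4 x 2 sub-table
pairwise_p <- sapply(pairs, function(p) chisq.test(exons[, p])$p.value)
names(pairwise_p) <- sapply(pairs, paste, collapse = " vs ")

print(round(pairwise_p, 3))
```

Because three tests are run, it may be worth comparing the p-values against a corrected threshold, for example with `p.adjust(pairwise_p, method = "bonferroni")`.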
Now consider the first intron and the internal intron
Nucleotide | First Intron | Internal Intron |
A | 28 | 28 |
C | 20 | 20 |
G | 22 | 21 |
T | 30 | 31 |
Conduct the same analysis.
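A minimal sketch of the same analysis for the introns follows, again assuming the table values can be treated as counts out of 100. With only two groups there is a single pair to test. The names `introns` and `intron_test` are illustrative.

```r
# Intron values from the table above, treated as counts out of 100
# (an assumption; the real sample sizes are not stated).
introns <- matrix(
  c(28, 20, 22, 30,   # First intron:    A, C, G, T
    28, 20, 21, 31),  # Internal intron: A, C, G, T
  nrow = 4,
  dimnames = list(nucleotide = c("A", "C", "G", "T"),
                  intron = c("First Intron", "Internal Intron"))
)

# Grouped barplot: one group per intron type, one bar per nucleotide
barplot(introns, beside = TRUE, legend.text = rownames(introns),
        ylab = "Frequency", main = "Nucleotide distribution by intron")

# Chi-square test of homogeneity between the two intron types
intron_test <- chisq.test(introns)
print(intron_test)
```

The two columns differ by a single unit in G and T, so the test statistic will be very small and the p-value close to 1.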
Part 2: Sampling
Now that we understand any similarities or differences, let's create some fake DNA strands from those five classifications. Make each strand 300 nucleotides long. Ensure that each DNA strand is consistent with its group's distribution at an alpha level of 0.05.
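One way to read this task is: draw 300 nucleotides with `sample()` using the group's probabilities, then run a chi-square goodness-of-fit test and keep the strand only if the p-value exceeds 0.05. The sketch below follows that reading. The helper `make_strand`, the list `probs`, and the seed are all assumptions made for illustration.

```r
set.seed(1)  # arbitrary seed, only so the sketch is reproducible

bases <- c("A", "C", "G", "T")

# Table values converted to probabilities (assumes they are percentages)
probs <- list(
  first_exon      = c(18, 28, 33, 21) / 100,
  internal_exon   = c(25, 28, 26, 21) / 100,
  last_exon       = c(26, 22, 23, 29) / 100,
  first_intron    = c(28, 20, 22, 30) / 100,
  internal_intron = c(28, 20, 21, 31) / 100
)

# Hypothetical helper: draw one strand and test it against its own distribution
make_strand <- function(p, n = 300) {
  strand <- sample(bases, size = n, replace = TRUE, prob = p)
  counts <- table(factor(strand, levels = bases))
  list(strand = strand, p_value = chisq.test(counts, p = p)$p.value)
}

# Redraw until the strand is consistent with its distribution at alpha = 0.05
strands <- lapply(probs, function(p) {
  repeat {
    s <- make_strand(p)
    if (s$p_value > 0.05) return(s)
  }
})

# First 20 bases of each strand, and the goodness-of-fit p-values
print(sapply(strands, function(s) paste(s$strand[1:20], collapse = "")))
print(sapply(strands, function(s) round(s$p_value, 3)))
```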
Now write for loops to find how many samples it takes before a fake DNA strand no longer conforms to its group's distribution. Use an alpha level of 0.01 for this task.
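A possible sketch of the loop: for each group, keep drawing 300-nucleotide strands and stop at the first one whose goodness-of-fit p-value falls below 0.01. The cap `max_tries`, the seed, and the variable names are assumptions added for illustration.

```r
set.seed(2)  # arbitrary seed, only so the sketch is reproducible

bases <- c("A", "C", "G", "T")

# Table values converted to probabilities (assumes they are percentages)
probs <- list(
  first_exon      = c(18, 28, 33, 21) / 100,
  internal_exon   = c(25, 28, 26, 21) / 100,
  last_exon       = c(26, 22, 23, 29) / 100,
  first_intron    = c(28, 20, 22, 30) / 100,
  internal_intron = c(28, 20, 21, 31) / 100
)

alpha     <- 0.01
max_tries <- 100000  # safety cap so the loop always terminates

samples_needed <- sapply(probs, function(p) {
  for (i in seq_len(max_tries)) {
    strand <- sample(bases, size = 300, replace = TRUE, prob = p)
    counts <- table(factor(strand, levels = bases))
    # Stop at the first strand rejected by the goodness-of-fit test
    if (chisq.test(counts, p = p)$p.value < alpha) return(i)
  }
  NA_integer_  # no rejection within max_tries
})

print(samples_needed)
```

Since every strand is drawn from the distribution it is tested against, each test rejects with probability of roughly alpha, so the number of samples needed should average around 1/alpha = 100, with wide variation from run to run.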
Below is my work so far:
# I have created the data frame I need to work with below.
# Create dataframe from the table
df1=data.frame(nucleotide=c("A","C","G","T"), first.exon=c(28,20,22,30), internal.intron=c(28,20,21,31), last.exon=c(26,22,23,29))
# Print dataframe
print(df1)
# Part 1: Chi-Square Testing
#
# Is there any reason to believe that there are pairwise differences in the exons? To explore this, please do the following:
# 1. Construct a barplot that shows the distribution of the nucleotides across the exons
# Below is what I have tried, but it is not working.
nucleotide = as.factor(df1$nucleotide)
barplot(nucleotide~df1$first.exon, df1$internal, df1$last.exon) # this is giving me a data type error.
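A likely cause of the error is that `barplot()` is being given a factor as the response in a formula, followed by extra columns as positional arguments, whereas it expects a numeric vector or matrix of bar heights. Note also that `df1` above has no `internal` column, and its `first.exon` and `internal.intron` columns hold the intron values rather than the exon values from the first table. The sketch below rebuilds the data frame from the exon table (the name `exon_df` is illustrative) and passes a numeric matrix to `barplot()`.

```r
# Exon values copied from the first table in the question
exon_df <- data.frame(
  nucleotide    = c("A", "C", "G", "T"),
  first.exon    = c(18, 28, 33, 21),
  internal.exon = c(25, 28, 26, 21),
  last.exon     = c(26, 22, 23, 29)
)

# barplot() wants numeric heights: drop the label column and transpose so
# that rows are exons and columns are nucleotides
heights <- t(as.matrix(exon_df[, -1]))
colnames(heights) <- exon_df$nucleotide

# beside = TRUE draws one group of bars per nucleotide, one bar per exon
barplot(heights, beside = TRUE, legend.text = rownames(heights),
        xlab = "Nucleotide", ylab = "Frequency",
        main = "Nucleotide distribution across exons")
```

Dropping the `t()` call and setting `rownames` instead would group the bars by exon rather than by nucleotide, if that view is preferred.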