A 2003 paper in Genome Research conducted a study on the rates of nucleotide frequencies in human genes.

I have summarized the distribution of nucleotides in the first exon, internal exon, and last exon and have placed them here:

Nucleotide   First   Internal   Last
A            18      25         26
C            28      28         22
G            33      26         23
T            21      21         29

Part 1: Chi-Square Testing

Is there any reason to believe that there are pairwise differences in the exons? To explore this, please do the following:

Construct a barplot that shows the distribution of the nucleotides across the exons

Conduct a chi-square test between each of the pairs
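One way to run the pairwise tests is sketched below. It assumes each column of the exon table can be treated as counts out of 100 nucleotides (the columns each sum to 100); if the real sample sizes from the paper are known, they should be used instead. The names `exons`, `exon_tests`, and `exon_p` are just illustrative.

```r
# Exon nucleotide values from the table above.
# Assumption: each column is treated as counts out of 100 nucleotides.
exons <- matrix(
  c(18, 28, 33, 21,   # First
    25, 28, 26, 21,   # Internal
    26, 22, 23, 29),  # Last
  nrow = 4,
  dimnames = list(nucleotide = c("A", "C", "G", "T"),
                  exon = c("First", "Internal", "Last"))
)

# All unordered pairs of exon columns.
exon_pairs <- combn(colnames(exons), 2, simplify = FALSE)

# Chi-square test of homogeneity on each 4 x 2 sub-table.
exon_tests <- lapply(exon_pairs, function(p) chisq.test(exons[, p]))
names(exon_tests) <- sapply(exon_pairs, paste, collapse = " vs ")

# Collect the p-values for a side-by-side comparison.
exon_p <- sapply(exon_tests, function(res) res$p.value)
print(round(exon_p, 4))
```

Each p-value can then be compared with the chosen alpha level. With three comparisons, a multiple-testing adjustment such as `p.adjust(exon_p, method = "bonferroni")` may also be worth considering.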

Now consider the first intron and the internal intron

Nucleotide   First Intron   Internal Intron
A            28             28
C            20             20
G            22             21
T            30             31

Conduct the same analysis.
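The same two steps for the introns might look like the sketch below. As with the exons, it assumes the table values can be treated as counts out of 100; `introns` and `intron_test` are illustrative names.

```r
# Intron nucleotide values from the table above.
# Assumption: each column is treated as counts out of 100 nucleotides.
introns <- matrix(
  c(28, 20, 22, 30,   # First Intron
    28, 20, 21, 31),  # Internal Intron
  nrow = 4,
  dimnames = list(nucleotide = c("A", "C", "G", "T"),
                  intron = c("First Intron", "Internal Intron"))
)

# Grouped barplot: one cluster per intron type, one bar per nucleotide.
barplot(introns, beside = TRUE, legend.text = TRUE,
        ylab = "Nucleotides (out of 100)",
        main = "Nucleotide distribution by intron")

# Only one pair exists here, so a single chi-square test is enough.
intron_test <- chisq.test(introns)
print(intron_test)
```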

Part 2: Sampling

Now that we understand any similarities or differences, let's create some fake DNA strands from each of those five classifications. Make each strand 300 nucleotides long. Ensure that the DNA strands are consistent with their distributions at an alpha level of 0.05.
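A minimal sketch of the sampling step follows. It reads "make sense relative to the distributions" as "a chi-square goodness-of-fit test against the source distribution is not rejected at alpha = 0.05", which is an interpretation rather than something the assignment states. The helpers `make_strand`, `strand_p`, and `draw_conforming` are hypothetical names, and the seed is arbitrary.

```r
set.seed(1)  # arbitrary seed, only for reproducibility

# Values for the five region types, taken from the two tables above.
profiles <- list(
  first.exon      = c(A = 18, C = 28, G = 33, T = 21),
  internal.exon   = c(A = 25, C = 28, G = 26, T = 21),
  last.exon       = c(A = 26, C = 22, G = 23, T = 29),
  first.intron    = c(A = 28, C = 20, G = 22, T = 30),
  internal.intron = c(A = 28, C = 20, G = 21, T = 31)
)

# Hypothetical helper: draw one strand of n nucleotides from a profile.
make_strand <- function(pct, n = 300) {
  sample(names(pct), size = n, replace = TRUE, prob = pct / sum(pct))
}

# Hypothetical helper: goodness-of-fit p-value of a strand against a profile.
strand_p <- function(strand, pct) {
  counts <- as.vector(table(factor(strand, levels = names(pct))))
  chisq.test(counts, p = as.vector(pct / sum(pct)))$p.value
}

# Redraw until the strand is not rejected at the given alpha level.
draw_conforming <- function(pct, n = 300, alpha = 0.05) {
  repeat {
    strand <- make_strand(pct, n)
    if (strand_p(strand, pct) > alpha) return(strand)
  }
}

# One conforming strand per region type, plus its p-value.
strands <- lapply(profiles, draw_conforming)
p_values <- sapply(names(profiles),
                   function(k) strand_p(strands[[k]], profiles[[k]]))
print(round(p_values, 3))
```

A strand can be collapsed into a single string with `paste(strands$first.exon, collapse = "")` if a sequence-style printout is wanted.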

Now write for loops to count how many samples it takes before a fake DNA strand fails to conform to the specification of its group. Use an alpha level of 0.01 for this task.
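The loop could be sketched as follows, assuming "does not conform" means a chi-square goodness-of-fit p-value below 0.01. The function name `samples_until_reject` and the `max_iter` safety cap are assumptions added for illustration, and the seed is arbitrary.

```r
set.seed(2)  # arbitrary seed, only for reproducibility

# Values for the five region types, taken from the two tables above.
profiles <- list(
  first.exon      = c(A = 18, C = 28, G = 33, T = 21),
  internal.exon   = c(A = 25, C = 28, G = 26, T = 21),
  last.exon       = c(A = 26, C = 22, G = 23, T = 29),
  first.intron    = c(A = 28, C = 20, G = 22, T = 30),
  internal.intron = c(A = 28, C = 20, G = 21, T = 31)
)

# Hypothetical helper: count draws until a strand is rejected at alpha.
# max_iter is a safety cap so the loop always terminates.
samples_until_reject <- function(pct, n = 300, alpha = 0.01,
                                 max_iter = 100000) {
  probs <- as.vector(pct / sum(pct))
  for (i in seq_len(max_iter)) {
    strand <- sample(names(pct), size = n, replace = TRUE, prob = probs)
    counts <- as.vector(table(factor(strand, levels = names(pct))))
    if (chisq.test(counts, p = probs)$p.value < alpha) {
      return(i)  # first draw that fails the goodness-of-fit test
    }
  }
  NA_integer_   # cap reached without a rejection
}

# Run the loop once for each of the five region types.
n_needed <- integer(length(profiles))
names(n_needed) <- names(profiles)
for (k in names(profiles)) {
  n_needed[k] <- samples_until_reject(profiles[[k]])
}
print(n_needed)
```

Because a strand drawn from its own distribution is rejected with probability roughly equal to alpha, the counts should be on the order of 1/alpha (around 100 draws on average), though individual runs vary widely.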

Below is my work so far:

# I have created the data frame I need to work with below.
# Create the data frame from the exon table.
df1 <- data.frame(
  nucleotide = c("A", "C", "G", "T"),
  first.exon = c(18, 28, 33, 21),
  internal.exon = c(25, 28, 26, 21),
  last.exon = c(26, 22, 23, 29)
)
# Print the data frame.
print(df1)

# Part 1: Chi-Square Testing
#
# Is there any reason to believe that there are pairwise differences in the exons? To explore this, please do the following:
# 1. Construct a barplot that shows the distribution of the nucleotides across the exons

# Below is what I have tried, but it is not working.

nucleotide = as.factor(df1$nucleotide)

barplot(nucleotide~df1$first.exon, df1$internal, df1$last.exon) # this is giving me a data type error.
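A likely cause of the error is that `barplot()` expects a numeric vector or matrix of bar heights as its first argument, while the call above passes a formula with a factor on the left-hand side followed by bare numeric vectors. A minimal sketch of one possible fix, using the exon values from the table in the question, converts the numeric columns to a matrix first. The name `heights` is illustrative.

```r
# Exon values from the table in the question.
df1 <- data.frame(
  nucleotide    = c("A", "C", "G", "T"),
  first.exon    = c(18, 28, 33, 21),
  internal.exon = c(25, 28, 26, 21),
  last.exon     = c(26, 22, 23, 29)
)

# Drop the label column and keep the numeric columns as a matrix,
# with nucleotides as row names so they appear in the legend.
heights <- as.matrix(df1[, -1])
rownames(heights) <- df1$nucleotide

# Grouped barplot: one cluster per exon, one bar per nucleotide.
barplot(heights, beside = TRUE, legend.text = TRUE,
        args.legend = list(x = "topright"),
        xlab = "Exon", ylab = "Nucleotides (out of 100)",
        main = "Nucleotide distribution by exon")
```

Setting `beside = FALSE` would give stacked bars instead, which can make it easier to see that each exon sums to 100.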
