Question


All parts please


Q.3 The DNA of every gene is formed as a particular sequence of four possible bases, labelled A, C, G and T, respectively. In a particular gene sub-sequence of length 1572, the following co-occurrence matrix was recorded:

$$C = \begin{pmatrix} 185 & 101 & 69 & 161 \\ 74 & 41 & 45 & 103 \\ 86 & 6 & 34 & 100 \\ 171 & 115 & 78 & 202 \end{pmatrix}$$

Here, the (i, j)th count, $c_{ij}$, $1 \le i, j \le 4$, is the number of transitions from base j to base i (with the bases ordered as listed above) that occur at adjacent locations along the sub-sequence.

(a) A machine learning algorithm estimates the parameters of a homogeneous Markov chain (HMC) model of the gene DNA by consistently processing C. Using these estimates, address the following:

(i) if all four bases are equiprobable at a particular position along the DNA, what is the probability that the base at the 3rd-next position is equal to the base at the 7th-next position? [25 %]

(ii) if base A is found at a position along the DNA in the long-run, what is the probability that either base A or D occurs four positions earlier? [25 %]

(b) A different machine learning algorithm assumes that the bases are independently and identically distributed (iid) at every position along the DNA. Once again consistently processing C, compute the following inferences:

(i) the probability that the next base A will be observed between 5 and 9 positions from the current position. [25 %]

(ii) the most probable numbers of times that the four bases will occur in a length-9 sequence, and the probability that these are the actual numbers that will occur. [25 %]
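One possible way to set up the computations for this question is sketched below in plain Python. It is not the page's solution, only an illustration under several assumptions that the question does not state explicitly: the 16 counts are read row by row into a 4x4 matrix; "consistently processing C" is taken to mean column-normalising the counts (so that `T[i][j]` estimates the probability of moving from base j to base i); the long-run and iid base probabilities are both taken from the column totals divided by the 1571 transitions; and "between 5 and 9 positions" is treated as inclusive. Part (a)(ii) mentions a base "D" that is not among the four listed, so the sketch reports the posterior for each base four positions earlier and leaves the choice of the second base open. All variable and function names are illustrative.

```python
from itertools import product
from math import factorial

BASES = "ACGT"
# Counts read row by row (assumption): C[i][j] = transitions from base j to base i.
C = [
    [185, 101, 69, 161],
    [74, 41, 45, 103],
    [86, 6, 34, 100],
    [171, 115, 78, 202],
]

col_sums = [sum(C[i][j] for i in range(4)) for j in range(4)]
total = sum(col_sums)  # number of adjacent pairs in the sub-sequence

# Column-stochastic estimate: T[i][j] ~ P(next base = i | current base = j).
T = [[C[i][j] / col_sums[j] for j in range(4)] for i in range(4)]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def matpow(X, n):
    R = [[float(i == j) for j in range(4)] for i in range(4)]  # identity
    for _ in range(n):
        R = matmul(R, X)
    return R

def matvec(X, v):
    return [sum(X[i][j] * v[j] for j in range(4)) for i in range(4)]

# Marginal base frequencies. Row and column totals of C coincide here, so this
# vector is also a fixed point of T (a stationary distribution).
pi = [s / total for s in col_sums]

# (a)(i): distribution three steps ahead of a uniform start, then the chance
# of returning to the same base after four further steps.
T3, T4 = matpow(T, 3), matpow(T, 4)
p3 = matvec(T3, [0.25] * 4)
p_same = sum(T4[i][i] * p3[i] for i in range(4))

# (a)(ii): Bayes' rule in the stationary regime,
# P(X[n-4] = j | X[n] = A) = T4[A][j] * pi[j] / pi[A].
posterior_given_A = [T4[0][j] * pi[j] / pi[0] for j in range(4)]

# (b)(i): iid model, first A at waiting time k has probability q**(k-1) * p;
# summing k = 5..9 (inclusive, an assumption) gives q**4 - q**9.
q = 1.0 - pi[0]
p_next_A = q ** 4 - q ** 9

# (b)(ii): brute-force search for the mode of a multinomial with n = 9.
def multinomial_pmf(counts, probs):
    coef = factorial(sum(counts))
    for n in counts:
        coef //= factorial(n)
    p = float(coef)
    for n, pr in zip(counts, probs):
        p *= pr ** n
    return p

candidates = [c for c in product(range(10), repeat=4) if sum(c) == 9]
best = max(candidates, key=lambda c: multinomial_pmf(c, pi))
best_prob = multinomial_pmf(best, pi)

print("(a)(i)  P(same base at +3 and +7):", p_same)
print("(a)(ii) posterior four positions earlier:", dict(zip(BASES, posterior_given_A)))
print("(b)(i)  P(next A at 5..9):", p_next_A)
print("(b)(ii) most probable counts:", dict(zip(BASES, best)), "prob:", best_prob)
```

The brute-force search in (b)(ii) is a design choice for clarity: with only 220 ways to split 9 positions among four bases, enumerating them avoids relying on a closed-form rule for the multinomial mode.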
