
Question


This page is taken from Bishop, Pattern Recognition and Machine Learning (Springer). Please help me derive and prove equations (1.70), (1.71), and (1.72) from (1.69).
In the curve fitting problem, we are given the training data $\mathbf{x}$ and $\mathbf{t}$, along with a new test point $x$, and our goal is to predict the value of $t$. We therefore wish to evaluate the predictive distribution $p(t \mid x, \mathbf{x}, \mathbf{t})$. Here we shall assume that the parameters $\alpha$ and $\beta$ are fixed and known in advance (in later chapters we shall discuss how such parameters can be inferred from data in a Bayesian setting).

A Bayesian treatment simply corresponds to a consistent application of the sum and product rules of probability, which allow the predictive distribution to be written in the form

$$p(t \mid x, \mathbf{x}, \mathbf{t}) = \int p(t \mid x, \mathbf{w})\, p(\mathbf{w} \mid \mathbf{x}, \mathbf{t})\, \mathrm{d}\mathbf{w}. \tag{1.68}$$

Here $p(t \mid x, \mathbf{w})$ is given by (1.60), and we have omitted the dependence on $\alpha$ and $\beta$ to simplify the notation. Here $p(\mathbf{w} \mid \mathbf{x}, \mathbf{t})$ is the posterior distribution over parameters, and can be found by normalizing the right-hand side of (1.66). We shall see in Section 3.3 that, for problems such as the curve-fitting example, this posterior distribution is a Gaussian and can be evaluated analytically. Similarly, the integration in (1.68) can also be performed analytically, with the result that the predictive distribution is given by a Gaussian of the form

$$p(t \mid x, \mathbf{x}, \mathbf{t}) = \mathcal{N}\big(t \mid m(x),\, s^2(x)\big) \tag{1.69}$$

where the mean and variance are given by

$$m(x) = \beta\, \boldsymbol{\phi}(x)^{\mathsf{T}} \mathbf{S} \sum_{n=1}^{N} \boldsymbol{\phi}(x_n)\, t_n \tag{1.70}$$

$$s^2(x) = \beta^{-1} + \boldsymbol{\phi}(x)^{\mathsf{T}} \mathbf{S}\, \boldsymbol{\phi}(x). \tag{1.71}$$

Here the matrix $\mathbf{S}$ is given by

$$\mathbf{S}^{-1} = \alpha \mathbf{I} + \beta \sum_{n=1}^{N} \boldsymbol{\phi}(x_n)\, \boldsymbol{\phi}(x_n)^{\mathsf{T}} \tag{1.72}$$

where $\mathbf{I}$ is the unit matrix, and the vector $\boldsymbol{\phi}(x)$ has elements $\phi_i(x) = x^i$ for $i = 0, \dots, M$.
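For the derivation below, it helps to recall the two ingredients that enter (1.68), paraphrasing the earlier equations of the same section (Bishop's (1.60) and (1.65)-(1.66)):

$$p(t \mid x, \mathbf{w}) = \mathcal{N}\big(t \mid y(x, \mathbf{w}),\, \beta^{-1}\big), \qquad y(x, \mathbf{w}) = \mathbf{w}^{\mathsf{T}} \boldsymbol{\phi}(x),$$

$$p(\mathbf{w} \mid \alpha) = \mathcal{N}\big(\mathbf{w} \mid \mathbf{0},\, \alpha^{-1} \mathbf{I}\big), \qquad p(\mathbf{w} \mid \mathbf{x}, \mathbf{t}) \propto p(\mathbf{t} \mid \mathbf{x}, \mathbf{w})\, p(\mathbf{w} \mid \alpha).$$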

Step-by-Step Solution

The solution involves three steps.

Step: 1

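Derive the posterior over $\mathbf{w}$, which yields (1.72). This is a sketch of the standard calculation (the same one Bishop carries out more generally in Section 3.3). Taking the log of the posterior and inserting the Gaussian likelihood and prior above,

$$\ln p(\mathbf{w} \mid \mathbf{x}, \mathbf{t}) = -\frac{\beta}{2} \sum_{n=1}^{N} \big(t_n - \mathbf{w}^{\mathsf{T}} \boldsymbol{\phi}(x_n)\big)^2 - \frac{\alpha}{2} \mathbf{w}^{\mathsf{T}} \mathbf{w} + \text{const}.$$

Expanding the squares and collecting the quadratic and linear terms in $\mathbf{w}$,

$$\ln p(\mathbf{w} \mid \mathbf{x}, \mathbf{t}) = -\frac{1}{2} \mathbf{w}^{\mathsf{T}} \Big( \alpha \mathbf{I} + \beta \sum_{n=1}^{N} \boldsymbol{\phi}(x_n) \boldsymbol{\phi}(x_n)^{\mathsf{T}} \Big) \mathbf{w} + \beta\, \mathbf{w}^{\mathsf{T}} \sum_{n=1}^{N} \boldsymbol{\phi}(x_n)\, t_n + \text{const}.$$

Since this is quadratic in $\mathbf{w}$, the posterior is Gaussian, and completing the square gives

$$p(\mathbf{w} \mid \mathbf{x}, \mathbf{t}) = \mathcal{N}(\mathbf{w} \mid \mathbf{m}_N, \mathbf{S}), \qquad \mathbf{S}^{-1} = \alpha \mathbf{I} + \beta \sum_{n=1}^{N} \boldsymbol{\phi}(x_n) \boldsymbol{\phi}(x_n)^{\mathsf{T}}, \qquad \mathbf{m}_N = \beta\, \mathbf{S} \sum_{n=1}^{N} \boldsymbol{\phi}(x_n)\, t_n.$$

The expression for $\mathbf{S}^{-1}$ is exactly (1.72).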


Step: 2

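Evaluate the integral (1.68). Because $t = \mathbf{w}^{\mathsf{T}} \boldsymbol{\phi}(x) + \epsilon$ with noise $\epsilon \sim \mathcal{N}(0, \beta^{-1})$ independent of $\mathbf{w}$, this is the marginal of a linear-Gaussian model, which Bishop gives in closed form as (2.115):

$$p(t \mid x, \mathbf{x}, \mathbf{t}) = \int \mathcal{N}\big(t \mid \mathbf{w}^{\mathsf{T}} \boldsymbol{\phi}(x),\, \beta^{-1}\big)\, \mathcal{N}(\mathbf{w} \mid \mathbf{m}_N, \mathbf{S})\, \mathrm{d}\mathbf{w} = \mathcal{N}\big(t \mid \mathbf{m}_N^{\mathsf{T}} \boldsymbol{\phi}(x),\, \beta^{-1} + \boldsymbol{\phi}(x)^{\mathsf{T}} \mathbf{S}\, \boldsymbol{\phi}(x)\big).$$

The mean and variance can also be read off directly: $\mathbb{E}[t] = \mathbb{E}[\mathbf{w}]^{\mathsf{T}} \boldsymbol{\phi}(x) = \mathbf{m}_N^{\mathsf{T}} \boldsymbol{\phi}(x)$, and $\operatorname{var}[t] = \beta^{-1} + \boldsymbol{\phi}(x)^{\mathsf{T}} \operatorname{cov}[\mathbf{w}]\, \boldsymbol{\phi}(x)$, since the observation noise and the parameter uncertainty are independent and their variances add.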

Step: 3

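Substitute $\mathbf{m}_N$ from Step 1 into the result of Step 2 and identify it with the Gaussian form (1.69):

$$m(x) = \mathbf{m}_N^{\mathsf{T}} \boldsymbol{\phi}(x) = \beta\, \boldsymbol{\phi}(x)^{\mathsf{T}} \mathbf{S} \sum_{n=1}^{N} \boldsymbol{\phi}(x_n)\, t_n, \qquad s^2(x) = \beta^{-1} + \boldsymbol{\phi}(x)^{\mathsf{T}} \mathbf{S}\, \boldsymbol{\phi}(x),$$

which are precisely (1.70) and (1.71), with $\mathbf{S}$ given by (1.72) from Step 1. This completes the derivation.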

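Though Bishop's text contains no code, a quick numerical sanity check can confirm the algebra above. The sketch below is a minimal illustration assuming NumPy; the constants (alpha, beta, M, N) and the helper phi are illustrative choices, not from the book. It evaluates (1.70)-(1.72) on toy data and compares the result with a Monte Carlo estimate obtained by sampling $\mathbf{w}$ from the posterior of Step 1.

```python
import numpy as np

# Illustrative sanity check of Bishop Eqs. (1.70)-(1.72); all constants are arbitrary choices.
rng = np.random.default_rng(0)
alpha, beta, M, N = 2.0, 25.0, 3, 10   # prior precision, noise precision, poly degree, data size

def phi(x):
    """Polynomial basis vector with elements phi_i(x) = x**i, i = 0..M."""
    return np.power(x, np.arange(M + 1))

# Toy training data: noisy samples of sin(2*pi*x), as in Bishop's running example.
x_train = rng.uniform(0.0, 1.0, size=N)
t_train = np.sin(2 * np.pi * x_train) + rng.normal(0.0, beta ** -0.5, size=N)
Phi = np.stack([phi(x) for x in x_train])          # N x (M+1) design matrix

# Eq. (1.72): S^{-1} = alpha*I + beta * sum_n phi(x_n) phi(x_n)^T  (= alpha*I + beta*Phi^T Phi)
S = np.linalg.inv(alpha * np.eye(M + 1) + beta * Phi.T @ Phi)

x_new = 0.5
# Eq. (1.70): m(x) = beta * phi(x)^T S sum_n phi(x_n) t_n
m = beta * phi(x_new) @ S @ (Phi.T @ t_train)
# Eq. (1.71): s^2(x) = 1/beta + phi(x)^T S phi(x)
s2 = 1.0 / beta + phi(x_new) @ S @ phi(x_new)

# Monte Carlo check: sample w ~ N(m_N, S) from the Step-1 posterior, then t = w^T phi(x) + noise.
m_N = beta * S @ (Phi.T @ t_train)
w_samples = rng.multivariate_normal(m_N, S, size=200_000)
t_samples = w_samples @ phi(x_new) + rng.normal(0.0, beta ** -0.5, size=200_000)

print(f"closed form: m = {m:.4f}, s^2 = {s2:.5f}")
print(f"Monte Carlo: m = {t_samples.mean():.4f}, s^2 = {t_samples.var():.5f}")
```

The two printed lines should agree to within Monte Carlo error, which checks both the posterior of Step 1 and the marginalization of Steps 2-3.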

