Questions and Answers: Pattern Recognition and Machine Learning
2.28 ( ) www Consider a joint distribution over the variable z = (x^T, y^T)^T (2.290) whose mean and covariance are given by (2.108) and (2.105) respectively. By making use of the results (2.92) and (2.93) …
2.27 () Let x and z be two independent random vectors, so that p(x, z) = p(x)p(z). Show that the mean of their sum y = x + z is given by the sum of the means of each of the variables separately.
2.26 ( ) A very useful result from linear algebra is the Woodbury matrix inversion formula, given by (A + BCD)^{-1} = A^{-1} − A^{-1}B(C^{-1} + DA^{-1}B)^{-1}DA^{-1}. (2.289) By multiplying both sides by (A + BCD) …
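Not part of the exercise, but a quick numerical sanity check of (2.289), assuming NumPy; the matrix shapes and values below are arbitrary choices, not from the book:

import numpy as np

rng = np.random.default_rng(0)
A = np.diag(rng.uniform(1.0, 2.0, size=4))   # well-conditioned 4x4
B = rng.normal(size=(4, 2))
C = np.eye(2)
D = rng.normal(size=(2, 4))

Ainv = np.linalg.inv(A)
lhs = np.linalg.inv(A + B @ C @ D)
rhs = Ainv - Ainv @ B @ np.linalg.inv(np.linalg.inv(C) + D @ Ainv @ B) @ D @ Ainv
assert np.allclose(lhs, rhs)   # both sides of the Woodbury identity agree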
2.25 ( ) In Sections 2.3.1 and 2.3.2, we considered the conditional and marginal distributions for a multivariate Gaussian. More generally, we can consider a partitioning of the components of x into …
2.24 ( ) www Prove the identity (2.76) by multiplying both sides by the matrix \begin{pmatrix} A & B \\ C & D \end{pmatrix} (2.287) and making use of the definition (2.77).
2.23 ( ) By diagonalizing the coordinate system using the eigenvector expansion (2.45), show that the volume contained within the hyperellipsoid corresponding to a constant Mahalanobis distance Δ is …
2.22 () www Show that the inverse of a symmetric matrix is itself symmetric.
2.21 () Show that a real, symmetric matrix of size D×D has D(D+1)/2 independent parameters.
2.20 ( ) www A positive definite matrix Σ can be defined as one for which the quadratic form a^T Σ a (2.285) is positive for any real value of the vector a. Show that a necessary and sufficient …
2.19 ( ) Show that a real, symmetric matrix Σ having the eigenvector equation (2.45) can be expressed as an expansion in the eigenvectors, with coefficients given by the eigenvalues, of the form Σ = \sum_{i=1}^{D} λ_i u_i u_i^T.
2.18 ( ) Consider a real, symmetric matrix Σ whose eigenvalue equation is given by (2.45). By taking the complex conjugate of this equation and subtracting the original equation, and then forming …
2.17 () www Consider the multivariate Gaussian distribution given by (2.43). By writing the precision matrix (inverse covariance matrix) Σ^{-1} as the sum of a symmetric and an anti-symmetric …
2.16 ( ) www Consider two random variables x1 and x2 having Gaussian distributions with means μ1, μ2 and precisions τ1, τ2 respectively. Derive an expression for the differential entropy of the …
2.15 ( ) Show that the entropy of the multivariate Gaussian N(x|μ, Σ) is given by H[x] = (1/2) ln|Σ| + (D/2)(1 + ln(2π)) (2.283) where D is the dimensionality of x.
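A minimal numerical check of (2.283), assuming NumPy and SciPy; the covariance matrix below is an arbitrary positive-definite example:

import numpy as np
from scipy.stats import multivariate_normal

Sigma = np.array([[2.0, 0.3, 0.0],
                  [0.3, 1.0, 0.2],
                  [0.0, 0.2, 1.5]])
D = Sigma.shape[0]
H_formula = 0.5 * np.log(np.linalg.det(Sigma)) + 0.5 * D * (1 + np.log(2 * np.pi))
H_scipy = multivariate_normal(mean=np.zeros(D), cov=Sigma).entropy()
assert np.isclose(H_formula, H_scipy)   # closed form matches SciPy's entropy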
2.14 ( ) www This exercise demonstrates that the multivariate distribution with maximum entropy, for a given covariance, is a Gaussian. The entropy of a distribution p(x) is given by H[x] = −p(x)
2.13 ( ) Evaluate the Kullback-Leibler divergence (1.113) between two Gaussians p(x) = N(x|μ,Σ) and q(x) = N(x|m,L).
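For reference, a sketch of the standard closed form for the KL divergence between two Gaussians, cross-checked by Monte Carlo; this assumes NumPy/SciPy, treats L as the covariance of q, and uses arbitrary example parameters:

import numpy as np
from scipy.stats import multivariate_normal

def kl_gauss(mu, Sigma, m, L):
    # Closed-form KL( N(mu, Sigma) || N(m, L) )
    D = len(mu)
    Linv = np.linalg.inv(L)
    diff = m - mu
    return 0.5 * (np.trace(Linv @ Sigma) + diff @ Linv @ diff - D
                  + np.log(np.linalg.det(L) / np.linalg.det(Sigma)))

mu, Sigma = np.zeros(2), np.eye(2)
m, L = np.array([1.0, -0.5]), np.array([[2.0, 0.4], [0.4, 1.0]])
x = np.random.default_rng(1).multivariate_normal(mu, Sigma, size=200_000)
mc = np.mean(multivariate_normal(mu, Sigma).logpdf(x)
             - multivariate_normal(m, L).logpdf(x))
print(kl_gauss(mu, Sigma, m, L), mc)   # the two estimates should agree closely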
2.12 () The uniform distribution for a continuous variable x is defined by U(x|a, b) = 1/(b − a) for a ≤ x ≤ b. (2.278) Verify that this distribution is normalized, and find expressions for its mean and variance.
2.11 () www By expressing the expectation of ln μj under the Dirichlet distribution (2.38) as a derivative with respect to αj, show that E[ln μj] = ψ(αj) − ψ(α0) (2.276) where α0 = \sum_k αk …
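A quick Monte Carlo check of (2.276), assuming NumPy and SciPy (ψ is the digamma function; the concentration parameters are arbitrary):

import numpy as np
from scipy.special import digamma

alpha = np.array([2.0, 3.0, 5.0])
samples = np.random.default_rng(2).dirichlet(alpha, size=200_000)
mc = np.log(samples).mean(axis=0)               # Monte Carlo E[ln mu_j]
closed = digamma(alpha) - digamma(alpha.sum())  # psi(alpha_j) - psi(alpha_0)
print(closed)
print(mc)   # should match to Monte Carlo accuracy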
2.10 ( ) Using the property Γ(x + 1) = xΓ(x) of the gamma function, derive the following results for the mean, variance, and covariance of the Dirichlet distribution given by (2.38): E[μj] = αj/α0 …
2.9 ( ) www In this exercise, we prove the normalization of the Dirichlet distribution (2.38) using induction. We have already shown in Exercise 2.5 that the beta distribution, which is a special …
2.8 () Consider two variables x and y with joint distribution p(x, y). Prove the following two results: E[x] = E_y[E_x[x|y]] (2.270) and var[x] = E_y[var_x[x|y]] + var_y[E_x[x|y]]. (2.271) Here E_x[x|y] denotes the expectation of x under the conditional distribution p(x|y), with a similar notation for the conditional variance.
2.7 ( ) Consider a binomial random variable x given by (2.9), with prior distribution for μ given by the beta distribution (2.13), and suppose we have observed m occurrences of x = 1 and l occurrences of x = 0. …
2.6 () Make use of the result (2.265) to show that the mean, variance, and mode of the beta distribution (2.13) are given respectively by E[μ] = a/(a + b) (2.267), var[μ] = ab/((a + b)^2 (a + b + 1)), and mode[μ] = (a − 1)/(a + b − 2).
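These moments are easy to check numerically; a minimal sketch assuming SciPy (the values of a and b are arbitrary):

import numpy as np
from scipy.stats import beta

a, b = 3.0, 5.0
mean, var = beta.stats(a, b, moments='mv')
assert np.isclose(mean, a / (a + b))
assert np.isclose(var, a * b / ((a + b)**2 * (a + b + 1)))
grid = np.linspace(0.0, 1.0, 100_001)
mode = grid[np.argmax(beta.pdf(grid, a, b))]   # locate the density's peak
assert np.isclose(mode, (a - 1) / (a + b - 2), atol=1e-4)   # valid for a, b > 1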
2.5 ( ) www In this exercise, we prove that the beta distribution, given by (2.13), is correctly normalized, so that (2.14) holds. This is equivalent to showing that ∫_0^1 μ^{a−1} (1 − μ)^{b−1} dμ = Γ(a)Γ(b)/Γ(a + b). (2.265)
2.4 ( ) Show that the mean of the binomial distribution is given by (2.11). To do this, differentiate both sides of the normalization condition (2.264) with respect to μ and then rearrange to obtain …
2.3 ( ) www In this exercise, we prove that the binomial distribution (2.9) is normalized. First use the definition (2.10) of the number of combinations of m identical objects chosen from a total of N …
2.2 () The form of the Bernoulli distribution given by (2.2) is not symmetric between the two values of x. In some situations, it will be more convenient to use an equivalent formulation for which x ∈ {−1, 1}. …
2.1 () www Verify that the Bernoulli distribution (2.2) satisfies the following properties: \sum_{x=0}^{1} p(x|μ) = 1 (2.257), E[x] = μ (2.258), var[x] = μ(1 − μ). (2.259) Show that the entropy H[x] of a Bernoulli distributed random variable x is given by H[x] = −μ ln μ − (1 − μ) ln(1 − μ).
1.30 ( ) Evaluate the Kullback-Leibler divergence (1.113) between two Gaussians p(x) = N(x|μ, σ2) and q(x) = N(x|m, s2).
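A sketch of the standard closed form for this univariate Gaussian KL divergence, cross-checked by Monte Carlo; this assumes NumPy/SciPy and arbitrary example parameters:

import numpy as np
from scipy.stats import norm

mu, sigma, m, s = 0.0, 1.0, 1.0, 2.0
# KL( N(mu, sigma^2) || N(m, s^2) ) in closed form
kl_closed = np.log(s / sigma) + (sigma**2 + (mu - m)**2) / (2 * s**2) - 0.5
x = np.random.default_rng(3).normal(mu, sigma, size=500_000)
kl_mc = np.mean(norm.logpdf(x, loc=mu, scale=sigma) - norm.logpdf(x, loc=m, scale=s))
print(kl_closed, kl_mc)   # should agree to Monte Carlo accuracy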
1.29 () www Consider an M-state discrete random variable x, and use Jensen’s inequality in the form (1.115) to show that the entropy of its distribution p(x) satisfies H[x] ≤ ln M.
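The bound is easy to probe numerically; a minimal check assuming NumPy (drawing random distributions from a Dirichlet is an arbitrary choice):

import numpy as np

rng = np.random.default_rng(4)
M = 6
for _ in range(1000):
    p = rng.dirichlet(np.ones(M))      # a random distribution over M states
    H = -np.sum(p * np.log(p))
    assert H <= np.log(M) + 1e-9       # H[x] <= ln M
# equality holds only for the uniform distribution p(x) = 1/M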
1.28 () In Section 1.6, we introduced the idea of entropy h(x) as the information gained on observing the value of a random variable x having distribution p(x). We saw that, for independent …
1.27 ( ) www Consider the expected loss for regression problems under the Lq loss function given by (1.91). Write down the condition that y(x) must satisfy in order to minimize E[Lq]. Show that, …
1.26 () By expansion of the square in (1.151), derive a result analogous to (1.90) and hence show that the function y(x) that minimizes the expected squared loss for the case of a vector t of target variables is again given by the conditional expectation of t.
1.25 () www Consider the generalization of the squared loss function (1.87) for a single target variable t to the case of multiple target variables described by the vector t, given by E[L(t, y(x))] = ∫∫ ‖y(x) − t‖^2 p(x, t) dx dt. (1.151)
1.24 ( ) www Consider a classification problem in which the loss incurred when an input vector from class Ck is classified as belonging to class Cj is given by the loss matrix Lkj, and for which …
1.23 () Derive the criterion for minimizing the expected loss when there is a general loss matrix and general prior probabilities for the classes.
1.22 () www Given a loss matrix with elements Lkj, the expected risk is minimized if, for each x, we choose the class that minimizes (1.81). Verify that, when the loss matrix is given by Lkj = 1 − Ikj, where Ikj are the elements of the identity matrix, this reduces to the criterion of choosing the class having the largest posterior probability.
1.37 () Using the definition (1.111) together with the product rule of probability, prove the result (1.112).
1.31 ( ) www Consider two variables x and y having joint distribution p(x, y). Show that the differential entropy of this pair of variables satisfies H[x, y] ≤ H[x] + H[y] (1.152) with equality if, and only if, x and y are statistically independent.
1.41 () www Using the sum and product rules of probability, show that the mutual information I(x, y) satisfies the relation (1.121).
1.40 () By applying Jensen’s inequality (1.115) with f(x) = ln x, show that the arithmetic mean of a set of real numbers is never less than their geometrical mean.
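A quick numerical illustration, assuming NumPy; note the geometric mean requires the numbers to be positive:

import numpy as np

x = np.random.default_rng(5).uniform(0.1, 10.0, size=1000)   # positive reals
arithmetic = x.mean()
geometric = np.exp(np.log(x).mean())   # geometric mean computed via logs
assert arithmetic >= geometric         # AM >= GM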
1.39 ( ) Consider two binary variables x and y having the joint distribution given in Table 1.3. Evaluate the following quantities: (a) H[x], (b) H[y], (c) H[y|x], (d) H[x|y], (e) H[x, y], (f) I[x, y].
1.38 ( ) www Using proof by induction, show that the inequality (1.114) for convex functions implies the result (1.115).
1.36 () A strictly convex function is defined as one for which every chord lies above the function. Show that this is equivalent to the condition that the second derivative of the function be positive.
1.35 () www Use the results (1.106) and (1.107) to show that the entropy of the univariate Gaussian (1.109) is given by (1.110).
1.34 ( ) www Use the calculus of variations to show that the stationary point of the functional (1.108) is given by (1.108). Then use the constraints (1.105), (1.106), and (1.107) to eliminate the Lagrange multipliers …
1.33 ( ) Suppose that the conditional entropy H[y|x] between two discrete random variables x and y is zero. Show that, for all values of x such that p(x) > 0, the variable y must be a function of x. …
1.32 () Consider a vector x of continuous variables with distribution p(x) and corresponding entropy H[x]. Suppose that we make a nonsingular linear transformation of x to obtain a new variable y = Ax. …
1.21 ( ) Consider two nonnegative numbers a and b, and show that, if a ≤ b, then a ≤ (ab)^{1/2}. Use this result to show that, if the decision regions of a two-class classification problem are chosen to …
1.20 ( ) www In this exercise, we explore the behaviour of the Gaussian distribution in high-dimensional spaces. Consider a Gaussian distribution in D dimensions given by p(x) = (2πσ^2)^{−D/2} exp(−‖x‖^2/(2σ^2)). …
1.19 ( ) Consider a sphere of radius a in D-dimensions together with the concentric hypercube of side 2a, so that the sphere touches the hypercube at the centres of each of its sides. By using the …
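Using the standard closed form for the volume of a radius-a ball, π^{D/2} a^D / Γ(D/2 + 1) (quoted from general knowledge, not from the book), the sphere-to-cube volume ratio can be tabulated; a minimal sketch assuming SciPy:

import numpy as np
from scipy.special import gamma

def ratio(D):
    # (volume of radius-a ball) / (volume of side-2a cube); the factor a^D cancels
    return np.pi**(D / 2) / (2**D * gamma(D / 2 + 1))

for D in (1, 2, 3, 10, 100):
    print(D, ratio(D))   # 1.0, pi/4, pi/6, ... -> the ratio vanishes as D grows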
1.7 ( ) www In this exercise, we prove the normalization condition (1.48) for the univariate Gaussian. To do this, consider the integral I = ∫_{−∞}^{∞} exp(−x^2/(2σ^2)) dx (1.124) which we can …
1.6 () Show that if two variables x and y are independent, then their covariance is zero.
1.5 () Using the definition (1.38) show that var[f(x)] satisfies (1.39).
1.4 ( ) www Consider a probability density px(x) defined over a continuous variable x, and suppose that we make a nonlinear change of variable using x = g(y), so that the density transforms …
1.3 ( ) Suppose that we have three coloured boxes r (red), b (blue), and g (green). Box r contains 3 apples, 4 oranges, and 3 limes, box b contains 1 apple, 1 orange, and 0 limes, and box g contains …
1.2 () Write down the set of coupled linear equations, analogous to (1.122), satisfied by the coefficients wi which minimize the regularized sum-of-squares error function given by (1.4).
1.1 () www Consider the sum-of-squares error function given by (1.2) in which the function y(x, w) is given by the polynomial (1.1). Show that the coefficients w = {wi} that minimize this error function are given by the solution to the set of linear equations \sum_j Aij wj = Ti (1.122), where Aij = \sum_n (xn)^{i+j} and Ti = \sum_n (xn)^i tn.
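A minimal sketch of solving these linear equations numerically, assuming NumPy; with lam = 0 it corresponds to Exercise 1.1 and with lam > 0 to the regularized system of Exercise 1.2 (the data x, t below are synthetic):

import numpy as np

def fit_poly(x, t, M, lam=0.0):
    # Solve sum_j A_ij w_j (+ lam * w_i) = T_i with
    # A_ij = sum_n x_n^(i+j) and T_i = sum_n x_n^i * t_n
    powers = np.arange(M + 1)
    A = np.array([[np.sum(x ** (i + j)) for j in powers] for i in powers])
    T = np.array([np.sum(x ** i * t) for i in powers])
    return np.linalg.solve(A + lam * np.eye(M + 1), T)

rng = np.random.default_rng(6)
x = np.linspace(0.0, 1.0, 10)
t = np.sin(2 * np.pi * x) + 0.1 * rng.normal(size=x.size)
w = fit_poly(x, t, M=3, lam=1e-3)
print(w)   # polynomial coefficients w_0 ... w_M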
1.15 ( ) www In this exercise and the next, we explore how the number of independent parameters in a polynomial grows with the order M of the polynomial and with the dimensionality D of the input space. …
1.8 ( ) www By using a change of variables, verify that the univariate Gaussian distribution given by (1.46) satisfies (1.49). Next, by differentiating both sides of the normalization condition …
1.9 () www Show that the mode (i.e. the maximum) of the Gaussian distribution (1.46) is given by μ. Similarly, show that the mode of the multivariate Gaussian (1.52) is given by μ.
1.10 () www Suppose that the two variables x and z are statistically independent. Show that the mean and variance of their sum satisfy E[x + z] = E[x] + E[z] (1.128) and var[x + z] = var[x] + var[z].
1.17 ( ) www The gamma function is defined by Γ(x) ≡ ∫_0^∞ u^{x−1} e^{−u} du. (1.141) Using integration by parts, prove the relation Γ(x + 1) = xΓ(x). Show also that Γ(1) = 1 and hence that Γ(x + 1) = x! when x is an integer.
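These gamma-function identities are easy to verify numerically; a quick check assuming SciPy:

import numpy as np
from scipy.special import gamma

for x in (0.5, 1.0, 2.7, 6.0):
    assert np.isclose(gamma(x + 1), x * gamma(x))   # Gamma(x+1) = x * Gamma(x)
assert np.isclose(gamma(1.0), 1.0)                  # Gamma(1) = 1
assert np.isclose(gamma(6.0), 120.0)                # Gamma(n+1) = n!, here 5! = 120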
1.16 ( ) In Exercise 1.15, we proved the result (1.135) for the number of independent parameters in the Mth order term of a D-dimensional polynomial. We now find an expression for the total …
1.14 ( ) Show that an arbitrary square matrix with elements wij can be written in the form wij = w^S_ij + w^A_ij, where w^S_ij and w^A_ij are symmetric and anti-symmetric matrices, respectively, …
1.13 () Suppose that the variance of a Gaussian is estimated using the result (1.56) but with the maximum likelihood estimate μML replaced with the true value μ of the mean. Show that this estimator has the property that its expectation is given by the true variance σ^2.
1.12 ( ) www Using the results (1.49) and (1.50), show that E[xn xm] = μ^2 + Inm σ^2 (1.130) where xn and xm denote data points sampled from a Gaussian distribution with mean μ and variance σ^2, and Inm = 1 if n = m and Inm = 0 otherwise. …
1.11 () By setting the derivatives of the log likelihood function (1.54) with respect to μ and σ^2 equal to zero, verify the results (1.55) and (1.56).
1.18 ( ) www We can use the result (1.126) to derive an expression for the surface area SD, and the volume VD, of a sphere of unit radius in D dimensions. To do this, consider the following result, …
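For reference, the standard closed forms SD = 2π^{D/2}/Γ(D/2) and VD = SD/D (stated here from general knowledge, not quoted from the book) can be checked against the familiar D = 2, 3 values; assuming SciPy:

import numpy as np
from scipy.special import gamma

def S(D):
    return 2 * np.pi ** (D / 2) / gamma(D / 2)   # surface area of the unit sphere

def V(D):
    return S(D) / D                              # volume of the unit ball

assert np.isclose(S(2), 2 * np.pi)       # unit-circle circumference
assert np.isclose(S(3), 4 * np.pi)       # unit-sphere surface area
assert np.isclose(V(3), 4 * np.pi / 3)   # unit-ball volume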