Question

3. Suppose we observe $N$ i.i.d. data points $D = \{x_1, x_2, \ldots, x_N\}$, where each $x_n \in \{1, 2, \ldots, K\}$ is a random variable with a categorical (discrete) distribution parameterized by $\theta = (\theta_1, \theta_2, \ldots, \theta_K)$, i.e.,

$$x_n \sim \mathrm{Cat}(\theta_1, \theta_2, \ldots, \theta_K), \quad n = 1, 2, \ldots, N \tag{8}$$

In detail, this distribution means that for a specific $n$, the random variable $x_n$ follows $P(x_n = k) = \theta_k$, $k = 1, 2, \ldots, K$. Equivalently, we can also write the density function of a categorical distribution as

$$p(x_n) = \prod_{k=1}^{K} \theta_k^{\mathbb{I}[x_n = k]} \tag{9}$$

where $\mathbb{I}[x_n = k]$ is called the indicator function, and is defined as

$$\mathbb{I}[x_n = k] = \begin{cases} 1, & \text{if } x_n = k \\ 0, & \text{otherwise} \end{cases} \tag{10}$$

a. Now we want to prove that the joint distribution of multiple i.i.d. categorical variables is a multinomial distribution. Show that the density function of $D = \{x_1, x_2, \ldots, x_N\}$ is

$$p(D \mid \theta) = \prod_{k=1}^{K} \theta_k^{N_k} \tag{11}$$

where $N_k = \sum_{n=1}^{N} \mathbb{I}[x_n = k]$ is the number of random variables belonging to category $k$. In other words, $D = \{x_1, x_2, \ldots, x_N\}$ follows a multinomial distribution.

b. We often call $p(D \mid \theta)$ the likelihood function, since it indicates how probable it is to observe this dataset given the model parameters $\theta$. By Bayes' rule, we can rewrite the posterior as

$$p(\theta \mid D) = \frac{p(D \mid \theta)\, p(\theta)}{p(D)} \tag{12}$$

where $p(\theta)$ is the prior distribution, which encodes our prior knowledge about the model parameters, and $p(D)$ is the distribution of the observations (data), which is constant with respect to $\theta$. Thus we can write

$$p(\theta \mid D) \propto p(D \mid \theta)\, p(\theta) \tag{13}$$

If we assume a Dirichlet prior on $\theta$, i.e.,

$$p(\theta; \alpha_1, \alpha_2, \ldots, \alpha_K) = \mathrm{Dir}(\theta; \alpha_1, \alpha_2, \ldots, \alpha_K) = \frac{1}{B(\alpha)} \prod_{k=1}^{K} \theta_k^{\alpha_k - 1} \tag{14}$$

where $B(\alpha)$ is the (multivariate) Beta function and $\alpha = (\alpha_1, \alpha_2, \ldots, \alpha_K)$. Now try to derive the joint distribution $p(D, \theta)$ and ignore the terms that are constant w.r.t. $\theta$. Show that the posterior is actually also Dirichlet and parameterized as follows:

$$p(\theta \mid D) = \mathrm{Dir}(\theta; \alpha_1 + N_1, \alpha_2 + N_2, \ldots, \alpha_K + N_K) \tag{15}$$

[In fact, this nice property is called conjugacy in machine learning. A general statement is: if the prior distribution is conjugate to the likelihood, then the posterior will belong to the same family of distributions as the prior. Search for "conjugate prior" and "exponential family" for more detail if you are interested.]
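As a numerical sanity check on part b (not a proof), the sketch below uses the fact that multiplying the likelihood (11) by the Dirichlet prior (14) gives a product of terms $\theta_k^{N_k + \alpha_k - 1}$, so the log of likelihood times prior should differ from the log density of $\mathrm{Dir}(\theta; \alpha + N)$ only by a constant that does not depend on $\theta$. The toy data, the prior parameters, and all function names are assumptions made up for illustration; only the Python standard library is used.

```python
import math
from collections import Counter


def category_counts(data, K):
    """Return [N_1, ..., N_K], where N_k counts the observations equal to k."""
    counts = Counter(data)
    return [counts.get(k, 0) for k in range(1, K + 1)]


def log_likelihood(theta, counts):
    """log p(D | theta) = sum_k N_k * log(theta_k), i.e. the log of equation (11)."""
    return sum(n * math.log(t) for n, t in zip(counts, theta) if n > 0)


def log_dirichlet(theta, alpha):
    """log Dir(theta; alpha); log B(alpha) is built from log-gamma values."""
    log_beta = sum(math.lgamma(a) for a in alpha) - math.lgamma(sum(alpha))
    return sum((a - 1) * math.log(t) for a, t in zip(alpha, theta)) - log_beta


def posterior_alpha(alpha, counts):
    """Posterior Dirichlet parameters alpha_k + N_k, as claimed in equation (15)."""
    return [a + n for a, n in zip(alpha, counts)]


# Hypothetical toy example: K = 3 categories, N = 6 observations.
data = [1, 3, 2, 3, 3, 1]
K = 3
alpha = [2.0, 1.0, 0.5]  # assumed prior parameters

counts = category_counts(data, K)
post = posterior_alpha(alpha, counts)


def log_gap(theta):
    """log[p(D | theta) p(theta)] - log Dir(theta; alpha + N)."""
    return (log_likelihood(theta, counts)
            + log_dirichlet(theta, alpha)
            - log_dirichlet(theta, post))


# Evaluate the gap at several points on the probability simplex.
test_points = [[0.2, 0.3, 0.5], [0.6, 0.3, 0.1], [1 / 3, 1 / 3, 1 / 3]]
gaps = [log_gap(theta) for theta in test_points]

print("counts N_k:", counts)
print("posterior parameters:", post)
print("log gaps (should agree up to rounding):", gaps)
```

If the gap is the same at every $\theta$, the unnormalized posterior is proportional to $\mathrm{Dir}(\theta; \alpha + N)$, which is consistent with equation (15). The common value is $\log B(\alpha + N) - \log B(\alpha)$, the log of the normalizing constant $p(D)$ for this ordered sequence.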

