Answered step by step
Verified Expert Solution
Link Copied!

Question

1 Approved Answer

Consider the problem of classifying D dimensional inputs x R D . Suppose we have 2 classes and the output is denoted y { 0

Consider the problem of classifyingD dimensional inputs xRD. Suppose we have 2 classes and the output is denoted y{0,1}. If we use a binary logistic regression classifier then the model is:

pb(yx)=Binomial(y(wTx+b))

where(a)=1+ea1 and wRD,bR are the parameters of the model.

Multiclass logistic regression can also be used when there are only two classes. In that case, the model is:

pm(yx)=Categoricial(yS(Ax+c))

whereS(Ax+c)=iexp(Axi+c)exp(Ax+c) is the softmax function andAR2d,cR2 are the parameters of the model.

  1. Prove that this multiclass logistic regression model is equivalent to the binary logistic regression model. In particular show how, given the parametersw,b of any binary logistic regression model, you could construct parameters A,cof a multiclass logistic regression model which would always give the same predictions.
  2. Also, show how, given A,c, you can compute parametersw,b which would always give the same predictions.
  3. Finally, are these transformations unique? Given the valuesA,c (or w,b), is the value ofw,b (or A,c) unique? If a direction is unique, give an argument why. If it's not unique, give at least one example of a different transformation which would be equivalent.

Step by Step Solution

There are 3 Steps involved in it

Step: 1

blur-text-image

Get Instant Access to Expert-Tailored Solutions

See step-by-step solutions with expert insights and AI powered tools for academic success

Step: 2

blur-text-image

Step: 3

blur-text-image

Ace Your Homework with AI

Get the answers you need in no time with our AI-driven, step-by-step assistance

Get Started

Recommended Textbook for

Discrete Structures, Logic, And Computability

Authors: James L Hein

3rd Edition

1449615279, 9781449615277

More Books

Students also viewed these Mathematics questions

Question

MHR is separately computed for each machine.

Answered: 1 week ago