
Question

"calibrated " our predicted probabilitim are. Roughly speaking , we say f(m) is well calibrated if we look at all examples (3;,y) for which f(z) 5:: 0.7 and we find that close to 70% of those examples have y = 1, as predicted and then we repeat that for all predicted probabilities in (0,1). To see how well-calibrated our predicted probabilities are, break the predictions on the validation set into groups based on the predicted probability (you can play with the size of the groups to get a result you think is informative). For each group, examine the percentage of positive labels. You can make a table or graph. Summarize the results. You may get some ideas and references from scikit -1earn's discussion 2 Bayesian Logistic Regression with Gaussian Priors Let's continue with logistic regression in the Bayesian setting, where we introduce a priorp(w) on w E Rd. 1. For the dataset D described in Section 1, give an expression for the posterior density p(w | D) in terms of the negative log-likelihood function NLL.D(w) and the prior density p(w) (up to a proportionality constant is ne ). 2. Suppose we take a prior on w of the form 111 ~ N (0,2). Is this a conjugate prior to the likelihood given by logistic regression '3 3. Find a covariance matrixZ such that MAP estimate form after observing data D is the same as the minimizer of the regularized logistic regression function defined in Section 1.3 (and prove it). [Hint: Consider minimizing the negative log posterior ofm. Also, remember you can drop any terms from the objective function that don't depend on to. You may freely use results of previous problems ] 4. In the Bayesian approach , the prior should reect your beliefs about the parameters before seeing the data and , in particular , should be independent on the eventual size of your dataset . Following this, you choose a prior distributionw ~ N(D,I). For a dataset I) of size n, how should you choose A in our regularized logistic regression objective function so that the minimizer is equal to the mode of the posterior distribution of at (Le. is equal to the MAP estimator ) . 3 Coin Flipping with Partial Observability Consider ipping a biased coin where p(z = H |61) = 0]. However, we cannot directly observe the result 2. Instead , someone reports the result to us, which we denotey byr. Further, there is a chance that the result is reported incorrectly if it's a head. Specically, we have p(m : H | z 2 H, 92) : 62 and p(m=Tiz=T)=1. 1. Show that p(:z: : H i61,62) : 61:92. 2. Given a set of reported results Dr of size NT, where the number of heads isnh and the number of tails is n2. Can we estimate 61 and 02 using MLE? Explain your judgment

Step by Step Solution

There are 3 Steps involved in it


