
Question


For each statement below, say whether it's true or false; if true without further assumptions, briefly explain why it's true (and, for extra credit, what its implications are for statistical inference); if it's sometimes true, give the extra conditions necessary to make it true; if it's false, briefly explain how to change it so that it's true and/or give an example of why it's false. If the statement consists of two or more sub-statements and two or more of them are false, you need to explicitly address all of the false sub-statements in your answer.

(A) You're about to spin a roulette wheel, which will result in a metal ball landing in one of 38 slots numbered Ω = {0, 00, 1, 2, ..., 36}; 18 of the numbers from 1 to 36 are colored red, 18 are black, and 0 and 00 are green. You regard this wheel-spinning as fair, by which You mean that all 38 elemental outcomes in Ω are equipossible. Under Your assumption of fairness, the classical (Pascal-Fermat) probability of getting a red number on the next spin exists, is unique, and equals 18/38.

(B) Under the same conditions as (A), the Kolmogorov (frequentist) probability of getting a red number on the next spin exists, is unique, and equals 18/38.

(C) Repeat (A) and (B) but removing the assumption that the wheel-spinning is fair, and not replacing it with any other assumption about the nature of the data-generating process (taking the outcomes of the wheel spins as data).

(D) In the Bernoulli sampling model, in which (Y_1, ..., Y_n | θ B) are IID Bernoulli(θ), the sum s_n = ∑_{i=1}^n y_i of the observed data values y = (y_1, ..., y_n) is sufficient for inference about θ, and this means that in this model You can throw away the data vector y and focus only on s_n without any loss of information whatsoever.

(E) In learning how to do a good job on the task of uncertainty quantification, it's good to know quite a bit about both the Bayesian and frequentist paradigms, because (a) the Bayesian approach to probability ensures logical internal consistency of Your uncertainty assessments but does not guarantee good calibration, and (b) the frequentist approach to probability provides a natural framework in which to see if Your Bayesian answer is well-calibrated.
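The sufficiency claim in (D) can be checked numerically: the Bernoulli likelihood θ^{s_n} (1 − θ)^{n − s_n} depends on y only through s_n, so two data vectors with the same sum produce identical likelihood functions. A minimal sketch (the specific data vectors are illustrative, not from the question):

```python
# Sketch: the Bernoulli likelihood depends on y only through s_n = sum(y).
def bernoulli_likelihood(y, theta):
    """L(theta | y) = prod_i theta^y_i * (1 - theta)^(1 - y_i)."""
    s_n, n = sum(y), len(y)
    return theta ** s_n * (1.0 - theta) ** (n - s_n)

# Two different data vectors with the same sum s_n = 3 and length n = 6.
y1 = [1, 1, 1, 0, 0, 0]
y2 = [0, 1, 0, 1, 0, 1]

# Their likelihood functions agree at every theta: s_n is sufficient.
for theta in [0.1, 0.25, 0.5, 0.75, 0.9]:
    assert bernoulli_likelihood(y1, theta) == bernoulli_likelihood(y2, theta)
print("identical likelihoods for all tested theta")
```

Note that sufficiency is a statement about the likelihood function only; whether "no loss of information whatsoever" extends to tasks beyond inference about θ (e.g. checking the IID assumption itself) is part of what the exam statement asks You to assess.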

(F) The Beta(θ | α, β) parametric family of distributions is useful as a source of prior distributions when the sampling model is as in (D), because all distributional shapes (symmetric, skewed, multimodal, ...) on (0, 1) are realizable in this family.

(G) Specifying the ingredients {p(θ | B), p(D | θ B), (A | B), U(a, θ | B)} in Your model for Your uncertainty about an unknown θ (in light of background information B and data D) is typically easy, because in any given problem there will typically be one and only one way to specify each of these ingredients; an example is the Bernoulli sampling distribution p(D | θ B) arising uniquely, under exchangeability, from de Finetti's Theorem for binary outcomes.

(H) In trying to construct a good uncertainty assessment of the form P(A | B), where A is a proposition and B is a proposition of the form (B_1 and B_2 and ... and B_k), You should try hard not to condition on any propositions B_i that are false, because that would be the probabilistic equivalent of dividing by zero.

(I) The kind of objectivity in probability assessment sought by people like Venn, in which all reasonable people would agree on the assessed value, is often impossible to achieve, because all such assessments are conditional on the (1) assumptions, (2) judgments and (3) background information of the person making the probability assessment, and different reasonable people can differ along any of those three dimensions.

(J) When making a decision in the face of uncertainty about an unknown θ, after specifying Your action space (A | B) and utility function U(a, θ | B) and agreeing on the convention that large utility values are to be preferred over small ones, the optimal decision is found by maximizing U(a, θ | B) over all a ∈ (A | B).

(K) One reason that Bayesian inference was not widely used in the early part of the 20th century was that approximating the (potentially high-dimensional) integrals arising from this approach was difficult in an era when computing was slow and the Laplace-approximation technique had been forgotten.

(L) Jaynes (2003, pp. 21-22) makes a useful distinction between {reality} (epistemology) and {Your current information about reality} (ontology); this distinction is useful in probabilistic modeling because {the world} does not necessarily change every time {Your state of knowledge about the world} changes.
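The shape claim in the Beta-family statement above can be probed numerically. The sketch below evaluates the density Beta(θ | α, β) ∝ θ^{α−1} (1 − θ)^{β−1} using only the standard library and counts interior local maxima on a grid; the parameter choices are illustrative:

```python
from math import gamma

def beta_pdf(theta, a, b):
    """Density of Beta(a, b) at theta in (0, 1)."""
    const = gamma(a + b) / (gamma(a) * gamma(b))
    return const * theta ** (a - 1) * (1.0 - theta) ** (b - 1)

grid = [i / 100 for i in range(1, 100)]
shapes = {
    "symmetric": (5.0, 5.0),  # single peak at 1/2
    "skewed":    (2.0, 8.0),  # mass concentrated near 0
    "U-shaped":  (0.5, 0.5),  # density rises only at the endpoints
}
for name, (a, b) in shapes.items():
    dens = [beta_pdf(t, a, b) for t in grid]
    # Count interior local maxima on the grid.
    n_modes = sum(1 for i in range(1, len(dens) - 1)
                  if dens[i] > dens[i - 1] and dens[i] > dens[i + 1])
    print(f"{name:10s} alpha={a}, beta={b}: interior modes = {n_modes}")
```

Symmetric, skewed, and U-shaped forms all appear, but every Beta density has at most one interior mode, which bears directly on the word "multimodal" in the statement.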

Expert Answer


A) TRUE

Under Your fairness assumption the 38 elemental outcomes are equipossible, so the classical (Pascal-Fermat) probability of red is the ratio of favourable to total outcomes: 18/38. It exists and is unique because that ratio is fully determined once equipossibility is assumed.

B) TRUE, given the additional assumption that the spins are IID repetitions of the same experiment.

Under fairness plus IID spins, Kolmogorov's law of large numbers implies that the relative frequency of red converges to 18/38 as the number of spins grows, so the frequentist probability of getting a red number exists, is unique, and equals 18/38.
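The frequentist reading in (B) can be illustrated by simulation: under fairness and IID spins, the relative frequency of red over many spins settles near 18/38 ≈ 0.4737. A small sketch (the spin count and seed are arbitrary; the red-number set is the standard American layout, though under fairness only the count of 18 matters):

```python
import random

random.seed(42)  # for reproducibility

# The 38 slots: 0, 00, and 1-36; 18 of 1-36 are red.
slots = ["0", "00"] + [str(i) for i in range(1, 37)]
reds = {"1", "3", "5", "7", "9", "12", "14", "16", "18",
        "19", "21", "23", "25", "27", "30", "32", "34", "36"}

n_spins = 100_000
hits = sum(1 for _ in range(n_spins) if random.choice(slots) in reds)
freq = hits / n_spins
print(f"relative frequency of red: {freq:.4f} (theory: {18/38:.4f})")
```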

C) FALSE

Without the fairness assumption there is no equipossibility, so the classical probability is no longer determined; and without any replacement assumption about the data-generating process (e.g. IID spins), the long-run relative frequency is not guaranteed to converge, so the frequentist probability need not exist either. The chance of red now depends on the unknown bias of the wheel and cannot be asserted to equal 18/38.
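To see concretely why dropping fairness breaks the 18/38 value, one can simulate a wheel whose slots are not equipossible: the long-run frequency of red then tracks the slot weights, not the classical count. The bias weights below are purely hypothetical:

```python
import random

random.seed(0)

slots = ["0", "00"] + [str(i) for i in range(1, 37)]
reds = {"1", "3", "5", "7", "9", "12", "14", "16", "18",
        "19", "21", "23", "25", "27", "30", "32", "34", "36"}

# Hypothetical bias: each red slot is half as likely as any other slot.
weights = [0.5 if s in reds else 1.0 for s in slots]
p_red_true = sum(w for s, w in zip(slots, weights) if s in reds) / sum(weights)

n_spins = 100_000
draws = random.choices(slots, weights=weights, k=n_spins)
freq = sum(1 for s in draws if s in reds) / n_spins
print(f"simulated: {freq:.4f}  true: {p_red_true:.4f}  classical 18/38: {18/38:.4f}")
```

Under this bias the true chance of red is 9/29 ≈ 0.3103, well away from 18/38, so no unique value can be asserted without an assumption about the wheel.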

Question 1: Bayesian Inference (40 credits)

Let X be a random variable representing the outcome of a biased coin, with possible outcomes X = {0, 1}, x ∈ X. The bias of the coin is itself controlled by a random variable Θ, with outcomes θ ∈ Θ, where Θ = {θ ∈ ℝ : 0 ≤ θ ≤ 1}. The two random variables are related by the following conditional probability distribution of X given Θ:

p(X = 1 | Θ = θ) = θ
p(X = 0 | Θ = θ) = 1 − θ

We write p(X = 1 | θ) as shorthand for p(X = 1 | Θ = θ). For example, θ = 1 represents a coin with 1 on both sides, θ = 0 a coin with 0 on both sides, and θ = 1/2 a fair, unbiased coin. We wish to learn what θ is, based on experiments of flipping the coin. Before we flip the coin, we choose as our prior distribution

p(θ) = 6θ(1 − θ),

which, when plotted on (0, 1), is symmetric with its peak at (0.5, 1.5).

a) (3 credits) Verify that p(θ) = 6θ(1 − θ) is a valid probability distribution on [0, 1], i.e. that it is always non-negative and that it is normalised. (You may assert the value of a standard integral without working, e.g. by computing it with Wolfram Alpha's Integrate command.)

We flip the coin a number of times. After each coin flip, we update the probability distribution for θ to reflect our new belief about θ, based on the evidence. Suppose we flip the coin twice and obtain the sequence of coin flips x_{1:2} = 00. For each subsequence x_1, x_{1:2} (and for the case before any coins are flipped), compute the:

b) (15 credits) posterior probability distribution functions;
c) (3 credits) expectation values;
d) (3 credits) variances;
e) (5 credits) maximum a posteriori (MAP) estimates.

Present your results in a table as shown below.

                        Posterior PDF   MAP
p(θ)                    6θ(1 − θ)       ?
p(θ | x_1 = 0)          ?               ?
p(θ | x_{1:2} = 00)     ?               ?

f) (5 credits) Plot each of the probability distributions p(θ), p(θ | x_1 = 0), p(θ | x_{1:2} = 00) over the interval 0 ≤ θ ≤ 1.
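Since the prior p(θ) = 6θ(1 − θ) is a Beta(2, 2) density, the updates asked for above follow the conjugate Beta-Bernoulli rule: each observed 0 increments β by 1. A sketch of the resulting posterior summaries, assuming that conjugate updating (which the question's likelihood supports):

```python
def beta_summary(a, b):
    """Mean, variance, and MAP of a Beta(a, b) distribution."""
    mean = a / (a + b)
    var = a * b / ((a + b) ** 2 * (a + b + 1))
    map_est = (a - 1) / (a + b - 2)  # valid for a, b > 1
    return mean, var, map_est

# Part a): numeric check that the prior 6*theta*(1-theta) integrates to 1
# (midpoint rule on [0, 1]).
n = 10_000
norm = sum(6 * ((i + 0.5) / n) * (1 - (i + 0.5) / n) for i in range(n)) / n
print(f"numeric integral of the prior: {norm:.6f}")

# Parts b)-e): prior Beta(2, 2); each observed 0 increments beta by 1.
rows = [("prior p(theta)        ", (2, 2)),
        ("posterior, x_1 = 0    ", (2, 3)),
        ("posterior, x_{1:2} = 00", (2, 4))]
for label, (a, b) in rows:
    mean, var, map_est = beta_summary(a, b)
    print(f"{label} Beta({a},{b}): mean={mean:.4f} var={var:.4f} MAP={map_est:.4f}")
```

Under this conjugate sketch the MAP estimates are 1/2, 1/3, and 1/4 respectively, drifting toward 0 as 0s are observed, as one would expect.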
