
Question

Problem 1. Consider the following mock data on a population of 1,000 registered voters: polling population.xlsx. The file contains information on each voter's gender and vote (Candidate 1 or Candidate 2). The population will be polled (i.e., sampled).

a) First, consider a single call. Let X be a random variable that indicates the gender:

X = 1, if Female,
X = 0, if Male.

Let Y be a random variable that indicates whether or not the voter votes for Candidate 1:

Y = 1, if vote for Candidate 1,
Y = 0, if vote for Candidate 2.

First, use the Excel function COUNTIFS() to determine the joint probability distribution.

That is, determine the probabilities for each of the 4 possible realizations:

P((X, Y) = (1, 1)) = ?

P((X, Y) = (1, 0)) = ?

P((X, Y) = (0, 1)) = ?

P((X, Y) = (0, 0)) = ?
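
As a rough cross-check outside Excel, the counting behind COUNTIFS() can be sketched in Python: count the rows that match both conditions and divide by the total number of rows. The records below are made-up toy data, not the contents of polling population.xlsx, and the names `voters` and `joint_distribution` are hypothetical.

```python
from collections import Counter

# Made-up toy records (gender, candidate voted for).
# These are NOT the rows of polling population.xlsx.
voters = [
    ("Female", 1), ("Female", 2), ("Male", 1), ("Male", 2),
    ("Female", 1), ("Male", 2), ("Female", 1), ("Male", 1),
]

def joint_distribution(records):
    """Return {(x, y): probability}, with x = 1 for Female and y = 1 for Candidate 1."""
    # Encode each record as (X, Y) and count how often each pair occurs,
    # which mirrors a two-condition COUNTIFS() per cell of the joint table.
    counts = Counter(
        (1 if gender == "Female" else 0, 1 if vote == 1 else 0)
        for gender, vote in records
    )
    total = len(records)
    return {(x, y): counts[(x, y)] / total for x in (0, 1) for y in (0, 1)}

joint = joint_distribution(voters)
for (x, y), p in sorted(joint.items(), reverse=True):
    print(f"P((X, Y) = ({x}, {y})) = {p:.3f}")
```

The four probabilities always sum to 1, which is a quick sanity check on the Excel counts as well.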

b) Determine P(X = 1) and then use this probability and the Excel function RAND() to simulate a single realization of X. That is, simulate the voter gender for a single call. Report this single realization.
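
One common way to turn a uniform random number into a 0/1 realization is to compare it with the target probability; in Excel this could look like =IF(RAND() < p, 1, 0), where p stands for the cell holding P(X = 1). A minimal Python sketch of the same idea follows. The value of `p_female` is a placeholder assumption, not the answer from the spreadsheet, and `simulate_gender` is a hypothetical helper name.

```python
import random

def simulate_gender(p_female, rng=random):
    """Return 1 (Female) with probability p_female, otherwise 0 (Male)."""
    # rng.random() is uniform on [0, 1), the same range as Excel's RAND().
    return 1 if rng.random() < p_female else 0

p_female = 0.5  # placeholder; replace with P(X = 1) computed from the data
x = simulate_gender(p_female)
print("Simulated X for a single call:", x)
```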

c) Now, simulate 100 calls, using 100 rows in Excel with 100 corresponding RAND() realizations. (Sampling is with replacement.) We may use the index i for the separate calls: X1, X2, ..., X100. Note that the sum of these variables, X1 + X2 + ... + X100, equals the total number of women called, because males are encoded as 0 and don't contribute to the sum. Then, the fraction of the 100 calls who are women is:

(X1 + X2 + ... + X100) / 100.

For your simulation, report this fraction of those called who are women.
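
The 100-row simulation can be mirrored in a few lines of Python, shown here only as a sketch. It assumes independent calls (sampling with replacement) and uses a placeholder for P(X = 1); `simulate_calls` and the fixed seed are illustrative choices, not part of the problem.

```python
import random

def simulate_calls(p, n, seed=None):
    """Simulate n independent 0/1 calls, each equal to 1 with probability p."""
    rng = random.Random(seed)
    return [1 if rng.random() < p else 0 for _ in range(n)]

p_female = 0.5  # placeholder; replace with P(X = 1) computed from the data
calls = simulate_calls(p_female, 100, seed=1)

# Males are encoded as 0, so the sum counts only the women called.
women_called = sum(calls)
fraction_women = women_called / len(calls)
print("Women called:", women_called)
print("Fraction of calls who are women:", fraction_women)
```

Dropping the seed argument gives a fresh sample on each run, which plays the role of recalculating the sheet in Excel.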

d) Similarly, simulate Y1, Y2, ..., Y100, corresponding to 100 calls (where again Y indicates whether there is a vote for Candidate 1). Report the single quantity Y1 + Y2 + ... + Y100.

Also, in words, what is the interpretation of this quantity?

e) It is important to note that we first used the population data to extract the probabilities of interest, like the probability P(Y = 1). In practice, we may want to know this probability, since knowing it tells us whether Candidate 1 will win the election. However, in practice we do not have access to data on the entire population. Instead, we estimate the population statistics through sampling. Towards this goal, we will first learn about the probabilistic properties of this sampling process. In this way, we can understand the probability distribution of the sample mean:

(Y1 + Y2 + ... + Yn) / n,

where n is the number of calls.

In words, how is this fraction (the sample mean) useful for estimating P (Y = 1)?

Let's (continue to) pretend that we are all-knowing, like an "oracle," and somehow know P(Y = 1). Use P(Y = 1) calculated from the population data as above to now simulate n calls and report the sample mean:

(Y1 + Y2 + ... + Yn) / n,

for n = 10, 100, and 1000. The result is three different values that are all somewhat close to P (Y = 1). Hit F9 in Excel several times. What do you notice? Of the 3 results, for n = 10, 100, and 1000, which has the least variability and which has the most variability? Explain why this makes sense.
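
Pressing F9 many times can be imitated by repeating the simulation and measuring how much the sample mean moves around. The sketch below does this in Python under the assumption of independent calls; the value of `p_vote` is a placeholder rather than the P(Y = 1) from the spreadsheet, and the function names and the choice of 500 repeats are illustrative. For independent 0/1 draws the standard deviation of the sample mean is sqrt(p(1 - p)/n), which is printed alongside for comparison.

```python
import random
import statistics

def sample_mean(p, n, rng):
    """Sample mean of n independent 0/1 draws, each equal to 1 with probability p."""
    return sum(1 if rng.random() < p else 0 for _ in range(n)) / n

def spread_of_sample_mean(p, n, repeats, seed=0):
    """Standard deviation of the sample mean across repeated simulations."""
    rng = random.Random(seed)
    means = [sample_mean(p, n, rng) for _ in range(repeats)]
    return statistics.pstdev(means)

p_vote = 0.5  # placeholder; replace with P(Y = 1) computed from the data
spreads = {n: spread_of_sample_mean(p_vote, n, repeats=500) for n in (10, 100, 1000)}

for n, observed in spreads.items():
    # Theoretical standard deviation of the mean of n independent 0/1 draws.
    theory = (p_vote * (1 - p_vote) / n) ** 0.5
    print(f"n = {n:4d}: observed spread = {observed:.4f}, sqrt(p(1-p)/n) = {theory:.4f}")
```

The observed spread should shrink roughly by a factor of sqrt(10) each time n grows tenfold, so n = 10 shows the most variability and n = 1000 the least.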
