Answered step by step
Verified Expert Solution
Link Copied!

Question

1 Approved Answer

The birthday problem is a famous problem with an interesting result. It is stated as follows: in a set of n people with randomly distributed

The birthday problem is a famous problem with an interesting result. It is stated as follows: "in a set of n people with randomly distributed birthdays, what is the probability that some pair of them will have the same birthday?" Consider a group of n people in a room. We denote the birthday of the ith person as bi{1,,365}bi{1,,365}. We want to measure the probability that in this room of n people at least 2 people have the same birthday. We denote this by the quantity:

image text in transcribed

Part 1

Write a code snippet that computes the probability of repeated birthdays in a room of nn people. Your code must:

  • Define the function duplicate_birthdays as follows:
    def duplicate_birthdays(n): # Generate 1000 simulations of room of size n # Determine if the room has duplicate birthdays # Returns the number of rooms with duplicate birthdays return 0 
  • Note: Your code will have access to a function genroom that you should use to generate a room with a random sample of nn birthdays. This function is defined before your code. It is shown here in case you wish to program offline, but you must not write the function again inside the code box below.
    def genroom(n): return np.random.randint(1,366, n) 
  • For a particular nn, think about what approximate value we should compute for PnPn if kk simulations out of 1000 simulations had a duplicate birthday. Compute this approximate probability forn{2,,100}n{2,,100} and store the values of PnPn in an array named prob_n.

Your code snippet must look like:

for n in range(2, 101): # call function duplicate_birthdays(n) # update the array prob_n 

Important things to consider:

  • We understand there are other ways to perform 1000 simulations of nn birthdays. However, the auto-grader will assume that you first loop over the nn samples, and for each sample you generate 1000 simulations (as indicated in the pseudo-code above). You must not invert this order, otherwise the auto-grader will mark your answer as incorrect.
  • The random number generator is seeded (in the initial code given in the box below). Do not change how the code is seeded or it will override this. Do not use np.vectorize or it will result in a different random number distribution.
  • There are many ways to check for duplicates, but some are more efficient than others. Comparing every person to every other person would be O(n2)O(n2) work, and may not be possible to calculate for large nn. Can you think of a more efficient way of checking for duplicates?

Part 2

Use the computed probabilities prob_n to determine the minimum number of people needed to make the probability greater than 50%. Store this in the variable perc_50.

Part 3

Plot the computed probabilities for n{2,,100}n{2,,100}. Make sure to label your plot appropriately.

Output

Your code snippet should define the following variables:

Name Type Description
duplicate_birthdays function generates 1000 simulations of rooms, returns the number of rooms with duplicate birthdays
prob_n numpy array array giving the computed probability for n{2,,100}n{2,,100}
perc_50 integer the minimum nn for which Pn>0.5Pn>0.5 based on the variable prob_n

In addition, your code must provide a plot of probability PnPn vs nn. Make sure to include appropriate axes labels and a title.

// GIVEN CODE

import numpy as np import matplotlib import matplotlib.pyplot as plt

def duplicate_birthdays(n): # Generate 1000 simulations of rooms (with the birthdays of n people in each room) # Compute the number of rooms with duplicate birthdays. # Returns the number of rooms with duplicate birthdays return 0

# Part 1 for n in range(2, 101): # call function duplicate_birthdays(n) # update the array prob_n

# Part 2 # Estimate perc_50

# Part 3 # Plot prob_n

please help!!

the same birthday. We denote this by the quantity: The subscript n denotes that this value is dependent on n. For example, P2 1/365 while P366 1. Think about why this should be the case.) We will try to infer the behavior of Pn as a function of n numerically, by generating many uniformly distributed samples of birthdays and checking whether any pair of birthdays is the same. For the purposes of this problem, assume there are 365 days in a year (ignore leap years) the same birthday. We denote this by the quantity: The subscript n denotes that this value is dependent on n. For example, P2 1/365 while P366 1. Think about why this should be the case.) We will try to infer the behavior of Pn as a function of n numerically, by generating many uniformly distributed samples of birthdays and checking whether any pair of birthdays is the same. For the purposes of this problem, assume there are 365 days in a year (ignore leap years)

Step by Step Solution

There are 3 Steps involved in it

Step: 1

blur-text-image

Get Instant Access to Expert-Tailored Solutions

See step-by-step solutions with expert insights and AI powered tools for academic success

Step: 2

blur-text-image_2

Step: 3

blur-text-image_3

Ace Your Homework with AI

Get the answers you need in no time with our AI-driven, step-by-step assistance

Get Started

Recommended Textbook for

Database Development For Dummies

Authors: Allen G. Taylor

1st Edition

978-0764507526

More Books

Students also viewed these Databases questions