Question

Posted on Sep 24, 2024

Project Description

The goals of this project are to:

- implement two types of probabilistic algorithms
- observe how expected results compare to observed results

Develop and test a program that implements the bad microchip problem and a simulation based upon probability distributions. For the first part, the idea is that some batches of chips might not be tested, and the goal is to detect bad batches without testing all the chips in the batch. We will be simulating the process of sampling chips from a collection of batches of chips.

Part 1a: Generate data sets.

Automate creation of a user-specified number of datasets with a user-specified number of batches, batch size, percentage of the datasets containing bad chips, and percentage of bad chips in a dataset. When the program runs, it will read four (4) configuration files titled c1.txt, c2.txt, c3.txt, and c4.txt, containing specs for each run. These configuration files should have their values written as integers, one per row.

Generate a dataset from the input specification. The dataset will contain an individual file for each batch of items. Save each file in the dataset as ds1.txt, ds2.txt, ... , dsn.txt.

To create an individual file, decide if it has bad items or not. Run a loop for the number of items in the batch. If it is a good batch, just write 'g' to the file (one per line) for the total number of items in the batch. If it is a bad batch, use a random number generator for the input-specified percentage of bad chips.

Example: assume the spec is that 10% of chips are bad. Generate random numbers on [0..99]. If 0..9 comes up, add a bad chip (write the char 'b' to the file); otherwise, add a good chip to the data set (write a 'g' to the file).

Part 1b: Create the Monte Carlo process to determine which of the chip batches are bad.

It should know how many data sets there are, read them one at a time, sample the appropriate number of items, and report good batch or bad batch.
Your output should look like the output below. Create a summary report detailing the

Example output:

    Running:
    Number of batches of items: 100
    Number of items in each batch 2000
    Percentage of batches containing bad items 24%
    Percentage of items that are bad in a bad set 7%
    Items sampled from each set 30

    Generating data sets:
    Create bad set batch # 4, totBad = 133 total = 2000 badpct = 7
    Create bad set batch # 8, totBad = 145 total = 2000 badpct = 7
    Create bad set batch # 12, totBad = 122 total = 2000 badpct = 7

    Analyzing Data Sets:
    batch #0 is bad
    batch #4 is bad
    batch #12 is bad
    batch #16 is bad
    batch #20 is bad
    batch #24 is bad
    batch #28 is bad
    batch #32 is bad
    batch #36 is bad

    Base = 0.930000 exponent = 30
    P(failure to detect bad item) = 0.113367
    P(batch is good) = 0.886633
    Percentage of bad batches detected = 88%

Part 2: A Monte Carlo Simulation.

Implement the process of converting historical data into the probability of occurrence of the various outcomes, extracting probability distributions, assigning random number intervals, and running a simulation. Assume we are dealing with bacteria counts in Escambia Bay. Our data might be like the following for 100 days of observation:

    100
    7
    0-2000/ml: 15
    2000-4000/ml: 25
    4000-8000/ml: 20
    8000-12000/ml: 15
    12000-18000/ml: 10
    18000-24000/ml: 10
    24000-28000/ml: 5

The goal will be to read the data, compute the probability distribution, carry out a simulation, and predict the expected bacterial reading. Data will be in a file entitled readings.txt. The file readings.txt will first have an integer indicating how many days of data are included, followed by the number of categories of data. This second integer will be followed by the textual description of each range of values and the number of days on which that observation occurred. This part of the program should read these values, simulate a user-specified number of days (see below), and compute and display the expected value from the analytical model.
Inputs and Outputs

When your program runs, it will first automatically generate the data sets for part 1, carry out the simulation, and report results. Then, it will read readings.txt for part 2. It will prompt the user for the number of days to simulate and then perform the simulation. It will also calculate and report the expected value of the analytical model. ALL results will go to the console. Note that different data (categories and percents) in readings.txt should work properly.

Note: the code must be written in C.