Question

Posted on Oct 14, 2024

LAB ACTIVITY #2

Show your solution to EVERYTHING asked IN RED in Problems 1 - 4 below. Remember to include the relevant R code and output.

Part 1 (Continuous Distributions in R)

In this part we will practice finding probabilities and percentiles for Uniform, Normal, and Exponential random variables using R.

Every continuous distribution in R has a root name. For example, the root name is unif for the Uniform distribution, norm for the Normal, and exp for the Exponential. The following letter prefixes of the root are used to generate the respective functions for the distributions:

d  for "density", the density function (pdf) (note: the d prefix is not needed for the purposes of this lab!)
p  for "probability", the cumulative distribution function (cdf)
q  for "quantile", the inverse cdf (this allows us to find percentiles)
r  for "random", generates random values that have the specified distribution

For a random variable X that has Uniform distribution on the interval from a to b:

    punif(x, a, b)   # F(x), the cdf at argument x of a Unif(a, b) random variable
    qunif(p, a, b)   # x_α, the 100(1 − α)th percentile of a Unif(a, b) random variable, where p = 1 − α
    runif(m, a, b)   # generates m random values from the Unif(a, b) distribution

Similarly, for a Normal random variable with parameters μ and σ²:

    pnorm(x, mu, sigma)   # F(x), the cdf at argument x of an N(μ, σ²) random variable
    qnorm(p, mu, sigma)   # x_α, the 100(1 − α)th percentile of N(μ, σ²), where p = 1 − α
    rnorm(m, mu, sigma)   # generates m random values from the N(μ, σ²) distribution

Note: In R you do not need to standardize normal random variables! You can work with any normal variable directly as long as you specify its mean μ and standard deviation σ.

For an Exponential random variable with parameter λ:

    pexp(x, lambda)   # F(x), the cdf at argument x of an Exp(λ) random variable
    qexp(p, lambda)   # x_α, the 100(1 − α)th percentile of Exp(λ), where p = 1 − α
    rexp(m, lambda)   # generates m random values from the Exp(λ) distribution

Example

a) Let X have Uniform distribution on [10.3, 15.6]. Find P(X < 12 | X … 3.5).
Also generate 5 values from this distribution.

b) Let X ~ N(30.6, 13.8). Find P(X > 28) and x_0.85. Also generate 10 values from this distribution.

Problem 1

The scores of a reference population on the Wechsler Intelligence Scale for Children (WISC) are normally distributed with mean 100 and variance 225. Find the following:

a) the probability that a randomly selected child has a score of 112 or more
b) the probability that a child scores within 2.3 standard deviations of the mean
c) the probability that a randomly selected child has a score of at least 105, given the score is below 135
d) the score that separates the top 2% of the population from the bottom 98%. Provide the percentile name for this score and the notation.
e) the 30th percentile of the scores (also provide the notation for this value)
f) the median score

Problem 2

Buses arrive at a certain stop at 15-minute intervals starting at 7:00 am. Suppose a passenger comes to the stop at a time uniformly distributed between 7:00 and 7:30 am. Find the following:

a) the probability that he waits for a bus less than 4 minutes
b) the probability that he waits for a bus more than 9 minutes

Problem 3

Suppose messages arrive at a computer server according to a Poisson process with a rate of 15 per hour. Find the following:

a) the probability that there are no messages in the next five minutes
b) the probability that there are no messages in the next four minutes, if there were no messages in the previous 5 minutes
c) the probability that it will take between 1 and 4 minutes for the 11th message to arrive after the arrival of the 10th message
d) the average time until the next message arrives
e) the median of the time until the next message arrives

Part 2 (Normal Approximation to Binomial)

Recall that under certain conditions probabilities for a Binomial random variable can be approximated using the Normal distribution. In this part of the lab activity let us consider how well the Normal approximation to the Binomial works.
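As a rough illustration of the approximation idea, the sketch below compares an exact Binomial probability with its Normal approximation, with and without the 0.5 continuity correction. The distribution Binom(50, 0.3), the interval [10, 20], and all variable names are made up for illustration and are not taken from the lab problems.

```r
# Hypothetical setup (an assumption, not from the lab): Y ~ Binom(50, 0.3),
# target probability P(10 <= Y <= 20)
n  <- 50
p  <- 0.3
lo <- 10
hi <- 20

# Exact probability: P(Y <= 20) - P(Y <= 9)
exact <- pbinom(hi, n, p) - pbinom(lo - 1, n, p)

# Mean and standard deviation of the approximating Normal distribution
mu    <- n * p
sigma <- sqrt(n * p * (1 - p))

# Normal approximation with continuity correction:
# widen the interval by 0.5 on each side
approx_cc <- pnorm(hi + 0.5, mu, sigma) - pnorm(lo - 0.5, mu, sigma)

# Normal approximation without continuity correction
approx_raw <- pnorm(hi, mu, sigma) - pnorm(lo, mu, sigma)

# Show the three values side by side
c(exact = exact, corrected = approx_cc, uncorrected = approx_raw)
```

Note that pnorm takes the standard deviation, not the variance, as its third argument, which is why sigma is computed with sqrt.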
The R functions for Binomial distributions were discussed in Lab activity #1, and the R functions for Normal distributions are discussed in Part 1 of this lab.

Problem 4

Suppose that 44% of all drivers stop at an intersection having flashing red lights when no other cars are visible. Of 360 randomly selected drivers coming to an intersection under these conditions, let X be the number of those who stop. Suppose we are interested in finding P(105 ≤ X ≤ 165).

a) What is the exact distribution of X? Remember to provide the name of the exact distribution and the value(s) of the parameter(s)
b) Use the exact distribution of X to compute the exact probability P(105 ≤ X ≤ 165) in R

Now let's see how well the Normal approximation to the Binomial works.

c) Check the appropriate condition(s) for the approximation.
d) Compute the mean and the standard deviation of the approximate distribution of X.
e) Now calculate the approximate probability P(105 ≤ X ≤ 165) using R (remember to use the continuity correction factor of 0.5!)

Finally, let's compare the approximate probability to the exact one above.

f) Comment on how well/poorly the Normal approximation to the Binomial works here, i.e. compare the result of e) to the exact (true) probability value from part b).

Let's also see if there is any benefit from using the continuity correction in the approximation process.

g) Perform the approximation of P(105 ≤ X ≤ 165) without the continuity correction step, i.e. apply the normal approximation to the "uncorrected" original probability, and compare the result to the exact value from part b). Comment on whether the use of continuity correction improves the Normal approximation to the Binomial.

Part 3 (Percentiles of Discrete Random Variables)

Recall that the 100(1 − α)th percentile of a random variable X is the value x_α such that F(x_α) = 1 − α. However, also recall that for a discrete random variable its cdf F(x) is a step function, and therefore it only takes on certain values.
Therefore, for those 1 − α that are values that the cdf F(x) takes, we cannot uniquely determine the value of the argument x (because the whole "step" corresponds to this value of the cdf). See part 1 of Problem 5 for this. Moreover, there are many values of 1 − α that the cdf doesn't take. See parts 2 and 3 of Problem 5 below. (Note that we do not encounter this problem with continuous random variables, because for any continuous random variable and any 0 < 1 − α < 1 there exists a value x such that F(x) = 1 − α.) To get a better understanding, consider the following problem.

Problem 5 [BONUS]

Let X ~ Binom(18, 0.4). Let's study the cdf values of this random variable. First, we create them using the following code:

    x = seq(0, 18, by = 1)     # create the vector of possible values of the random variable X
    cdf = pbinom(x, 18, 0.4)   # create the cdf values corresponding to the values in vector x and record them in vector cdf (you can call the vector whatever you want, just use the name consistently below)

Let's take a look at the created cdf of X listed with the x values:

    cbind(x, cdf)   # creates a matrix with vectors x and cdf as columns

1. Observe that, for example, 0.032781297 is F(3). But then 0.032781297 = F(x) for any 3 ≤ x < 4 (that is, the cdf value for this entire "step" is 0.032781297). So, which of the values 3 ≤ x < 4 is considered the (100·0.032781297)th percentile of this discrete random variable X? The R command to determine percentiles is

    qbinom(beta, n, p)   # finds the (100β)th percentile of random variable X ~ Binom(n, p), where 0 < β < 1; that is, it finds x such that F(x) = β

Thus, to find the (100·0.032781297)th percentile of X, run the following and observe the result:

    qbinom(0.032781297, 18, 0.4)

Note: For a discrete random variable X, when β = 1 − α is such that for some x from the support we have F(x) = 1 − α, the respective R command reports this x.

2. Suppose we want to find the 90th percentile of X, that is the value x_0.1 such that F(x_0.1) = 0.9.
Look at the cdf values (see cbind(x, cdf) above) and note that none of the values is equal to 0.9. More precisely, here F(9) < 0.9 < F(10).

a) report the cdf values F(9) and F(10)

Let's also illustrate this situation with a plot:

    plot(x, cdf)   # plot the cdf values (y-axis) vs x values (x-axis)

Now add the line at the level 0.9 to the cdf plot above using the code below:

    abline(0.9, 0, col = 2)   # adds the horizontal line y = 0.9 (that is, y = 0.9 + 0·x); the first and second parameters are the intercept and the slope of the line, respectively, and the third parameter col changes the color of the added line (2 or "red" gives red, for example)

b) include the cdf plot with the added line at the 0.9 level

The plot further illustrates that the cdf is not equal to 0.9 for any value of this random variable X. Now let's employ the respective R command (look above for qbinom) to find the 90th percentile:

c) use the R command to determine the 90th percentile of X and report the R command and the output for the percentile

Note: For a discrete X, when 1 − α is such that there is no x such that F(x) = 1 − α, the respective R command reports the value x such that F(x − 1) < 1 − α < F(x). That is, R reports the smallest x for which F(x) > 1 − α. Nothing more can be done.

3. Similarly to the ideas described in part 2, determine the median of X.

d) is there a value x such that F(x) = 0.5 (exactly)? If yes, report it
e) determine the value x such that F(x − 1) < 1 − α < F(x) and report the cdf values F(x − 1) and F(x) for this x
f) include the cdf plot with the added line at the 0.5 level
g) use the appropriate percentile command in R to determine the median of X and report the value
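The rule in the notes above (report the smallest support value whose cdf reaches the requested level) can be sketched by hand and compared with qbinom. This is a minimal illustration on a different, made-up distribution, Binom(10, 0.5). The helper manual_quantile is a hypothetical name introduced here, not an R built-in.

```r
# Hypothetical example (an assumption): Y ~ Binom(10, 0.5),
# not the distribution from Problem 5
n <- 10
p <- 0.5
x <- 0:n                 # support of Y
cdf <- pbinom(x, n, p)   # cdf values F(0), ..., F(10)

# Hypothetical helper: smallest support value whose cdf is at least beta
manual_quantile <- function(beta) x[min(which(cdf >= beta))]

beta <- 0.9
manual_quantile(beta)    # quantile read off the cdf table
qbinom(beta, n, p)       # quantile reported by R's built-in function
```

Both calls should agree for a level such as 0.9 that falls strictly between two steps of the cdf. Levels that coincide with a cdf value to machine precision can be sensitive to floating-point rounding, so typing in a rounded cdf value may not reproduce the exact step.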