Answered step by step
Verified Expert Solution
Question
1 Approved Answer
title: Homework 5 output: pdf_document --begin{center} textit{Due May 13, 2016 before 11:59 PM via upload to CCLE} end{center} section{Swirl Package} textbf{Problem 1.} - Start swirl
title: "Homework 5" output: pdf_document --\\begin{center} \\textit{Due May 13, 2016 before 11:59 PM via upload to CCLE} \\end{center} \\section{Swirl Package} \\textbf{Problem 1.} - Start swirl and complete the lessons using: **swirl()** - Do the following lessons: a) Do **10: simulation** lesson. What command was used to create my_pois variable? Explain what that command does. b) Do **15: Base Graphics** lesson. What command was used to plot cars with a subtitle of `My Plot Subtitle`? \\section{Scatter Plot} \\textbf{Problem 2.} a) Construct a scatter plot matrix using the built in dataset swiss and give it a title. b) Construct a scatter plot of Examination (x) and Ferility (y) using swiss. Add vertical and horizontal lines to represent the mean x and the y values to the graph. Appropriately title and label the graph. \\section{Line Plot} \\textbf{Problem 3.} Using the built in dataset airquality, please reproduce the following plot, which shows the temperature for each month on each day. \\section{Simulation} \\textbf{Problem 4.} Do not use a for loop for any of these questions. You must show your code in order to get credit. a) Let $U_1$,$U_2$,$U_3$ all come from a uniform(0,1) distribution. Let $M=max(U1,U2,U3)$. Estimate the probability $\\mathbb{P}(M > 0.75)$ using monte carlo simulation. b) Let $X_1,\\ldots,X_n$ be Poisson random variables with parameter $\\lambda=0.5$ where $n=21$. Estimate the probability that the sample mean is greater than the sample median using monte carlo simulation. c) Let $X_1,\\ldots,X_{100}$ be exponential random variables with rate $\\lambda=2$. Plot the histogram of $\\bar{X}$ and overlay normal density to the histogram. \\textbf{Problem 5} Write a function pi3(n) that approximates $\\pi$ as a function of n, by simulating random points in the square with vertices (-1,-1), (-1,1), (1,1), and (1,-1), seeing what fraction of them are in the unit circle [the circle with radius 1 centered at the origin], and then converting this fraction into an estimate of $\\pi$. Evaluate pi3(10^j) for $j = 0,1,2\\ldots,6$. For $j=6$, plot your simulated points, using different plotting symbols and colors for simulated points inside and outside the unit circle. Don't forget to use set.seed. (To get square axes use par(pty="s")). Homework 5 Due May 13, 2016 before 11:59 PM via upload to CCLE Swirl Package Problem 1. Start swirl and complete the lessons using: swirl() Do the following lessons: a) Do 10: simulation lesson. What command was used to create my_pois variable? Explain what that command does. b) Do 15: Base Graphics lesson. What command was used to plot cars with a subtitle of My Plot Subtitle? Scatter Plot Problem 2. a) Construct a scatter plot matrix using the built in dataset swiss and give it a title. b) Construct a scatter plot of Examination (x) and Ferility (y) using swiss. Add vertical and horizontal lines to represent the mean x and the y values to the graph. Appropriately title and label the graph. Line Plot Problem 3. Using the built in dataset airquality, please reproduce the following plot, which shows the temperature for each month on each day. Simulation Problem 4. Do not use a for loop for any of these questions. You must show your code in order to get credit. a) Let U1 ,U2 ,U3 all come from a uniform(0,1) distribution. Let M = max(U 1, U 2, U 3). Estimate the probability P(M > 0.75) using monte carlo simulation. 1 # Load data manipulation package suppressPackageStartupMessages(library(dplyr)) # Set option to display numbers in decimal notation options(scipen = 999) # Define the number of simulations, n n <- 5000 # Simulate the random variables U1, U2 and U3 set.seed(1) # To enanable the simulated numbers to be reproducible u1 <- runif(n) # Randomly draw from Uniform(0,1) set.seed(2) # To enanable the simulated numbers to be reproducible u2 <- runif(n) # Randomly draw from Uniform(0,1) set.seed(3) # To enanable the simulated numbers to be reproducible u3 <- runif(n) # Randomly draw from Uniform(0,1) # Combine results into data.frame d <- data.frame(u1, u2, u3) # # # # # Compute M and check if M is greater than 0.75 To compute M, we start by comparing u1 and u2 If u1 is greater than u2, then M = max(u1, u3). Otherwise, M = max(u2, u3). For the Isgreater column, we use the following logic: d <- d %>% mutate(M = ifelse(u1 > u2, ifelse(u1 > u3, u1, u3), ifelse(u2 > u3, u2, u3)), Isgreater = ifelse(M > 0.75, 1, 0)) # Display a sample of simulated data knitr::kable(head(d, 10), caption = 'Sample of Simulated Data') Table 1: Sample of Simulated Data u1 u2 u3 M Isgreater 0.2655087 0.3721239 0.5728534 0.9082078 0.2016819 0.8983897 0.9446753 0.6607978 0.6291140 0.0617863 0.1848823 0.7023740 0.5733263 0.1680519 0.9438393 0.9434750 0.1291590 0.8334488 0.4680185 0.5499837 0.1680415 0.8075164 0.3849424 0.3277343 0.6021007 0.6043941 0.1246334 0.2946009 0.5776099 0.6309793 0.2655087 0.8075164 0.5733263 0.9082078 0.9438393 0.9434750 0.9446753 0.8334488 0.6291140 0.6309793 0 1 0 1 1 1 1 1 0 0 2 # Calculate required probability pr <- sum(d$M)/nrow(d) # Display probability pr ## [1] 0.7529793 b) Let X1 , . . . , Xn be Poisson random variables with parameter = 0.5 where n = 21. Estimate the probability that the sample mean is greater than the sample median using monte carlo simulation. # Ascribe input values lambda <- 0.5 sample.size <- 21 nSim <- 5000 # number of simulations # Create function to simulate whether mean is greater than median simulate <- function (l = lambda, s.size = sample.size) { s <- rpois(s.size, l) # randomly draw a sample from Poisson distribution mea <- mean(s) # Find the mean med <- median(s) # Find the median answer <- ifelse(mea > med, 1, 0) # Check if mean is bigger than median. return(answer) } # Run the simulation several times (nSim times) sim <- mapply(FUN = simulate, MoreArgs = NULL,1:nSim) # Calculate the required probability pr <- sum(sim)/ n # Display the result pr ## [1] 0.4918 and overlay c) Let X1 , . . . , X100 be exponential random variables with rate = 2. Plot the histogram of X normal density to the histogram. # Load plotting package suppressPackageStartupMessages(library(ggplot2)) # Randomly draw 100 observations from Exp(2) distribution and save in d d <- data.frame(x = rexp(n = 100, rate = 2)) # Overlay histogram, empirical density and normal density ggplot(d, aes(x)) + geom_histogram(aes(y = ..density..)) + stat_function(fun = dnorm, args = list(mean = mean(d$x), sd = sd(d$x)), lwd = 1, col = 'blue') + 3 theme_bw() + ggtitle('Histogram of Exponentially Distributed Random Variable \ with Normal Density Overlay') Histogram of Exponentially Distributed Random Variable with Normal Density Overlay density 2 1 0 0.0 0.5 1.0 1.5 2.0 x Problem 5 Write a function pi3(n) that approximates as a function of n, by simulating random points in the square with vertices (-1,-1), (-1,1), (1,1), and (1,-1), seeing what fraction of them are in the unit circle [the circle with radius 1 centered at the origin], and then converting this fraction into an estimate of . Evaluate pi3(10j) for j = 0, 1, 2 . . . , 6. For j = 6, plot your simulated points, using different plotting symbols and colors for simulated points inside and outside the unit circle. Don't forget to use set.seed. (To get square axes use par(pty=\"s\")). # Create pythagorus theorem function to compute distance from origin getDistFromOrigin <- function (x, y) { radius.squared <- (abs(x))^2 + (abs(y))^2 # Pythagorus theorem return(sqrt(radius.squared)) } # Create the function to simulate n points inside the square with # vertices (-1,-1), (-1,1), (1,1), and (1,-1). simulatePoints <- function (n) { # Simulate x coordinates 4 set.seed(1) x <- runif(n, -1, 1) # Simulate y coordinates set.seed(2) y <- runif(n, -1, 1) # Calculate distance from origin (dfo) dfo <- getDistFromOrigin(x, y) # Determine if simulated point is in circle (isincircle) isincircle <- ifelse(dfo <= 1, 1, 0) # Save results in a data frame d <- data.frame(x, y, dfo, isincircle) return(d) } # Create function pi3(n) to estimate pi pi3 <- function(n) { # Simulate the points d <-simulatePoints(n) # Count number of points in circle nIncircle <- sum(d$isincircle) # Calculate the fraction of points in circle portion <- nIncircle / n # Calculate pi pi = 4 * portion return(pi) } # Evaluate pi3(10^j) for J = 0,1, ..., 6 j <- 0:6 n <- 10^j piEstimate <- mapply(FUN = pi3, MoreArgs = NULL, n) #Summarise results data <- data.frame(j, n, piEstimate) knitr::kable(data) j n piEstimate 0 1 2 3 4 5 6 1 10 100 1000 10000 100000 1000000 4.000000 2.400000 3.280000 3.076000 3.124400 3.132920 3.142388 5 # Plot for j = 6 d1 <- simulatePoints(10^6) ggplot(data = d1, aes(x, y, color = factor(isincircle))) + geom_point(size = 1) + scale_color_manual("Is in Circle?\ ", labels = c("No", "Yes"), values = c("blue", "red")) + coord_fixed() + theme_bw() + ggtitle('Plot of Simulated Points') Plot of Simulated Points 1.0 0.5 y Is in Circle? 0.0 No Yes 0.5 1.0 1.0 0.5 0.0 0.5 x 6 1.0 --title: "Homework 5" output: pdf_document --\\begin{center} \\textit{Due May 13, 2016 before 11:59 PM via upload to CCLE} \\end{center} \\section{Swirl Package} \\textbf{Problem 1.} - Start swirl and complete the lessons using: **swirl()** - Do the following lessons: a) Do **10: simulation** lesson. What command was used to create my_pois variable? Explain what that command does. b) Do **15: Base Graphics** lesson. What command was used to plot cars with a subtitle of `My Plot Subtitle`? \\section{Scatter Plot} \\textbf{Problem 2.} a) Construct a scatter plot matrix using the built in dataset swiss and give it a title. b) Construct a scatter plot of Examination (x) and Ferility (y) using swiss. Add vertical and horizontal lines to represent the mean x and the y values to the graph. Appropriately title and label the graph. \\section{Line Plot} \\textbf{Problem 3.} Using the built in dataset airquality, please reproduce the following plot, which shows the temperature for each month on each day. \\section{Simulation} \\textbf{Problem 4.} Do not use a for loop for any of these questions. You must show your code in order to get credit. a) Let $U_1$,$U_2$,$U_3$ all come from a uniform(0,1) distribution. Let $M=max(U1,U2,U3)$. Estimate the probability $\\mathbb{P}(M > 0.75)$ using monte carlo simulation. ```{r} # Load data manipulation package suppressPackageStartupMessages(library(dplyr)) # Set option to display numbers in decimal notation options(scipen = 999) # Define the number of simulations, n n <- 5000 # Simulate the random variables U1, U2 and U3 set.seed(1) # To enanable the simulated numbers to be reproducible u1 <- runif(n) # Randomly draw from Uniform(0,1) set.seed(2) # To enanable the simulated numbers to be reproducible u2 <- runif(n) # Randomly draw from Uniform(0,1) set.seed(3) # To enanable the simulated numbers to be reproducible u3 <- runif(n) # Randomly draw from Uniform(0,1) # Combine results into data.frame d <- data.frame(u1, u2, u3) # # # # # Compute M and check if M is greater than 0.75 To compute M, we start by comparing u1 and u2 If u1 is greater than u2, then M = max(u1, u3). Otherwise, M = max(u2, u3). For the Isgreater column, we use the following logic: d <- d %>% mutate(M = ifelse(u1 > u2, ifelse(u1 > u3, u1, u3), ifelse(u2 > u3, u2, u3)), Isgreater = ifelse(M > 0.75, 1, 0)) # Display a sample of simulated data knitr::kable(head(d, 10), caption = 'Sample of Simulated Data') # Calculate required probability pr <- sum(d$M)/nrow(d) # Display probability pr ``` b) Let $X_1,\\ldots,X_n$ be Poisson random variables with parameter $\\lambda=0.5$ where $n=21$. Estimate the probability that the sample mean is greater than the sample median using monte carlo simulation. ```{r} # Ascribe input values lambda <- 0.5 sample.size <- 21 nSim <- 5000 # number of simulations # Create function to simulate whether mean is greater than median simulate <- function (l = lambda, s.size = sample.size) { s <- rpois(s.size, l) # randomly draw a sample from Poisson distribution mea <- mean(s) # Find the mean med <- median(s) # Find the median answer <- ifelse(mea > med, 1, 0) # Check if mean is bigger than median. return(answer) } # Run the simulation several times (nSim times) sim <- mapply(FUN = simulate, MoreArgs = NULL,1:nSim) # Calculate the required probability pr <- sum(sim)/ n # Display the result pr ``` c) Let $X_1,\\ldots,X_{100}$ be exponential random variables with rate $\\lambda=2$. Plot the histogram of $\\bar{X}$ and overlay normal density to the histogram. ```{r message = FALSE} # Load plotting package suppressPackageStartupMessages(library(ggplot2)) # Randomly draw 100 observations from Exp(2) distribution and save in d d <- data.frame(x = rexp(n = 100, rate = 2)) # Overlay histogram, empirical density and normal density ggplot(d, aes(x)) + geom_histogram(aes(y = ..density..)) + stat_function(fun = dnorm, args = list(mean = mean(d$x), sd = sd(d$x)), lwd = 1, col = 'blue') + theme_bw() + ggtitle('Histogram of Exponentially Distributed Random Variable \ with Normal Density Overlay') ``` \\textbf{Problem 5} Write a function pi3(n) that approximates $\\pi$ as a function of n, by simulating random points in the square with vertices (-1,-1), (-1,1), (1,1), and (1,-1), seeing what fraction of them are in the unit circle [the circle with radius 1 centered at the origin], and then converting this fraction into an estimate of $\\pi$. Evaluate pi3(10^j) for $j = 0,1,2\\ldots,6$. For $j=6$, plot your simulated points, using different plotting symbols and colors for simulated points inside and outside the unit circle. Don't forget to use set.seed. (To get square axes use par(pty="s")). ```{r} # Create pythagorus theorem function to compute distance from origin getDistFromOrigin <- function (x, y) { radius.squared <- (abs(x))^2 + (abs(y))^2 # Pythagorus theorem return(sqrt(radius.squared)) } # Create the function to simulate n points inside the square with # vertices (-1,-1), (-1,1), (1,1), and (1,-1). simulatePoints <- function (n) { # Simulate x coordinates set.seed(1) x <- runif(n, -1, 1) # Simulate y coordinates set.seed(2) y <- runif(n, -1, 1) # Calculate distance from origin (dfo) dfo <- getDistFromOrigin(x, y) # Determine if simulated point is in circle (isincircle) isincircle <- ifelse(dfo <= 1, 1, 0) # Save results in a data frame d <- data.frame(x, y, dfo, isincircle) return(d) } # Create function pi3(n) to estimate pi pi3 <- function(n) { # Simulate the points d <-simulatePoints(n) # Count number of points in circle nIncircle <- sum(d$isincircle) # Calculate the fraction of points in circle portion <- nIncircle / n } # Calculate pi pi = 4 * portion return(pi) # Evaluate pi3(10^j) for J = 0,1, ..., 6 j <- 0:6 n <- 10^j piEstimate <- mapply(FUN = pi3, MoreArgs = NULL, n) #Summarise results data <- data.frame(j, n, piEstimate) knitr::kable(data) # Plot for j = 6 d1 <- simulatePoints(10^6) ggplot(data = d1, aes(x, y, color = factor(isincircle))) + geom_point(size = 1) + scale_color_manual("Is in Circle?\ ", labels = c("No", "Yes"), values = c("blue", "red")) + coord_fixed() + theme_bw() + ggtitle('Plot of Simulated Points') ```
Step by Step Solution
There are 3 Steps involved in it
Step: 1
Get Instant Access to Expert-Tailored Solutions
See step-by-step solutions with expert insights and AI powered tools for academic success
Step: 2
Step: 3
Ace Your Homework with AI
Get the answers you need in no time with our AI-driven, step-by-step assistance
Get Started