Question
Let $x_1, x_2, \ldots, x_n$ be $n$ observations of a variable of interest. Recall that the sample mean $\bar{x}_n$ and sample variance $s_n^2$ are given by

$$\bar{x}_n = \frac{1}{n}\sum_{k=1}^{n} x_k \quad\text{and}\quad s_n^2 = \frac{1}{n-1}\sum_{k=1}^{n} \left(x_k - \bar{x}_n\right)^2 \qquad \text{(Equation 1)}$$

where the subscript $n$ indicates the number of observations in the sample. Notice that a natural computation of the variance requires two passes over the data: one to compute the mean, and a second to subtract the mean from each observation and compute the sum of squares. It is often useful to be able to compute the variance in a single pass, inspecting each value $x_k$ only once; for example, when the data are being collected without enough storage to keep all the values, or when the costs of memory access dominate those of computation. In this problem you will explore two methods for such an online computation of the mean.

Part A: Show algebraically that the following relation holds between the mean of the first $n - 1$ observations and the mean of all $n$ observations:

$$\bar{x}_n = \bar{x}_{n-1} + \frac{x_n - \bar{x}_{n-1}}{n}$$

Note that you can get an expression for $\bar{x}_{n-1}$ by simply replacing $n$ in Equation 1 above with $n - 1$.

Part B: Write a function my_sample_mean that takes a numpy array as its input and returns the mean of that array using the formulas from class (written above). Write another function my_sample_var that takes a numpy array as its input and returns the variance of that array, again using the formulas from class (written above). You may not use any built-in sample mean or variance functions.

Part C: Use your functions from Part B to compute the sample mean and sample variance of the following array, which contains the plankton consumed by a sample of 12 seahorses.

    pla = [98, 26, 83, 56, 60, 39, 81, 19, 72, 78, 94, 42]

Part D: Implement a third function called update_mean that implements the formula whose validity you proved in Part A. Note that this function will need to take three inputs: $x_n$, $\bar{x}_{n-1}$, and $n$. A function header is provided for you. This function may be auto-graded, so please do not change the given API; the order of inputs matters! If you change it, you might lose points. Use this function to compute the mean of the first seahorse's plankton, the first two seahorses, the first three seahorses, and so on up to all of the seahorse data points. Store your plankton means in a numpy array called pla_means. Report all 12 estimates in pla_means.

In [4]:

    # Given API:
    def update_mean(prev_mean, xn, n):
        # your code goes here!
        return  # the updated mean

To ensure your function complies with the given API, run this small test, where we suppose we have a mean of $\bar{x}_2 = 1$ from the first 2 data points (prev_mean), and we update this with the 3rd ($n = 3$) data point, which is $x_3 = 2$:

In [5]:

    assert update_mean(1,2,3)==4/3, "Warning: function seems broken
Step by Step Solution
There are 3 Steps involved in it
Step: 1
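The original solution text for this step is not available, so the following is only one possible way to establish the Part A relation. It assumes Equation 1 defines the mean as $\bar{x}_n = \frac{1}{n}\sum_{k=1}^{n} x_k$, so that the sum of the first $n-1$ observations equals $(n-1)\,\bar{x}_{n-1}$. Splitting the last term off the sum gives:

```latex
\begin{align*}
\bar{x}_n &= \frac{1}{n}\sum_{k=1}^{n} x_k
           = \frac{1}{n}\left(\sum_{k=1}^{n-1} x_k + x_n\right) \\
          &= \frac{1}{n}\left((n-1)\,\bar{x}_{n-1} + x_n\right) \\
          &= \bar{x}_{n-1} - \frac{\bar{x}_{n-1}}{n} + \frac{x_n}{n} \\
          &= \bar{x}_{n-1} + \frac{x_n - \bar{x}_{n-1}}{n}.
\end{align*}
```

The second line is the only step that uses the definition of $\bar{x}_{n-1}$; the rest is rearrangement.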
Step: 2
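A minimal sketch of Parts B and C follows; it is not the original expert solution. It assumes the "formula from class" for the variance uses the $n - 1$ divisor (the usual sample variance), since that part of Equation 1 is garbled in the source. The names pla_mean and pla_var are illustrative and do not come from the assignment.

```python
import numpy as np


def my_sample_mean(x):
    """Sample mean: sum of the observations divided by n (no np.mean)."""
    x = np.asarray(x, dtype=float)
    total = 0.0
    for value in x:
        total += value
    return total / len(x)


def my_sample_var(x):
    """Sample variance, computed in two passes.

    Assumption: the divisor is n - 1, as in the usual sample variance.
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    mean = my_sample_mean(x)          # first pass: the mean
    sum_sq = 0.0
    for value in x:                   # second pass: squared deviations
        sum_sq += (value - mean) ** 2
    return sum_sq / (n - 1)


# Part C: plankton consumed by the 12 seahorses.
pla = np.array([98, 26, 83, 56, 60, 39, 81, 19, 72, 78, 94, 42])
pla_mean = my_sample_mean(pla)   # illustrative variable name
pla_var = my_sample_var(pla)     # illustrative variable name
print(pla_mean, pla_var)
```

Under the $n - 1$ assumption, the data sum to 748, so the mean is 748/12 (about 62.33) and the variance is 22772/33 (about 690.06).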
Step: 3
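For Part D, one possible implementation, keeping the given header and argument order unchanged, is sketched below. Starting the running mean at 0.0 is a choice made here, not something the assignment specifies; it is harmless because for $n = 1$ the update returns the first observation regardless of the starting value.

```python
import numpy as np


def update_mean(prev_mean, xn, n):
    """Return the mean of n observations, given the mean of the first
    n - 1 observations (prev_mean) and the n-th observation (xn)."""
    return prev_mean + (xn - prev_mean) / n


pla = np.array([98, 26, 83, 56, 60, 39, 81, 19, 72, 78, 94, 42])

# Running means of the first 1, 2, ..., 12 observations.
pla_means = np.zeros(len(pla))
running = 0.0  # arbitrary starting value; overwritten when n = 1
for n, xn in enumerate(pla, start=1):
    running = update_mean(running, xn, n)
    pla_means[n - 1] = running

print(pla_means)
```

Each value is inspected only once, and the last entry of pla_means should agree with the full-sample mean from Part C up to floating-point rounding.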