Question
"""
Assignment 1, Q4
"""
from mpl_toolkits.mplot3d import Axes3D
import matplotlib.pyplot as plt
import numpy as np
import random
import sys

def plot_gaussian(x, y, filename=None):
    """
    Plot the multivariate Gaussian

    If filename is not given, then the figure is not saved.
    """
    # Note: there was no need to make this into a separate function
    # however, it lets you see how to define functions within the
    # main file, and makes it easier to comment out plotting if
    # you want to experiment with many parameter changes without
    # generating many, many graphs
    fig = plt.figure()
    ax = fig.add_subplot(111)
    ax.scatter(x, y, c="red", marker="s")
    ax.set_xlabel("X")
    ax.set_ylabel("Y")

    minlim, maxlim = -3, 3
    ax.set_xlim(minlim, maxlim)
    ax.set_ylim(minlim, maxlim)
    if filename is not None:
        # Save under the name passed in, not under globals from the main block
        fig.savefig(filename)
    plt.show()

if __name__ == '__main__':

    # default
    dim = 1
    numsamples = 100

    if len(sys.argv) > 1:
        dim = int(sys.argv[1])
        if dim > 3:
            print("Dimension must be 3 or less; capping at 3")
            dim = 3  # actually apply the cap announced above
    if len(sys.argv) > 2:
        numsamples = int(sys.argv[2])
    print("Running with dim = " + str(dim),
          " and numsamples = " + str(numsamples))

    # Generate data from (Univariate) Gaussian
    if dim == 1:
        # mean and standard deviation in one dimension
        mu = 0
        sigma = 1.0
        x = np.random.normal(mu, sigma, numsamples)
        y = np.zeros(numsamples,)
    else:
        # dimensions 2 and 3 are not implemented in this script
        print("Dimension not supported")
        sys.exit(0)

    #TODO: Get the current estimate of the sample mean

    #TODO: Get the current estimate of the sample variance

    # Print all in 2d space
    plot_gaussian(x, y)
    #plot_gaussian(x,y,"scatter" + str(dim) + "_n" + str(numsamples) + ".png")
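The two TODO lines in the script can be filled with `np.mean` and `np.var`. The following is a minimal sketch, not the official solution. It assumes the unbiased estimator (`ddof=1`) is wanted; the assignment does not say which one, and `np.var` defaults to `ddof=0`. The seed and the variable names `sample_mean` and `sample_var` are illustrative assumptions.

```python
import numpy as np

np.random.seed(0)  # assumption: fixed seed only so the sketch is repeatable

# Same sampling call as in the script, with its default parameters
mu, sigma, numsamples = 0, 1.0, 100
x = np.random.normal(mu, sigma, numsamples)

# The two lines asked for in part (a)
sample_mean = np.mean(x)
sample_var = np.var(x, ddof=1)  # ddof=1 divides by n - 1 (assumed choice)

print("Sample mean = " + str(sample_mean))
print("Sample variance = " + str(sample_var))
```

With `ddof=0` the variance is divided by n instead of n - 1, which makes a noticeable difference for the 10-sample runs later in the question.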
For the first two questions, the goal is to understand how much estimators themselves can vary: how different our estimate would have been under a different randomly sampled dataset. In the real world, we will not get to obtain different estimators; we will only have one. In this controlled setting, though, we can actually simulate how different the estimators could be. For the second two questions, the goal is to understand how we obtain confidence intervals around our single sample average estimator.

(a) [5 MARKS] Change the code so that it prints the mean and variance of your samples, using NumPy functions. You are only required to add these two lines in simulate.py.

(b) [7 MARKS] Run the code for 10 samples with dim=1 and σ² = 1.0. Write down the sample average that you obtain. Now do this another 4 times, giving you 5 estimates of the sample average: M1, M2, M3, M4 and M5. What is the sample variance of these 5 estimates?

(c) [7 MARKS] Now run the same experiment, but use 100 samples for each sample average estimate. What is the sample variance of these 5 estimates? How is it different from the variance when you used 10 samples to compute the estimates?

(d) [8 MARKS] Now let us consider a higher variance situation, where σ² = 10.0. Imagine you know this variance, and that the data comes from a Gaussian, but that you do not know the true mean. Run the code to get 30 samples, and compute one sample average M. What is the 95% confidence interval around this M? Give actual numbers.

(e) [8 MARKS] Now assume you know less: you do not know the data is Gaussian, though you still know the variance is σ² = 10.0. Use the same 30 samples from (d) and the resulting sample average M. Give a 95% confidence interval around M, now without assuming the samples are Gaussian.
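Parts (b) and (c) can also be simulated in one go instead of rerunning the script by hand. The sketch below is one possible way to do it, under stated assumptions: the helper name `variance_of_sample_means`, the seed, and the use of `ddof=1` are not from the assignment. For independent samples the variance of a sample average is σ²/n, so the 5 estimates should scatter around a variance of roughly 0.1 for n = 10 and 0.01 for n = 100, although with only 5 estimates the observed value will be noisy.

```python
import numpy as np

def variance_of_sample_means(n, repeats=5, mu=0.0, sigma=1.0, rng=None):
    """Draw `repeats` datasets of size n and return the sample averages
    together with the sample variance of those averages.
    Hypothetical helper; not part of the assignment code."""
    rng = np.random.default_rng() if rng is None else rng
    means = np.array([np.mean(rng.normal(mu, sigma, n)) for _ in range(repeats)])
    return means, np.var(means, ddof=1)

rng = np.random.default_rng(42)  # assumption: fixed seed for repeatability
for n in (10, 100):
    means, var_of_means = variance_of_sample_means(n, rng=rng)
    print(f"n = {n}: sample averages = {np.round(means, 3)}")
    print(f"  sample variance of the 5 averages = {var_of_means:.4f}"
          f" (theory: sigma^2 / n = {1.0 / n:.4f})")
```

Raising `repeats` well above 5 brings the observed variance much closer to σ²/n, which is a quick way to check the tenfold drop between (b) and (c).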
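For parts (d) and (e), the interval half-widths depend only on σ², n, and the confidence level, so they can be computed before looking at the data. The sketch below is a hedged illustration, not the official solution. It assumes the standard known-variance Gaussian interval M ± z·σ/√n for (d) and Chebyshev's inequality, which gives M ± √(σ²/(nδ)) with δ = 0.05, for (e). Note that the script's `sigma` is a standard deviation, so σ² = 10.0 means `sigma = sqrt(10)`. The seed and the true mean of 0 are assumptions used only to generate example data.

```python
import math
from statistics import NormalDist

import numpy as np

sigma2 = 10.0  # known variance from the question
n = 30         # number of samples
delta = 0.05   # 1 - confidence level

rng = np.random.default_rng(0)  # assumption: illustrative seed
x = rng.normal(0.0, math.sqrt(sigma2), n)  # sigma = sqrt(variance)
M = np.mean(x)

# (d) Gaussian data, known variance: M +/- z * sigma / sqrt(n)
z = NormalDist().inv_cdf(1 - delta / 2)  # roughly 1.96
eps_gauss = z * math.sqrt(sigma2 / n)

# (e) No Gaussian assumption: Chebyshev gives
#     P(|M - mu| >= eps) <= sigma^2 / (n * eps^2) = delta
eps_cheb = math.sqrt(sigma2 / (n * delta))

print(f"M = {M:.3f}")
print(f"Gaussian  95% CI: [{M - eps_gauss:.3f}, {M + eps_gauss:.3f}]"
      f" (half-width {eps_gauss:.3f})")
print(f"Chebyshev 95% CI: [{M - eps_cheb:.3f}, {M + eps_cheb:.3f}]"
      f" (half-width {eps_cheb:.3f})")
```

The half-widths come out to about 1.132 for the Gaussian interval and about 2.582 for the Chebyshev interval. The second is wider because it has to hold for any distribution with that variance.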