Question

1 Approved Answer

Posted on Jun 25, 2024

Need help with python hw for computational physics by Mark Newman

Please look at the jupyter notebook links below for reference.

Please at least check the IPYNB provided below.

Discrete cosine transform - Exercise 7.6

The discrete cosine transform is a type of Fourier transform that uses only cosines to represent our function f(x) or our data y_n. It is not intuitively obvious what advantage a discrete cosine transform (DCT) has over the general discrete Fourier transform (DFT). So to help motivate the advantages of the DCT, I am going to go back to my DFT of the Mauna Loa carbon dioxide data set. Again, I first plot the data, then I plot its DFT.

In [26]:

    # Calculate DFT for Mauna Loa CO2 data with mean removed
    import numpy as np
    from cmath import exp, pi, phase
    import matplotlib.pyplot as plt

    # define function to calculate DFT based on eqn 7.20
    def dft(y):
        # get the number of sampled points
        N = len(y)
        # since c_k are complex, create a complex array to hold them
        c = np.zeros([N], complex)
        # calculate c_k for k = 0 to N-1
        for k in range(N):
            for n in range(N):
                c[k] += y[n]*exp(-2j*pi*k*n/N)
        return c

    # directory path to data file
    path = "G:\\my Drive\\Classes\\Ph322\\Lectures\\Notebooks\\"
    file = "monthly_in_situ_co2_mlo.txt"

    # Load in the file "monthly_in_situ_co2_mlo.txt"
    data = np.loadtxt(file)
    # col 2 contains date in units of days
    day = data[:, 2]
    # col 3 contains date in units of years
    year = data[:, 3]
    # col 4 contains CO2 in units of ppm
    co2 = data[:, 4]
    co2 -= np.mean(co2)

    # plot co2 vs year
    plt.plot(year, co2, '-')
    plt.xlabel("Year")
    plt.ylabel("[CO2], ppm")
    plt.show()

    # calculate the DFT of the co2 time series
    c = dft(co2)

    # plot the DFT
    plt.plot(abs(c), '-')
    plt.xlabel("k")
    plt.ylabel(r"$c_k$")
    plt.title("DFT")
    #plt.xlim(120, 140)

[Figure: CO2 concentration with the mean removed versus year, 1960 to 2020. Axes: Year; [CO2], ppm.]

Out[26]: Text(0.5, 1.0, 'DFT')

[Figure: "DFT", magnitude of the coefficients c_k versus k.]

I am again going to calculate the inverse DFT, that is, recover the original data set using the DFT coefficients c_k I calculated above. However, this time I am going to plot the inverse DFT over a range that extends beyond the original range of the data set.
That is, instead of plotting the inverse DFT just from n = 0 to N-1 (in this case N = 744, or 744 months of data), I am going to plot over the range -2N to 2N. Let's see what it looks like:

In [27]:

    # Plot the sum of the first kmax terms of the Fourier series
    # using DFT calculated above
    import numpy as np
    import matplotlib.pyplot as plt
    from cmath import exp

    T = 744        # period in units of months
    N = len(c)
    kmax = N-1

    # Add Fourier series terms up to kmax term
    xvalues = np.arange(-2*N, 2*N, 1)
    yvalues = np.empty([N, N], complex)
    yn = np.zeros([len(xvalues)], complex)
    for k in range(0, kmax+1):
    #for k in range(62, 63):
        for x in range(len(xvalues)):
            yn[x] += (c[k]/N)*exp((2j*np.pi*k/T)*xvalues[x])

    plt.plot(xvalues, yn)
    plt.xlabel("Months")

C:\anaconda3\lib\site-packages\numpy\core\_asarray.py:85: ComplexWarning: Casting complex values to real discards the imaginary part
  return array(a, dtype, copy=False, order=order)

Out[27]: Text(0.5, 0, 'Months')

[Figure: recovered data plotted from about -1500 to 1500 months; the pattern of the original data repeats. Axis: Months.]

Interesting! We see that when we plot the recovered function (or in this case our recovered data) beyond the interval over which we originally sampled it, the function just repeats itself. If we were to extend this graph to infinity in both directions, the pattern would continue. If you remember from last class, we said there is an exact one-to-one correspondence between the information stored in the DFT and the information stored in the inverse DFT. This means that in the "eyes" of the DFT the data set we are transforming is in fact this periodic function. This is the function the DFT is "trying" (from a teleological viewpoint) to approximate. If we represent our data set as the function f(x), then f(x) = f(x + N).

One problem with this is that the periodic function has discontinuities in it. You can see that the function drops from about 60 at the end of each period to -40 at the next point. This discontinuity is difficult to represent with sinusoids.
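The periodicity claim f(x) = f(x + N) can be checked numerically. What follows is only a minimal sketch, not the notebook's code: it uses a short synthetic ramp as a stand-in for the CO2 data (an assumption), NumPy's FFT in place of the explicit double loop, and a hypothetical helper `inverse_dft_at` that evaluates the inverse-DFT sum at arbitrary integer positions.

```python
import numpy as np

def inverse_dft_at(c, x):
    """Evaluate (1/N) * sum_k c_k * exp(2*pi*i*k*x/N) at integer positions x.

    Hypothetical helper for illustration; not part of the notebook.
    """
    N = len(c)
    k = np.arange(N)
    x = np.asarray(x)
    # outer(x, k) gives one row per x value and one column per k
    return (np.exp(2j * np.pi * np.outer(x, k) / N) @ c) / N

# Synthetic stand-in for the data (assumption): a ramp whose ends do not match
y = np.linspace(-40.0, 60.0, 32)
N = len(y)
c = np.fft.fft(y)   # uses exp(-2*pi*i*k*n/N), the same sign as eqn 7.20

x = np.arange(-2 * N, 2 * N)
y_ext = inverse_dft_at(c, x).real

# Inside 0..N-1 the original samples are recovered
inside_ok = np.allclose(y_ext[2 * N:3 * N], y)
# Outside that range the same N values repeat: f(x) = f(x + N)
periodic_ok = np.allclose(y_ext[:-N], y_ext[N:])
# Jump between the last sample of one period and the first of the next
jump = y_ext[3 * N] - y_ext[3 * N - 1]
print(inside_ok, periodic_ok, jump)
```

The `jump` value is the discontinuity the sinusoids have to reproduce at every period boundary; for this ramp it equals the difference between the first and last samples.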
While the DFT can indeed represent such a discontinuity, it takes a lot of terms to do so. This explains why we see a lot of non-zero coefficients in the plot of the DFT. These are necessary in order for the Fourier series to approximate the above periodic function of our data set. In the plot below I zoom into the DFT so I can see these non-zero coefficients.

In [28]:

    # plot the DFT and zoom in to see the non-zero coefficients
    plt.plot(abs(c), '-')
    plt.xlabel("k")
    plt.ylabel(r"$c_k$")
    plt.title("DFT")
    plt.ylim(0, 300)

Out[28]: (0, 300)

[Figure: "DFT", vertical axis zoomed to 0 to 300, k from 0 to about 700.]

Now if you are using a DFT to identify signals or periodicities in your data, the sinusoids needed to represent the discontinuities may end up masking the true signals in the data. In other cases, such as the case here with the CO2, it may not matter so much if the physical signals have coefficients large enough to stand above the background. However, in the case where Fourier transforms are used for data compression, e.g. compressing images or audio, the additional sinusoids needed to represent these discontinuities will decrease the efficiency of the compression method, making for larger data sizes.

DFTs and compression

What do I mean here? Well, let's say I wanted to transmit this data set of CO2 to a colleague. Let's assume that the file size of the raw data is quite large, too large for my colleague to handle on their slow internet service. I have the brilliant idea that instead of transmitting the data itself, I will transmit the DFT of it, knowing that all the information that is in the original data is stored in the DFT itself. Once my colleague receives my DFT, they will be able to recover the data by calculating the inverse DFT. Now if I transmitted all the coefficients from the DFT, there would be no gain in efficiency, since for N measurements there are N coefficients.
However, I realize that many if not most of the coefficients are quite small, meaning they contribute very little information to the data set. Therefore I can omit these coefficients from the DFT with no significant information loss. I then only have to transmit the largest coefficients to my colleague, which could be a significant decrease in file size. What percentage of coefficients can I drop? Well, that depends on my data set (or function) and how much information loss I am willing to live with.

Exercise 7.6

Exercise 7.4 looked at data representing the variation of the Dow Jones Industrial Average, colloquially called "the Dow," over time. The particular time period studied in that exercise was special in one sense: the value of the Dow at the end of the period was almost the same as at the start, so the function was, roughly speaking, periodic. In the on-line resources there is another file called dow2.txt, which also contains data on the Dow but for a different time period, from 2004 until 2008. Over this period the value changed considerably from a starting level around 9000 to a final level around 14000.

a) Write a program in which you read the data in the file dow2.txt and plot it on a graph. Then smooth the data by calculating its complex Fourier transform, setting all but the first 2% of the coefficients to zero, and inverting the transform again, plotting the result on the same graph as the original data. You should see that the data are smoothed, but now there will be an additional artifact. At the beginning and end of the plot you should see large deviations away from the true smoothed function. These occur because the function is required to be periodic (its last value must be the same as its first), so it needs to deviate substantially from the correct value to make the two ends of the function meet. In some situations (including this one) this behavior is unsatisfactory.
If we want to use the Fourier transform for smoothing, we would certainly prefer that it not introduce artifacts of this kind.

Make sure your program outputs the original and smoothed data on the same plot, with legend and title indicating it's the DFT.

In [7]:

    from numpy import loadtxt, zeros
    import matplotlib.pyplot as plt
    from numpy.fft import rfft, irfft

    data = loadtxt("dow2.txt", float)
    c = rfft(data)
    n = len(c)

    # keep the first 10% of the coefficients (computed but not plotted)
    m1 = int(n * 0.1)
    c1 = zeros(n, complex)
    for i in range(m1):
        c1[i] = c[i]
    data1 = irfft(c1, len(data))

    # keep the first 2% of the coefficients
    m2 = int(n * 0.02)
    c2 = zeros(n, complex)
    for i in range(m2):
        c2[i] = c[i]
    data2 = irfft(c2, len(data))

    plt.plot(data, 'b', label="original data")
    plt.plot(data2, 'g', label="IFFT by first 2% data")
    plt.title("The daily closing value for each business day of the Dow Jones Industrial Average")
    plt.xlabel("Day")
    plt.legend()

Out[7]:

[Figure: "The daily closing value for each business day of the Dow Jones Industrial Average"; curves "original data" and "IFFT by first 2% data" versus Day (0 to 1000), values from about 10000 to 14000.]

/10pts.

b) Modify your program to repeat the same analysis using discrete cosine transforms. To do the DCT, append a flipped copy of the data to itself so that the data become twice as long. The extended data are symmetric, so the transform can be represented as a cosine series. You can use the functions from dcst.py to perform the transforms if you wish. Discard all but the first 2% of the coefficients, invert the transform, and plot the result. You should see a significant improvement, with less distortion of the function at the ends of the interval. This occurs because, as discussed at the end of Section 7.3, the discrete cosine transform does not force the value of the function to be the same at both ends. Windowing is another method of ensuring periodicity of the data.

Make sure your program outputs the original and smoothed data on the same plot, with legend and title indicating it's the DCT.
In [3]:

    # Type your code here

/10pts.

Overall: /100pts.

In [ ]:
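One possible way to approach part b) is sketched here, following the hint about appending a flipped copy of the data. This is a sketch under stated assumptions, not a definitive solution: it does not use dcst.py, the function names `smooth_periodic` and `smooth_mirrored` are made up for illustration, and a synthetic rising series stands in for dow2.txt so that the block runs on its own. Transforming the mirrored data with `rfft` is equivalent to a cosine series up to a phase factor, which is enough to show the effect on the ends of the interval.

```python
import numpy as np

def smooth_periodic(y, frac):
    """Keep the first `frac` of the real-FFT coefficients of y (part a)."""
    c = np.fft.rfft(y)
    keep = int(frac * len(c))
    c[keep:] = 0.0
    return np.fft.irfft(c, len(y))

def smooth_mirrored(y, frac):
    """Same truncation applied to y followed by a flipped copy of itself."""
    N = len(y)
    y2 = np.concatenate([y, y[::-1]])    # symmetric, length 2N, ends now meet
    c = np.fft.rfft(y2)
    keep = int(frac * N)                 # same fraction of the N cosine terms
    c[keep:] = 0.0
    return np.fft.irfft(c, 2 * N)[:N]    # first half matches the original data

# Synthetic stand-in for dow2.txt (assumption): rising trend plus wiggles.
# With the real file, replace this with: data = np.loadtxt("dow2.txt", float)
rng = np.random.default_rng(0)
t = np.arange(1024)
data = (9000 + 5000 * t / t[-1]
        + 200 * np.sin(2 * np.pi * t / 128)
        + rng.normal(0, 50, t.size))

dft_smooth = smooth_periodic(data, 0.02)
dct_smooth = smooth_mirrored(data, 0.02)

# Largest deviation from the data at the two ends of the interval
err_dft = max(abs(dft_smooth[0] - data[0]), abs(dft_smooth[-1] - data[-1]))
err_dct = max(abs(dct_smooth[0] - data[0]), abs(dct_smooth[-1] - data[-1]))
print(f"end deviation, periodic: {err_dft:.0f}   mirrored: {err_dct:.0f}")
```

To produce the plot the exercise asks for, `data` and `dct_smooth` can be passed to `plt.plot` with labels, followed by `plt.title` and `plt.legend()`, in the same way as the In [7] cell does for the DFT.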