## Lines, Sines, and Curve Fittings 13 – Fourier

To date, I have used two methods to examine the apparent sine wave in the annualized surface temperature date. I looked for a best fit to the residuals of detrended (linear, exponential) data. I have also just *a priori* added a sine component to the best fits of the data (manual fits, nls). Also, while working through this, I have at times withheld some of the data to check the fits *a posteriori*. This has yielded a range of possible periodicities:

56: lines-sines-and-curve-fitting-1-oh-my

50: lines-sines-and-curve-fitting-2-r

50 (second 140): lines-sines-and-curve-fitting-3-double-down

61,71 (second 141): lines-sines-and-curve-fittings-4-walk-and-chew-gum

50,56,61: lines-sines-and-curve-fitting-5-a-growth

58,59,59: lines-sines-and-curve-fitting-11-more-extrapolation

Its time to bring a new tool to the problem: Fast Fourier Transform. Specifically, we will use the spectrum function in R which is wrapper for spec.pgram. The latter is “*calculates the periodogram using a fast Fourier transform, and optionally smooths the result with a series of modified Daniell smoothers (moving averages giving half weight to the end values)*.”

Recall the GISTEMP annual temperature anomaly with a linear trend.

We use *spectrum* to examine the residuals.

I have two questions. Can we trust the values this close to the endpoints? And is the spectrum displayed significant? I’m not sure how to approach the first question. But the second might be answered by brute force.

We check the residuals for normality. Where ‘y’ is the annualized GISTEMP data and ‘y1’ is the linear trend.

# test the residuals of the linear trend res1 <- y-y1 shapiro.test(res1) # W = 0.994, p-value = 0.8588

Since its normal, we create 10000 records of length of 131 data points and examine their spectrum for frequencies with a value as strong as that displayed in the residuals.

testsp <- function(x) { tau = 0.17 # threshold sp = tau) } # this takes less than a minute mc <- matrix(NA,ncol=10000,nrow=length(res1)) mc <- apply(mc,2,function(x){rnorm(length(x),sd=sd(res1))}) mct <- apply(mc,2,testsp) sum(mct)/length(mct) # [1] 0.0012

Out of 10000 random records, only 0.12% of them show frequencies at that level.

We have also been examining exponential fits.

Do the residuals of the exponential fit show the same results?

Indeed, we don’t even have to bother with the residuals. We can extract this signal directly from GISTEMP.

All three surface records show the same signal, although it is weakest in GISTEMP.

# find the freq of the peak max(sp$spec) # [1] 0.1718493 GIS # [1] 0.2979760 CRU # [1] 0.2647387 NOA freq <- 1/sp$freq[sp$spec==max(sp$spec)] freq # [1] 67.5 GIS # [1] 67.5 CRU # [1] 67.5 NOA

The script can be found here

Update. I decided to run the data just for 1900-2000. The spectrum analysis for this restricted data range finds the primary period to be 54 years in all three datasets.