Home > GIStemp, LSCF, Statistics > Lines, Sines, and Curve Fittings 13 – Fourier

Lines, Sines, and Curve Fittings 13 – Fourier

2011 February 13

To date, I have used two methods to examine the apparent sine wave in the annualized surface temperature date. I looked for a best fit to the residuals of detrended (linear, exponential) data. I have also just a priori added a sine component to the best fits of the data (manual fits, nls). Also, while working through this, I have at times withheld some of the data to check the fits a posteriori. This has yielded a range of possible periodicities:

56: lines-sines-and-curve-fitting-1-oh-my
50: lines-sines-and-curve-fitting-2-r
50 (second 140): lines-sines-and-curve-fitting-3-double-down
61,71 (second 141): lines-sines-and-curve-fittings-4-walk-and-chew-gum
50,56,61: lines-sines-and-curve-fitting-5-a-growth
58,59,59: lines-sines-and-curve-fitting-11-more-extrapolation

Its time to bring a new tool to the problem: Fast Fourier Transform. Specifically, we will use the spectrum function in R which is wrapper for spec.pgram. The latter is “calculates the periodogram using a fast Fourier transform, and optionally smooths the result with a series of modified Daniell smoothers (moving averages giving half weight to the end values).”

Recall the GISTEMP annual temperature anomaly with a linear trend.

We use spectrum to examine the residuals.

I have two questions. Can we trust the values this close to the endpoints? And is the spectrum displayed significant? I’m not sure how to approach the first question. But the second might be answered by brute force.

We check the residuals for normality. Where ‘y’ is the annualized GISTEMP data and ‘y1’ is the linear trend.

# test the residuals of the linear trend
res1 <- y-y1
# W = 0.994, p-value = 0.8588

Since its normal, we create 10000 records of length of 131 data points and examine their spectrum for frequencies with a value as strong as that displayed in the residuals.

testsp <- function(x) {
    tau = 0.17 # threshold
    sp = tau)

# this takes less than a minute
mc <- matrix(NA,ncol=10000,nrow=length(res1))
mc <- apply(mc,2,function(x){rnorm(length(x),sd=sd(res1))})
mct <- apply(mc,2,testsp)
sum(mct)/length(mct) # [1] 0.0012

Out of 10000 random records, only 0.12% of them show frequencies at that level.

We have also been examining exponential fits.

Do the residuals of the exponential fit show the same results?

Indeed, we don’t even have to bother with the residuals. We can extract this signal directly from GISTEMP.

All three surface records show the same signal, although it is weakest in GISTEMP.

# find the freq of the peak
# [1] 0.1718493 GIS
# [1] 0.2979760 CRU
# [1] 0.2647387 NOA

freq <- 1/sp$freq[sp$spec==max(sp$spec)]
# [1] 67.5 GIS
# [1] 67.5 CRU
# [1] 67.5 NOA

The script can be found here

Update. I decided to run the data just for 1900-2000. The spectrum analysis for this restricted data range finds the primary period to be 54 years in all three datasets.