Lines, Sines, and Curve Fitting 7 – normal
Zeke Hausfather has tendered a challenge to Joe Bastardi regarding future warming -v- cooling. It is a bet similar to the “Did Global Warming Stop …” series I ran through earlier this month.
Zeke describes a portion of the bet as follows:
The graph below shows the trend in annual (Jan-December) temperatures from 1970 to 2010, with two standard deviations of the detrended residuals around the trend to show expected confidence intervals of variability. This means that on average, we would expect only 2.5% of observations to exceed the red upper dotted line and 2.5% of observations to fall below the lower dotted in any given year. The linear trend and confidence intervals for the 1970 to 2010 data are extended up to 2030 to provide a testable projection.
I’ve been meaning to test this. Is the distribution of the detrended global temperature anomalies ‘normal’? Which is to say, do the residuals around the OLS trend follow a Gaussian distribution?
Worried about the counter-trend during 1940-1970, I’ll first just test 1970-2010 using the annualized data for GISTEMP.
Hmmm. That does not look so good. However, there are hopeful signs: 66% of the residuals are within 1 sd of the mean. But there are 0 points out in the 2+ sd tails.
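For anyone who wants to reproduce this kind of check, here is a minimal sketch of the procedure: fit an OLS line, take the residuals, and count how many fall within 1 and 2 sd. The function names are mine, and the data is a synthetic stand-in (an assumed trend of 0.017 per year plus normal noise), not the GISTEMP series, so the printed percentages will not match the ones quoted above.

```python
import random
import statistics

def detrended_residuals(years, temps):
    """Return the residuals of temps around their OLS linear trend in years."""
    mean_x = statistics.fmean(years)
    mean_y = statistics.fmean(temps)
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, temps))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [y - (intercept + slope * x) for x, y in zip(years, temps)]

def fraction_within(residuals, k):
    """Fraction of residuals within k sample standard deviations of their mean."""
    mean = statistics.fmean(residuals)
    sd = statistics.stdev(residuals)
    return sum(abs(r - mean) <= k * sd for r in residuals) / len(residuals)

# Synthetic stand-in for annual anomalies (NOT real GISTEMP data).
# The 0.017 slope is an arbitrary assumption; 0.097 is the sd quoted in the post.
rng = random.Random(0)
years = list(range(1970, 2011))
temps = [0.017 * (y - 1970) + rng.gauss(0.0, 0.097) for y in years]

res = detrended_residuals(years, temps)
print(f"within 1 sd: {fraction_within(res, 1):.1%}")
print(f"within 2 sd: {fraction_within(res, 2):.1%}")
```

To run it on real data, replace `years` and `temps` with the annual anomaly series of your choice.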
Just as a reminder, here is a normal distribution for 40 random points with a standard deviation of 0.097 (the same as the 40 year GISTEMP sd).
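How surprising is it to see no points at all beyond 2 sd in a sample of 40? A normal distribution puts about 4.6% of its mass out there, so we would expect roughly two such points per 40, and if the mean and sd were known exactly the chance of seeing none would be about 0.9545^40, or roughly 15%. The sketch below checks this by simulation, using the sample mean and sample sd the way the residual counts do. The function names, seed, and trial count are my own choices for illustration.

```python
import random
import statistics

def count_beyond(sample, k):
    """Count points more than k sample standard deviations from the sample mean."""
    mean = statistics.fmean(sample)
    sd = statistics.stdev(sample)
    return sum(abs(x - mean) > k * sd for x in sample)

def share_with_empty_tails(n, sd, trials, seed=0):
    """Fraction of simulated normal samples of size n with no point beyond 2 sd."""
    rng = random.Random(seed)
    empty = 0
    for _ in range(trials):
        sample = [rng.gauss(0.0, sd) for _ in range(n)]
        if count_beyond(sample, 2) == 0:
            empty += 1
    return empty / trials

# 40 points with sd 0.097, matching the random comparison described above.
print(share_with_empty_tails(n=40, sd=0.097, trials=2000))
```

A result well above zero would mean that empty 2 sd tails in 40 points are not, by themselves, evidence against normality.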
Frankly, the random normal data doesn’t look much better than the anomaly data. Just not enough points. So let’s go ahead and get more points by using the full 130 years. This does broaden the standard deviation to 0.127.
A bit better but still pretty ragged, although the anomaly data looks more normal than the random normal data. Here again, we look at 130 random points generated from a normal distribution for comparison.
Still not very convincing – for either of them! But the anomaly data has 67% of the points within +/- 1 sd of the mean. And 96.2% of the points are within +/- 2 sd. The normal distribution numbers are improving slightly. We can increase the number of data points available again by using monthly data instead of annual data.
Definitely better. Now 69% of the points are within +/- 1 sd of the mean. And 94.9% of the points are within +/- 2 sd. The curve is smoother. Yet, we can increase the number of data points still one more time by using the whole data range.
Within +/- 1 sd of the mean, there are 68% of the points. And 95.2% of the points are within +/- 2 sd.
It seems that the temperature anomalies are reasonably close to a normal distribution about the mean. Gonna have to do better than “seems” and “reasonably”, though.
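One way to move past “seems” is a formal normality test. The sketch below uses the Jarque-Bera statistic, which combines sample skewness and excess kurtosis and is compared against a chi-square distribution with 2 degrees of freedom. This is my choice for illustration only, not necessarily the test this series goes on to use, and the p-value is an asymptotic approximation that is known to be rough for small samples such as 40 annual points. The sample data is synthetic.

```python
import math
import random
import statistics

def jarque_bera(values):
    """Return (JB statistic, asymptotic p-value) for a list of numbers."""
    n = len(values)
    mean = statistics.fmean(values)
    # Central moments using the 1/n convention.
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    skew = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0
    jb = n / 6.0 * (skew ** 2 + excess_kurtosis ** 2 / 4.0)
    # Survival function of a chi-square distribution with 2 df is exp(-x / 2).
    p_value = math.exp(-jb / 2.0)
    return jb, p_value

# Synthetic residuals (NOT real anomaly data), sd 0.127 as quoted for 130 years.
rng = random.Random(0)
sample = [rng.gauss(0.0, 0.127) for _ in range(130)]
jb, p = jarque_bera(sample)
print(f"JB = {jb:.3f}, p = {p:.3f}")
```

A small p-value would argue against normality; a large one only says the test found no evidence against it.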