Lines, Sines, and Curve Fitting 8 – D'Agostino

2011 January 16

The eyeball and quick sigma population checks in the previous post provided some confidence that the global temperature anomalies are normally distributed about the mean. But there are more formal tests, including the D’Agostino normality test.

From Wikipedia:

In statistics, D’Agostino’s K2 test is a goodness-of-fit measure of departure from normality, that is the test aims to establish whether or not the given sample comes from a normally distributed population. The test is based on transformations of the sample kurtosis and skewness, and has power only against the alternatives that the distribution is skewed and/or kurtic.

http://en.wikipedia.org/wiki/D’Agostino’s_K-squared_test

D’Agostino tests the skew and kurtosis of a distribution. Failing the test indicates that the distribution is skewed or kurtic to the point that it is not normal. Passing the test is not proof positive that the distribution is in fact normal.
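To make the mechanics concrete, here is a minimal base-R sketch of the K2 statistic, assuming the standard published transforms (D'Agostino 1970 for skewness, Anscombe and Glynn 1983 for kurtosis). The helper name dago_k2 is hypothetical and is not part of any package. It should give values comparable to a packaged implementation, but it has not been checked against one here.

```r
# Base-R sketch of the D'Agostino K^2 omnibus test (no extra packages).
# dago_k2 is a hypothetical helper name, not a function from fBasics.
dago_k2 <- function(x) {
  x <- x[!is.na(x)]
  n <- length(x)
  stopifnot(n >= 20)  # the normal approximations are poor for tiny samples

  # Central sample moments (divide by n, not n - 1)
  d  <- x - mean(x)
  m2 <- mean(d^2)
  g1 <- mean(d^3) / m2^1.5   # sample skewness (0 for a normal)
  b2 <- mean(d^4) / m2^2     # sample kurtosis (3 for a normal)

  # Skewness transform (D'Agostino 1970) -> Z3
  y     <- g1 * sqrt((n + 1) * (n + 3) / (6 * (n - 2)))
  beta2 <- 3 * (n^2 + 27 * n - 70) * (n + 1) * (n + 3) /
           ((n - 2) * (n + 5) * (n + 7) * (n + 9))
  w2    <- -1 + sqrt(2 * (beta2 - 1))
  delta <- 1 / sqrt(0.5 * log(w2))
  alpha <- sqrt(2 / (w2 - 1))
  z3    <- delta * asinh(y / alpha)

  # Kurtosis transform (Anscombe & Glynn 1983) -> Z4
  e_b2 <- 3 * (n - 1) / (n + 1)
  v_b2 <- 24 * n * (n - 2) * (n - 3) / ((n + 1)^2 * (n + 3) * (n + 5))
  xs   <- (b2 - e_b2) / sqrt(v_b2)
  sb1  <- 6 * (n^2 - 5 * n + 2) / ((n + 7) * (n + 9)) *
          sqrt(6 * (n + 3) * (n + 5) / (n * (n - 2) * (n - 3)))
  a    <- 6 + 8 / sb1 * (2 / sb1 + sqrt(1 + 4 / sb1^2))
  tm   <- (1 - 2 / a) / (1 + xs * sqrt(2 / (a - 4)))
  cube <- sign(tm) * abs(tm)^(1 / 3)   # real cube root, safe for negatives
  z4   <- ((1 - 2 / (9 * a)) - cube) / sqrt(2 / (9 * a))

  # Omnibus statistic: chi-square with 2 degrees of freedom under normality
  k2 <- z3^2 + z4^2
  list(chi2 = k2, z3 = z3, z4 = z4,
       p_omnibus  = pchisq(k2, df = 2, lower.tail = FALSE),
       p_skewness = 2 * pnorm(-abs(z3)),
       p_kurtosis = 2 * pnorm(-abs(z4)))
}

set.seed(1)
str(dago_k2(rnorm(500)))
```

The returned list mirrors the layout of the test output shown in this post: an omnibus chi-square plus separate z-scores and p-values for skewness and kurtosis.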

We’ll first take a look at three intentionally distorted distributions: a high kurtosis, a low kurtosis, and a skewed distribution. I had trouble skewing the distribution without also triggering kurtic indicators in the D’Agostino test. The results of each D’Agostino test follow the displayed distribution.
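The post does not say how rn5, rn6, and rn7 were generated, so the following is only a sketch of one way to build comparable test cases in base R. The variable names and the choice of distributions are assumptions for illustration. It also hints at why skew is hard to isolate: the exponential distribution has a theoretical skewness of 2 but also an excess kurtosis of 6.

```r
# Hypothetical stand-ins; the post does not say how rn5, rn6, rn7 were built.
set.seed(42)
n <- 5000
heavy  <- rt(n, df = 5)      # heavy tails: high kurtosis
flat   <- runif(n, -1, 1)    # bounded, no tails: low kurtosis
skewed <- rexp(n) - 1        # long right tail: positive skew (and heavy-tailed)

# Sample skewness and excess kurtosis from central moments
skew   <- function(x) { d <- x - mean(x); mean(d^3) / mean(d^2)^1.5 }
exkurt <- function(x) { d <- x - mean(x); mean(d^4) / mean(d^2)^2 - 3 }

# One column per sample, rows are the two shape measures
round(sapply(list(heavy = heavy, flat = flat, skewed = skewed),
             function(x) c(skewness = skew(x), excess_kurtosis = exkurt(x))), 2)
```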

The D’Agostino test, dagoTest, is included in the fBasics package from rmetrics.org.

dagoTest(rn5)

Title:
 D'Agostino Normality Test

Test Results:
  STATISTIC:
    Chi2 | Omnibus: 43.6439
    Z3  | Skewness: -0.5324
    Z4  | Kurtosis: 6.5849
  P VALUE:
    Omnibus  Test: 3.333e-10
    Skewness Test: 0.5945
    Kurtosis Test: 4.553e-11

dagoTest(rn6)

Title:
 D'Agostino Normality Test

Test Results:
  STATISTIC:
    Chi2 | Omnibus: 34.441
    Z3  | Skewness: -0.3546
    Z4  | Kurtosis: 5.8579
  P VALUE:
    Omnibus  Test: 3.321e-08
    Skewness Test: 0.7229
    Kurtosis Test: 4.687e-09

dagoTest(rn7)

Title:
 D'Agostino Normality Test

Test Results:
  STATISTIC:
    Chi2 | Omnibus: 41.7927
    Z3  | Skewness: -5.8956
    Z4  | Kurtosis: 2.6523
  P VALUE:
    Omnibus  Test: 8.41e-10
    Skewness Test: 3.734e-09
    Kurtosis Test: 0.007994

A very low p-value indicates that the distribution fails the corresponding test: Omnibus (the overall D’Agostino test for ‘could be normal’) or its subcomponents for Skewness and Kurtosis.
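The statistics and p-values in these outputs are tied together by simple arithmetic, which can be checked in base R. The sketch below copies Z3 and Z4 from the dagoTest(rn5) output and assumes the usual construction: the omnibus statistic is the sum of the two squared z-scores, referred to a chi-square distribution with 2 degrees of freedom, and each component uses a two-sided normal p-value. The results agree with the printed values to the precision shown.

```r
# Statistics copied from the dagoTest(rn5) output
z3 <- -0.5324
z4 <-  6.5849

chi2 <- z3^2 + z4^2   # omnibus statistic: sum of the two squared z-scores
p_omnibus  <- pchisq(chi2, df = 2, lower.tail = FALSE)  # chi-square, 2 df
p_skewness <- 2 * pnorm(-abs(z3))                       # two-sided normal
p_kurtosis <- 2 * pnorm(-abs(z4))

signif(c(chi2 = chi2, omnibus = p_omnibus,
         skewness = p_skewness, kurtosis = p_kurtosis), 4)
```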

Now we can look at the four distributions from the previous post. Each set contains more residuals about the mean than the one before. The test results follow in the same order as the list.

1) GISTEMP annual 1970-2010
2) GISTEMP annual 1880-2010
3) GISTEMP monthly 1970-2010
4) GISTEMP monthly 1880-2010

Test Results:
  STATISTIC:
    Chi2 | Omnibus: 3.3396
    Z3  | Skewness: -0.4131
    Z4  | Kurtosis: -1.7801
  P VALUE:
    Omnibus  Test: 0.1883
    Skewness Test: 0.6795
    Kurtosis Test: 0.07505

Test Results:
  STATISTIC:
    Chi2 | Omnibus: 0.6071
    Z3  | Skewness: 0.0594
    Z4  | Kurtosis: -0.7769
  P VALUE:
    Omnibus  Test: 0.7382
    Skewness Test: 0.9526
    Kurtosis Test: 0.4372

Test Results:
  STATISTIC:
    Chi2 | Omnibus: 0.0225
    Z3  | Skewness: -0.1197
    Z4  | Kurtosis: 0.0904
  P VALUE:
    Omnibus  Test: 0.9888
    Skewness Test: 0.9047
    Kurtosis Test: 0.928

Test Results:
  STATISTIC:
    Chi2 | Omnibus: 0.3223
    Z3  | Skewness: 0.032
    Z4  | Kurtosis: -0.5668
  P VALUE:
    Omnibus  Test: 0.8512
    Skewness Test: 0.9745
    Kurtosis Test: 0.5709

The test results do not uniformly improve with increasing data points. The monthly 1880-2010 set of residuals is not an improvement over the monthly 1970-2010 set. This is probably an indication that the 1940-1970 cooling trend is throwing off the distribution around a linear mean.

On the other hand, the 1880-2010 yearly set is a definite improvement over the 1970-2010 yearly set, probably due to the increase in the number of points available.
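For anyone wanting to repeat this kind of check, here is a rough sketch of the workflow: fit a linear trend, take the residuals, and test them. The series below is synthetic and is not GISTEMP data; the names year and anom and the trend and noise values are assumptions for illustration only. The dagoTest call runs only if fBasics is installed.

```r
# Synthetic stand-in for an annual anomaly series; NOT GISTEMP data.
set.seed(7)
year <- 1880:2010
anom <- 0.006 * (year - 1880) + rnorm(length(year), sd = 0.1)

fit <- lm(anom ~ year)   # linear trend used as the "mean"
res <- resid(fit)        # residuals about that trend

# Run the D'Agostino test if fBasics is installed; otherwise just summarize.
if (requireNamespace("fBasics", quietly = TRUE)) {
  print(fBasics::dagoTest(res))
} else {
  print(summary(res))
}
```

Swapping in a real anomaly series for anom, and restricting year to a sub-period, would give the four cases compared in this post.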
