
CRUTEM: Replication -v- Reproduction

2010 July 5

After my initial work on CRUTEM, I left it behind as I turned to work on GHCN metadata and then the GSOD data set. Others, better qualified, turned to working on the station anomalization and gridding methodologies. So it was with some surprise that I bumped into a comment whose author was bemoaning the lack of “replication” of CRUTEM.

This thread is an open invitation to those concerned with the validity of CRUTEM’s methodology to describe what it is they would like to see along the lines of “reproduction.” I am not interested in obtaining Dr Jones’s code. I am interested in what is thought to be required for an independent verification.


1953: On the causes of instrumentally observed secular temperature trends

1982: Variations in surface air temperatures: Part 1. Northern hemisphere, 1881–1980

1985: A grid point surface air temperature data set for the Northern Hemisphere

1986: Grid point surface air temperature data set for the Southern Hemisphere

1988: Hemispheric surface air temperature variations: Recent trends and an update to 1987

1992: Global surface air temperature variations during the twentieth century: Part 1, spatial, temporal and seasonal details

1993: Global surface air temperature variations during the twentieth century: Part 2, implications for large-scale high-frequency palaeoclimatic studies

1994: Hemispheric surface air temperature variations: a reanalysis and an update to 1993

1997: Estimating Sampling Errors in Large-Scale Temperature Averages

1999: Surface air temperature and its variations over the last 150 years

2001: Adjusting for sampling density in grid box land and ocean surface temperature time series

2003: Hemispheric and Large-Scale Surface Air Temperature Variations: An Extensive Revision and an Update to 2001

2006: Uncertainty estimates in regional and global observed temperature changes: a new dataset from 1850

  1. carrot eater
    2010 July 5 at 8:56 am

    Re-doing exactly what GISS does is easy. Re-doing exactly what NCDC does will presumably be easy once GHCN v3 is out.

    Re-doing exactly what CRU does is not easy, because they are much less uniform in how they go about things. But does it matter? People making reasonable choices on methodology get results that confirm all of them: CRU, NCDC, GISS.

  2. 2010 July 5 at 9:11 am

    Does validating CRU matter? yes
    Does reconstructing their methodology “line of code by line of code” matter? no

    As far as I’m concerned, CRU’s general results are validated by similar results from other institutions (GISS, NCDC, JMA) and by the 6-or-so technical blogger reconstructions.

    Yet, I think it will be a good exercise to see how much detail has been published and how much is left to be derived. I expect that quite a bit falls into the ‘derived’ category (which I don’t see as necessarily a problem, as long as it can be worked through).

    Remember, this is a ‘learn as I go’ blog. I don’t come with tools to apply to the problem. I just come with the confidence those tools exist and can be learned by numerate, algorithmically minded folk (which, I suspect, is why so many IT guys insert themselves into the debate 😉 )

  3. 2010 July 6 at 12:52 pm

    Combining GHCN v2.mean_adj and HadSST2 gets you close enough. Perfect reproduction is overrated 😛

  4. carrot eater
    2010 July 6 at 1:11 pm

    That said, I personally prefer GISS and NCDC over CRU, because I know exactly what they’ve done to every single station. GISS is particularly straightforward. NCDC is less simple, but still systematic.

    That they’re all basically consistent with each other raises the confidence. If GISS and NCDC didn’t exist, and if v2.mean weren’t publicly available, then I would be more concerned about CRU. As I suppose everybody would be.

  5. steven Mosher
    2010 July 9 at 12:13 am

    Ah well,

    I’d still like to see the code. I do recall a bunch of people accepting Nick’s work. Then there was
    the small matter of an error in his code. It was minor. And he fixed it. The issue is this.

    1. there is no reason not to share it.
    2. building on their code is probably more productive than rewriting from scratch.
    3. It builds confidence that nothing is being hidden.
    4. It’s much easier to clarify a point made by the paper by looking at the code. Especially with
    GISTEMP there are things that are NOT covered in the paper ( see ushcn integration)
    and the handling of duplicates.
    5. It’s the actual science, rather than a description of the science.

  6. carrot eater
    2010 July 9 at 1:24 am

    I don’t think the code you imagine entirely exists at CRU. I think they’ve released some utilities, like the gridder. But I don’t think they have a systematic setup like GISS where a single program reads in a small number of text files of data, and automatically does all their processing – adjust, combine, grid. It appears to me that CRU does not apply the same homogenisation technique to every station; some of the homogenisation is done by hand. Or at least, it was, at some point. Some of the homogenisation may be done by the providing country.

    “2. building on their code is probably more productive than rewriting from scratch.”

    If it can be done in a couple days, then I disagree.

    But yes, having somebody else look through your code raises the chance that the small errors are found.

  7. carrot eater
    2010 July 9 at 1:33 am

    and 4 “Especially with
    GISTEMP there are things that are NOT covered in the paper ( see ushcn integration)”

    This has changed over time; they don’t publish a new paper each time. That’s what the update page on the website is for.

  8. 2010 July 9 at 5:44 am

    I have become completely uninterested in their ‘anomalization and gridding’ code. The process is no longer a mystery to me, thanks to technical bloggers. Having GISTEMP code never really reduced the mystery. 🙂

    Now, it’s all about the ‘homogenization.’

  9. carrot eater
    2010 July 9 at 6:27 am

    Right, and that’s what I’m getting at. The anomalies and gridding, that’s relatively easy, and a quick description is all you need to understand what they did. But that code is there, if you want it.

    Homogenisation is another story.

    GISS is conceptually easy, to the point of being too simple, though it takes a bit of programming.

    NOAA.. I still haven’t worked on Menne 2009. Glancing at it, I bet it’s described well enough to implement for yourself, without peeking.

    CRU.. all over the place, I think. The step where “Jones sits down and applies his professional judgment, in regards to a station move” (and I think there is some of that sort of thing in there, if I’m not mistaken) can’t be captured by any computer code. But then for some other station, they might do something like what NOAA did for GHCN v2.0. I don’t know, I’m really not very familiar with them; this is just what I got from looking over the Brohan and Jones papers.
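
As a rough illustration of the “relatively easy” part discussed in this thread, here is a minimal Python sketch of station anomalization followed by gridding. It is not CRU’s code, nor anyone else’s. The function names, the 1961–1990 base period, the 15-year minimum, the 5-degree boxes and the cosine-of-latitude weighting are all assumptions chosen for illustration, and homogenisation is left out entirely.

```python
import math
from collections import defaultdict


def station_anomalies(series, base_start=1961, base_end=1990, min_years=15):
    """Convert one station's monthly means to anomalies.

    series maps (year, month) -> temperature. Each calendar month gets its
    own normal, the mean over the base period. Calendar months with fewer
    than min_years base-period values are dropped (an assumed rule).
    """
    by_month = defaultdict(list)
    for (year, month), temp in series.items():
        if base_start <= year <= base_end:
            by_month[month].append(temp)
    normals = {m: sum(v) / len(v) for m, v in by_month.items()
               if len(v) >= min_years}
    return {(y, m): t - normals[m]
            for (y, m), t in series.items() if m in normals}


def grid_cell(lat, lon, size=5.0):
    """Return (row, col) of the size-degree box containing (lat, lon)."""
    rows = int(180 / size)
    row = min(int((lat + 90.0) // size), rows - 1)  # keep lat = 90 in top row
    col = int(((lon + 180.0) % 360.0) // size)
    return row, col


def gridded_mean(stations, year, month, size=5.0):
    """Area-weighted mean anomaly for one month.

    stations is a list of (lat, lon, anomalies). Station anomalies are
    averaged within each box, then boxes are averaged with weights
    proportional to the cosine of the box-centre latitude.
    Returns None if no station reports that month.
    """
    boxes = defaultdict(list)
    for lat, lon, anomalies in stations:
        if (year, month) in anomalies:
            boxes[grid_cell(lat, lon, size)].append(anomalies[(year, month)])
    weighted_sum = weight_total = 0.0
    for (row, _col), values in boxes.items():
        centre_lat = -90.0 + (row + 0.5) * size
        weight = math.cos(math.radians(centre_lat))
        weighted_sum += weight * sum(values) / len(values)
        weight_total += weight
    return weighted_sum / weight_total if weight_total else None
```

Feeding each station’s (year, month) series through `station_anomalies` and then passing the results, with coordinates, to `gridded_mean` gives one number per month. Working in anomalies before averaging is what lets stations with different absolute climates share a box; the hard part, as the thread says, is the homogenisation that this sketch skips.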
