Charles Pierce: Methods of Monthly Means

2010 February 7


A post on WUWT about an early 19th-century temperature record got me wondering about the difference between modern methods of deriving a monthly average and Pierce’s method.

The post was based on material available in two different sources: Pierce’s A meteorological account of the weather in Philadelphia: from January 1 1790 to January 1 1847 (pages 156-157) and a brief reprint of the data in The American Quarterly Register.

Pierce’s observations were described as follows: “The record of each day was made at or before sunrise, and at two, and ten o’clock, P.M.” I was curious how this might differ from modern temperature records, which are usually taken as the average of the daily maximum and minimum temperatures.


Data Sources: To test the differences, I pulled down station data for four Pennsylvania weather stations for January 2010: KPABETHL10, KPABLOOM2, KPACHEST4, and KPAEASTO7. Examining the station records, it appeared that KPAEASTO7 had several days of missing data, so I discarded it.

Since I’m not sure that Weather Underground approves of data scraping software, I’ll just point to a sample record for KPABLOOM2 on Jan 1, 2010.

Pierce’s Method: To emulate Pierce’s method, I collected the first observation of each day after 7am, 2pm, and 10pm – but not extending into the next hour. If no observation was available within a particular hour on a given day, it was left null, and null observations were simply excluded from the average.

Tave = SUM[foreachDay(T.07 + T.14 + T.22)]/(num_of_non_null_obs)

The code for this is as follows:


# init
rm -f t.txt
rm -f t2.txt
t1=0
cnt=0

# read the daily files, any order
for i in `ls -b *.csv`; do
    t07=`grep " 07:..:" $i | head -1 | cut -d"," -f2`
    t14=`grep " 14:..:" $i | head -1 | cut -d"," -f2`
    t22=`grep " 22:..:" $i | head -1 | cut -d"," -f2`
    echo $i $t07 $t14 $t22
    echo $t07 >> t.txt
    echo $t14 >> t.txt
    echo $t22 >> t.txt
done

# remove nulls (missing obs are flagged with 999)
grep -v 999 t.txt > t2.txt

# remove blank lines
sed -i '/^$/d' t2.txt

# sum the obs, count the obs
for i in `cat t2.txt`; do
    t1=`echo "scale=2; $t1 + $i" | bc`
    cnt=$(($cnt + 1))
done

# calc the simple ave
echo "scale=2; $t1/$cnt" | bc

The Max,Min Method: Then I went over the same records with the Tmax,Tmin method, skipping any days with nulls. I believe this is the method used to generate the GHCN monthly means found in v2.mean.

Tave = SUM[foreachDay(T.max + T.min)]/(2*num_of_days)



# init
t1=0
cnt=0

# read the files in any order
for i in `ls -b *.csv`; do
    # find tmax (skip the header line, "--" entries, and null flags)
    tmax=`cut -d"," -f2 $i | grep -v "Temperature" | grep -v "\-\-" | grep -v 999.9 | sort -n -r | head -1`

    # find tmin
    tmin=`cut -d"," -f2 $i | grep -v "Temperature" | grep -v "\-\-" | grep -v 999.9 | sort -n | head -1`

    # exclude days with nulls
    if [ "$tmax" != "-999.9" ] && [ "$tmin" != "-999.9" ]; then
        t1=`echo "scale=2; $t1 + $tmax + $tmin" | bc`
        cnt=$(($cnt + 1))
        echo $i $cnt $tmax $tmin
    fi
done

# calc the ave
echo "scale=2; $t1 / (2 * $cnt)" | bc


SUM(foreachDay(T.max+T.min))/(2*num of days) => 33.3
SUM(T.obs)/(num of obs) => 33.3

SUM(foreachDay(T.max+T.min))/(2*num of days) => 30.2
SUM(T.obs)/(num of obs) => 30.2

SUM(foreachDay(T.max+T.min))/(2*num of days) => 28.3
SUM(T.obs)/(num of obs) => 28.1

It appears that the T7,14,22 and Tmax,Tmin methods produce substantially the same monthly average temperature. KPABLOOM2 had more null values than the other two stations, which likely accounts for the slight difference between the two methods at that station.

  1. carrot eater
    2010 February 14 at 11:38 am

    Ron, you were asking about CLIMAT. I’ve been looking at the coded reports at this website.


    Do a query, and look at the bottom of the page.

    This document explains the code.


  2. carrot eater
    2010 February 14 at 12:38 pm

    In case I get moderated out at WUWT, I added this about DFW:

    “OK, I bet we can figure out what happened to DFW if we work together a bit.

    From the CLIMAT summary, I get:

    Monthly mean temp: -6.2 deg C with a standard deviation of 5.2 deg C
    Max temp: -2.2 deg C
    Min temp: -10.3 deg C

    And zero days of missing data for temperature.

    On 29 days, the min temp was below 0 C.
    On 21 days, the max temp was below 0 C.

    This latter bit looks totally wrong. Either something got messed up at DFW when coding this in, or another station filed their report using DFW’s station code by accident.”

  3. carrot eater
    2010 February 14 at 1:07 pm

    and one more chunk of info; just copying from my WUWT post in case it doesn’t make it through moderation:

    And some more info from the DFW CLIMAT report

    The day with the highest mean temperature: +2.5 C, on Jan 7 and also at least one other day

    The day with the lowest mean temperature: -16.1 C on Jan 1

    The maximum maximum temperature: + 4.4 C on Jan 7
    The minimum minimum temperature: -23.8 C on Jan 2

    This clearly isn’t right, so the NOAA QC was perfectly correct in tossing it out. Maybe somebody at DFW is very confused, or this is the report from some station in Siberia, with somehow the wrong station ID number.

  4. 2010 February 15 at 3:35 pm

    Doh! Thanks for that link!
    I’ve used that site before for SYNOP. 🙂

  5. carrot eater
    2010 February 15 at 9:59 pm

    In principle, a worthwhile project would be to gather up SYNOPs for the parts of the earth that are a bit bare in the GHCN (Africa), and see if they can be used to help fill in the gaps. It’d be a good check on whether GISS’s interpolating is doing a decent job of filling the gaps, or not.

    That said, I could imagine doing this sort-of-manually for one location, but I don’t know how easily one could build a complete database of monthly means from SYNOP reports. Doable?
