Charles Pierce: Methods of Monthly Means
A post on WUWT about an early 19th-century temperature record got me wondering about the difference between modern methods of deriving a monthly average and Pierce’s method.
The post was based on material available in two sources: Pierce’s A meteorological account of the weather in Philadelphia: from January 1 1790 to January 1 1847 (pages 156-157) and a brief reprint of the data in The American Quarterly Register.
Pierce’s observations were described as follows: “The record of each day was made at or before sunrise, and at two. and ten o’clock, P.M.” I was curious as to how this might differ from modern temperature records, in which the daily mean is usually taken as the average of the daily maximum and minimum temperatures.
Data Sources: To test the differences, I pulled down station data for four Pennsylvania weather stations for the month of January 2010: KPABETHL10, KPABLOOM2, KPACHEST4, and KPAEASTO7. On examining the station records, I found that KPAEASTO7 had several days of missing data, so I discarded it.
Since I’m not sure that Weather Underground approves of data scraping software, I’ll just point to a sample record for KPABLOOM2 on Jan 1, 2010.
Pierce’s Method: To emulate Pierce’s method, I collected the first observation each day within the 7am, 2pm, and 10pm hours (at or after the hour, but not extending into the next hour). If no data was available within a particular hour on a given day, the observation was left null and simply excluded from the average.
Tave = SUM[foreachDay(T.07 + T.14 + T.22)]/(num_of_non_null_obs)
The code for this is as follows:
# start with an empty observation file
: > t.txt
# read the daily files, any order
for i in *.csv
do
    # first observation within each of the 07, 14, and 22 hours
    t07=`grep " 07:..:" "$i" | head -1 | cut -d"," -f2`
    t14=`grep " 14:..:" "$i" | head -1 | cut -d"," -f2`
    t22=`grep " 22:..:" "$i" | head -1 | cut -d"," -f2`
    echo $i $t07 $t14 $t22
    echo $t07 >> t.txt
    echo $t14 >> t.txt
    echo $t22 >> t.txt
done
# remove nulls
grep -v 999 t.txt > t2.txt
# remove blank lines
sed -i '/^$/d' t2.txt
# sum the obs, sum the count
t1=0
cnt=0
for i in `cat t2.txt`
do
    t1=`echo "scale=2; $t1 + $i" | bc`
    cnt=$(($cnt + 1))
done
# calc the simple ave
echo "scale=2; $t1/$cnt" | bc
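As a cross-check on the arithmetic, the same selection and averaging can be sketched with awk in place of grep, cut, and bc. The function names first_obs_in_hour and mean_non_null, the two-column time,temperature layout, and the sample readings are assumptions made for illustration; they are not taken from the station files. Note that printf rounds where bc truncates, so the last digit may differ from the script above.

```shell
# Hypothetical helper: print the temperature (field 2) of the first row
# on stdin whose timestamp falls within the given two-digit hour.
first_obs_in_hour() {
    awk -F, -v hh="$1" '
        !found && $0 ~ (" " hh ":[0-9][0-9]:") { print $2; found = 1 }
    '
}

# Hypothetical helper: mean of the non-null values on stdin, one per line.
# Blank lines and values at or below -999 are treated as nulls (assumption).
mean_non_null() {
    awk '
        $1 == ""       { next }   # missing observation
        $1 + 0 <= -999 { next }   # null sentinel such as -999.9
        { sum += $1; n++ }
        END { if (n > 0) printf "%.2f\n", sum / n; else print "NA" }
    '
}

# One made-up day with no reading in the 22:00 hour.
day='Time,TemperatureF
2010-01-01 06:58:00,27.0
2010-01-01 07:02:00,28.0
2010-01-01 07:32:00,29.0
2010-01-01 14:05:00,36.0'

# Picks 28.0 (07 hour) and 36.0 (14 hour); the 22 hour contributes nothing.
for hh in 07 14 22
do
    printf '%s\n' "$day" | first_obs_in_hour "$hh"
done | mean_non_null
```

Because the missing 22:00 reading is simply dropped, this day is averaged over two observations rather than three, which is the behaviour described for Pierce’s method above.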
The Max,Min Method: Then I went over the same records with the Tmax,Tmin method, skipping any days for which the maximum or minimum was null. I believe that this is the method used to generate the GHCN monthly means found in v2.mean.
Tave = SUM[foreachDay(T.max + T.min)]/(2*num_of_days)
# read the files in any order
t1=0
cnt=0
for i in *.csv
do
    # find tmax
    tmax=`cut -d"," -f2 "$i" | grep -v "Temperature" | grep -v -e "--" | grep -v 999.9 | sort -n -r | head -1`
    # find tmin
    tmin=`cut -d"," -f2 "$i" | grep -v "Temperature" | grep -v -e "--" | grep -v 999.9 | sort -n | head -1`
    # exclude nulls (and days with no usable readings)
    if [ -n "$tmax" ] && [ "$tmax" != "-999.9" ]
    then
        if [ -n "$tmin" ] && [ "$tmin" != "-999.9" ]
        then
            t1=`echo "scale=2; $t1 + $tmax + $tmin" | bc`
            cnt=$(($cnt + 1))
            echo $i $cnt $tmax $tmin
        fi
    fi
done
# calc the ave
echo "scale=2; $t1 / (2 * $cnt)" | bc
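The Tmax,Tmin calculation can also be sketched as two small awk filters, one per step of the formula. The names day_max_min and month_mean, the column header, and the sample rows are hypothetical, and the sketch assumes the temperature is the second comma-separated field, as in the script above.

```shell
# Hypothetical helper: print "tmax tmin" for one day's CSV read from stdin.
# Non-numeric fields (header, blanks, "--") and values at or below -999
# are treated as missing (assumption).
day_max_min() {
    awk -F, '
        $2 !~ /^-?[0-9]+(\.[0-9]+)?$/ { next }
        $2 + 0 <= -999                { next }
        !seen || $2 + 0 > max { max = $2 + 0 }
        !seen || $2 + 0 < min { min = $2 + 0 }
        { seen = 1 }
        END { if (seen) print max, min }
    '
}

# Hypothetical helper: monthly mean from "tmax tmin" lines, one per day,
# i.e. SUM(T.max + T.min) / (2 * num_of_days).
month_mean() {
    awk 'NF == 2 { sum += $1 + $2; days++ }
         END { if (days > 0) printf "%.2f\n", sum / (2 * days) }'
}

# Two made-up days, each with one unusable reading.
{
    printf '%s\n' 'Time,TemperatureF' \
        '2010-01-01 07:02:00,28.0' \
        '2010-01-01 14:05:00,36.0' \
        '2010-01-01 22:01:00,-999.9' | day_max_min
    printf '%s\n' 'Time,TemperatureF' \
        '2010-01-02 07:00:00,30.0' \
        '2010-01-02 14:00:00,--' \
        '2010-01-02 22:00:00,34.0' | day_max_min
} | month_mean
```

A day whose rows are all missing prints nothing from day_max_min, so it never reaches month_mean and is left out of the day count, matching the "skip days with nulls" rule described above.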
SUM(foreachDay(T.max+T.min))/(2*num of days) => 33.3
SUM(T.obs)/(num of obs) => 33.3
SUM(foreachDay(T.max+T.min))/(2*num of days) => 30.2
SUM(T.obs)/(num of obs) => 30.2
SUM(foreachDay(T.max+T.min))/(2*num of days) => 28.3
SUM(T.obs)/(num of obs) => 28.1
It appears that the T7,14,22 and Tmax,min methods provide substantially the same results for the monthly average temperature. KPABLOOM2 had a larger number of null values than the other two stations, which likely led to the slight difference between the two methods (28.3 vs. 28.1) at that station.
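One way to check the null-value explanation would be to count the missing observations per station before they are filtered out. The sketch below assumes a file of observations, one per line, like the t.txt built by the first script; the name count_nulls and the sample values are hypothetical.

```shell
# Hypothetical helper: count blank lines and null sentinels (values at or
# below -999) in a list of observations read from stdin.
count_nulls() {
    awk '$1 == "" || $1 + 0 <= -999 { n++ } END { print n + 0 }'
}

# Made-up observations: one blank and one -999.9 among four lines.
printf '%s\n' 28.0 "" -999.9 34.0 | count_nulls
```

Running it against each station's t.txt would show whether KPABLOOM2 really lost more of its 07, 14, and 22 hour readings than the other two stations.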