Kuska and Serahs
Kuska is listed in the GHCN inventory (v2.temperature.inv) as the second record for WMO ID 38974. Serahs is the other entry (aka Saragt aka Serakhs).
22938974001 SERAHS 36.53 61.22 279 286R -9FLDEno-9x-9WARM GRASS/SHRUBC
22938974002 KUSKA 36.53 61.22 625 286R -9FLDEno-9x-9WARM GRASS/SHRUBC
This is strange, since KUSKA (aka Kyshka aka Gyshgy) has its own WMO ID, 38987, per the WMO Global Observing System.
2 ASIA / ASIE TURKMENISTAN 2018 1726 38974 0 SARAGT 36 32N 61 13E 275
2 ASIA / ASIE TURKMENISTAN 2018 1727 38987 0 GYSHGY 35 17N 62 21E 625
The GHCN mean temperature file (v2.mean) has 5 records for 22938974:
229389740010 1903-1907,1914,1935-1989
229389740011 1936-1989
229389740012 1936-1989
229389740020 1904-1908,1913-1917,1921-1989
229389740021 1904-1908,1911-1918,1921-1989
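The 12-character IDs in this listing can be read using the GHCN v2 layout: a 3-digit country code, the 5-digit nearest WMO number, a 3-digit station modifier, and a final duplicate digit. A minimal R sketch of that split follows; the function name `split_ghcn_id` is made up for illustration and is not part of any GHCN tooling.

```r
# Split 12-character GHCN v2 record IDs into their parts.
# split_ghcn_id is a hypothetical helper name, used here for illustration only.
split_ghcn_id <- function(id) {
  stopifnot(all(nchar(id) == 12))
  data.frame(
    id        = id,
    country   = substr(id, 1, 3),    # 229
    wmo       = substr(id, 4, 8),    # nearest WMO station number
    modifier  = substr(id, 9, 11),   # 001 = SERAHS, 002 = KUSKA in the inventory
    duplicate = substr(id, 12, 12),  # the "dupe" digit
    stringsAsFactors = FALSE
  )
}

ids <- c("229389740010", "229389740011", "229389740012",
         "229389740020", "229389740021")
parts <- split_ghcn_id(ids)
print(parts)
```

Read this way, the shorthand *011 means modifier 001 (SERAHS) with duplicate 1, and *021 means modifier 002 (KUSKA) with duplicate 1.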
Dupes have been described as flags marking the same data record received through two different delivery channels. So I took a quick look at the GHCN data source table to see if I could discern likely candidates for these stations. Two stood out immediately: “USSR Network of CLIMAT stations” and “Daily Temperature and Precipitation Data for 223 USSR Stations (NDP-040)”. The search was on.
I first tracked down the “USSR Network of CLIMAT stations” as NDP048, aka “Six- and Three-Hourly Meteorological Observations from 223 U.S.S.R. Stations (1998)”.
This database contains 6- and 3-hourly meteorological observations from a 223-station network of the former Soviet Union. These data have been made available through cooperation between the two principal climate data centers of the United States and Russia: the National Climatic Data Center (NCDC), in Asheville, North Carolina, and the All-Russian Research Institute of Hydrometeorological Information-World Data Centre (RIHMI-WDC) in Obninsk, Russia. The first version of this database extended through the mid-1980s (ending year dependent upon station) and was made available in 1995 by the Carbon Dioxide Information Analysis Center (CDIAC) as NDP-048. A second version of the database extended the data records through 1990. This third, and current version of the database includes data through 2000 for over half of the stations (mainly for Russia), whereas the remainder of the stations have records extending through various years of the 1990s. Because of the break up of the Soviet Union in 1991, and since RIHMI-WDC is a Russian institution, only Russian stations are generally available through 2000. The non-Russian station records in this database typically extend through 1991. Station records consist of 6- and 3-hourly observations of some 24 meteorological variables including temperature, past and present weather type, precipitation amount, cloud count and type, sea level pressure, relative humidity, and wind direction and speed. The 6-hourly observations extend from 1936 through 1965; the 3-hourly observations extend from 1966 through 2000 (or through the latest year available). These data have undergone extensive quality assurance checks by RIHMI-WDC, NCDC, and CDIAC. The database represents a wealth of meteorological information for a large and climatologically important portion of the earth’s land area, and should prove extremely useful for a wide variety of regional climate change studies.
There is more information here:
http://cdiac.ornl.gov/ftp/ndp048/ndp048.pdf (7.5 mb)
min(ndp48_38987$Year) # 1935
max(ndp48_38987$Year) # 1991
min(ndp48_38974$Year) # 1935
max(ndp48_38974$Year) # 1991
However, 1935 contains just one entry: the last observation on Dec 31.
This provided a clue as to how to process this data.
ds475.0 – U.S.S.R. Surface 6- and 3-hourly Surface Synoptic Observations 1936-1983
The stations in this dataset are considered by RIHMI to comprise one of the best networks suitable for temperature and precipitation monitoring over the former-USSR. Factors involved in choosing these 223 stations included length of record, amount of missing data, and achieving reasonably good geographic coverage. There are indeed many more stations with daily data over this part of the world, and hundreds more station records are available through NOAA’s Global Historical Climatology Network – Daily (GHCND) database. The 223 stations comprising this database are included in GHCND, but different data processing, updating, and quality assurance methods/checks mean that the agreement between records will vary depending on the station. The relative quality and accuracy of the common station records in the two databases also cannot be easily assessed. As of this writing, most of the common stations contained in the GHCND have more recent records, but not necessarily records starting as early as the records available here.
This database contains four variables: daily mean, minimum, and maximum temperature, and daily total precipitation (liquid equivalent). Temperatures were taken three times a day from 1881-1935, four times a day from 1936-65, and eight times a day since 1966. Daily mean temperature is defined as the average of all observations for each calendar day. Daily maximum/minimum temperatures are derived from maximum/minimum thermometer measurements. See the measurement description file for further details.
38974 MOVE 1935 2 -9 0 W
38974 MOVE 1938 -9 -9 0 S
38974 MOVE 1942 -9 -9 0 E
38974 PRCP 1950 10 -9
38974 MOVE 1961 6 12 2 NE
38987 MOVE 1904 4 -9 -9 -99
38987 MOVE 1910 -9 -9 -9 -99
38987 MOVE 1913 8 -9 -9 -99
38987 MOVE 1927 5 -9 -9 -99
38987 PRCP 1953 1 4
A 134-page description of the data set is available here. It includes a reprint of “A New Perspective on Recent Global Warming: Asymmetric Trends of Daily Maximum and Minimum Temperature” (Bulletin of the American Meteorological Society, Vol. 74, No. 6, June 1993): http://cdiac.ornl.gov/ftp/ndp040/ndp040.pdf (4 mb)
min(ndp40_38974$Year) # 1936
max(ndp40_38974$Year) # 2001
min(ndp40_38987$Year) # 1904
max(ndp40_38987$Year) # 2001
ds524.0 Russian Summary of Day, 1881-1989
A map of the 223 stations in NDP040 and NDP048
My initial cut at the NDP048 (6-hour records) indicated a strong correlation with the GHCN *011 and *021 records, but it was off. The leading record (the last entry from the day, and year, before) gave me the hint I needed: slip the temperature records down one ‘slot’, so that the last entry from the previous day counts toward today and today’s last record counts toward the next day. Why would you do this? TOB (time of observation bias). I had read a paper two weeks ago that I was planning on writing a post on, and it planted the seed I needed. The slight TOB adjustment gave a much better match.
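The one-slot slip can be sketched in R as follows. This is only an illustration under stated assumptions, not the code in kuska4.R: the column names (`date`, `slot`, `temp`), the helper `shift_day`, and the temperatures are all made up, and each day is assumed to have a fixed number of observation slots.

```r
# Move the last observation slot of each day onto the following day.
# shift_day and the sample values below are illustrative assumptions.
shift_day <- function(date, slot, slots_per_day) {
  date + as.integer(slot == slots_per_day)
}

obs <- data.frame(
  date = rep(as.Date("1935-12-31") + 0:2, each = 4),
  slot = rep(1:4, times = 3),
  temp = c(NA, NA, NA, 2.0,    # 1935-12-31: only the last slot is present
           1.0, 3.0, 5.0, 3.0, # 1936-01-01
           0.0, 2.0, 6.0, 4.0) # 1936-01-02
)
obs$day_shifted <- shift_day(obs$date, obs$slot, slots_per_day = 4)

# Daily means by calendar day versus by shifted day (rows with NA are dropped)
plain   <- aggregate(temp ~ date, data = obs, FUN = mean)
shifted <- aggregate(temp ~ day_shifted, data = obs, FUN = mean)
print(plain)
print(shifted)
```

In this toy input the lone 1935-12-31 record becomes part of the 1936-01-01 mean, which is the role the single 1935 entry plays in the NDP048 files.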
But the real match came from the NDP040 data, which is formatted as Tmean, Tmax, and Tmin. Taking the monthly means of the daily Tmean values provided a near-perfect match with the GHCN *011 and *021 records.
This shows the difference between GHCN 229389740011 and NDP040 38974.
The match with the hourly data from the 223-station set (NDP048) is clearly similar, but a little off.
Likewise for Station 38987 (known in GHCN as 229389740021).
The R-code is not completely automated, but my notes are recorded here: kuska4.R
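For readers without the notes file, the monthly averaging step might look roughly like the sketch below. It is not taken from kuska4.R. It assumes the daily values are in degrees C and rounds the result to tenths of a degree, the unit used in GHCN v2.mean; the function name `monthly_means` and the sample values are made up.

```r
# Collapse daily values into a year x month table of monthly means.
# monthly_means is a hypothetical helper; the input is assumed to be in degrees C.
monthly_means <- function(date, value) {
  year  <- as.integer(format(date, "%Y"))
  month <- as.integer(format(date, "%m"))
  m <- tapply(value, list(year = year, month = month), mean, na.rm = TRUE)
  round(m * 10)  # tenths of a degree C, as stored in GHCN v2.mean
}

# Made-up daily series: January at 1.5 C, February at 4.0 C, one missing day
dates <- seq(as.Date("1936-01-01"), as.Date("1936-02-29"), by = "day")
tmean <- ifelse(format(dates, "%m") == "01", 1.5, 4.0)
tmean[3] <- NA
monthly <- monthly_means(dates, tmean)
print(monthly)
```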
sum(abs(ghcn_389740011[,3:14] - ndp40_38974a[,3:14]),na.rm=T) # 3
sum(abs(ghcn_389740011[,3:14] - ndp48_38974a[,3:14]),na.rm=T) # 113
sum(abs(ghcn_389740021[,3:14] - ndp40_38987a[,3:14]),na.rm=T) # 0
sum(abs(ghcn_389740021[,3:14] - ndp48_38987a[,3:14]),na.rm=T) # 92
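Subtracting two data frames directly only works when both cover exactly the same years in the same row order. A slightly more defensive version of the same score, aligned on a shared year column, could look like this. It is a sketch only: the helpers `sad_by_year` and `make_frame`, the `year` column name, and the toy values are assumptions for illustration, with the twelve months in columns 3 to 14 as in the lines above.

```r
# Sum of absolute monthly differences over the years two tables have in common.
# sad_by_year and make_frame are hypothetical helpers, not from kuska4.R.
sad_by_year <- function(x, y, month_cols = 3:14) {
  common <- intersect(x$year, y$year)
  xa <- x[match(common, x$year), month_cols]
  ya <- y[match(common, y$year), month_cols]
  sum(abs(as.matrix(xa) - as.matrix(ya)), na.rm = TRUE)
}

# Build a toy table: id, year, then twelve monthly columns
make_frame <- function(id, years, values) {
  m <- matrix(values, nrow = length(years), ncol = 12, byrow = TRUE)
  colnames(m) <- month.abb
  data.frame(id = id, year = years, m)
}

ghcn_toy <- make_frame("A", 1936:1937, c(rep(10, 12), rep(20, 12)))
ndp_toy  <- make_frame("B", 1937:1938, c(rep(21, 12), rep(30, 12)))
ndp_toy[1, "Mar"] <- NA  # a missing month is skipped, not counted as a mismatch

score <- sad_by_year(ghcn_toy, ndp_toy)
print(score)
```

Only 1937 is shared here, so the score comes from eleven months that each differ by 1. Because missing months are skipped, a low score is more convincing when the number of compared months is also checked.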
The DSxxx data sets are archived at UCAR and require email contact. I have not pursued that route at this time.
I took a very quick look at GSOD data for these stations, but the match was not close enough to indicate a direct match with the GHCN records.
I processed the NDP048 first and fairly quickly found the match with the GHCN *011 and *021 records. So I *thought* that NDP040 was going to give me GHCN *010 and *020. Imagine my disappointment when it didn’t pan out. On the other hand, the near perfect match of NDP040 with GHCN was quite a pleasant surprise. Maybe someone else can locate the original source for the *010 and *020 records.
In addition, the GHCN records are longer than the related NDP records.
JR, in the comments below, identified 012 and 020 as derivations of Tmean = (Tmax + Tmin)/2 from NDP040, and 011 as a derivation of Tmid from NDP040.
So I went back, made a few tweaks, and took a look.
# a = monthly mean of Daily Tmids
sum(abs(ghcn_389740010[,3:14] - ndp40_38974a[,3:14]),na.rm=T) # 649
sum(abs(ghcn_389740011[,3:14] - ndp40_38974a[,3:14]),na.rm=T) # 3
sum(abs(ghcn_389740012[,3:14] - ndp40_38974a[,3:14]),na.rm=T) # 4035
sum(abs(ghcn_389740020[,3:14] - ndp40_38987a[,3:14]),na.rm=T) # 5348
sum(abs(ghcn_389740021[,3:14] - ndp40_38987a[,3:14]),na.rm=T) # 0
# b = monthly mean of Daily Tmeans = (Tmax + Tmin)/2
sum(abs(ghcn_389740010[,3:14] - ndp40_38974b[,3:14]),na.rm=T) # 3849
sum(abs(ghcn_389740011[,3:14] - ndp40_38974b[,3:14]),na.rm=T) # 4125
sum(abs(ghcn_389740012[,3:14] - ndp40_38974b[,3:14]),na.rm=T) # 159
sum(abs(ghcn_389740020[,3:14] - ndp40_38987b[,3:14]),na.rm=T) # 176
sum(abs(ghcn_389740021[,3:14] - ndp40_38987b[,3:14]),na.rm=T) # 5309
I agree with his analysis that 0012 and 0020 are the monthly mean of the daily (Tmax + Tmin)/2 from NDP040.
I’m not convinced that the 0010 series originates from either method.
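The ten scores above are easier to compare side by side. The sketch below only tabulates the numbers already reported in this post; the column names `a_tmid` and `b_minmax` are made up for illustration.

```r
# Scores copied from the comparisons above; column names are illustrative.
scores <- data.frame(
  ghcn     = c("0010", "0011", "0012", "0020", "0021"),
  a_tmid   = c(649, 3, 4035, 5348, 0),      # a = monthly mean of daily Tmid
  b_minmax = c(3849, 4125, 159, 176, 5309)  # b = monthly mean of (Tmax + Tmin)/2
)
scores$closer <- ifelse(scores$a_tmid < scores$b_minmax, "a", "b")
scores$best   <- pmin(scores$a_tmid, scores$b_minmax)
print(scores)
```

The table puts 0011 and 0021 with method a and 0012 and 0020 with method b. For 0010, method a is the closer of the two, but its best score (649) is well above the best score of any other series, which fits the doubt about its origin.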