GHCNv2 and GRUMP Rural and Urban Extents
The GHCN v2.temperature.inv metadata for the GHCN v2.mean temperature records includes a “Rural/Smalltown/Urban” flag. Until very recently, GISTEMP used this flag as part of its urbanization adjustments for non-US countries. That method has been deprecated, and GISTEMP now uses satellite brightness as its indicator of urbanization. The GHCNv2 R/S/U flags indicate ‘association’ with a town (10-50K) or city (>50K), as described in An Overview of the Global Historical Climatology Network Temperature Database, Peterson and Vose, 1997:
Population. Examining the station location on an ONC (Operational Navigation Charts) would determine whether the station was in a rural or urban area. If it was an urban area, the population of the city was determined from a variety of sources. We have three population classifications: rural, not associated with a town larger than 10 000 people; small town, located in a town with 10 000 to 50 000 inhabitants; and urban, a city of more than 50 000. In addition to this general classification, for small towns and cities, the approximate population is provided.
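As a rough illustration of the thresholds quoted above, the three classes can be written as a small function. This is only a sketch: the function name is hypothetical, the single-letter codes follow the R/S/U shorthand used in this post, and the actual GHCN flags were assigned from navigation charts and other sources rather than computed from a single population figure.

```python
def ghcn_population_class(population):
    """Map a population count to R/S/U using the thresholds quoted above.

    Hypothetical helper for illustration; not part of any GHCN tooling.
    """
    if population is None or population < 10_000:
        return "R"   # rural: not associated with a town larger than 10 000
    if population <= 50_000:
        return "S"   # small town: 10 000 to 50 000 inhabitants
    return "U"       # urban: a city of more than 50 000
```

For example, `ghcn_population_class(25_000)` gives `"S"`.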
The GRUMP Rural/Urban Extents is a gridded dataset with 43200 columns and 16800 rows covering 56S to 84N with a grid size of 0.008333 degrees (30 arc-seconds, or 1/2 arc-minute). The data set merges two sources of information: population and settlement extents. Population data is derived from a variety of sources, primarily national censuses. Settlement extent is derived from the Defense Meteorological Satellite Program (DMSP) Operational Linescan System (OLS) for a seven month period in 1994/1995, from an ESRI Digital Chart of the World (DCW), and from Tactical Pilotage Charts (TPC). This is described in Methodologies to Improve Global Population Estimates in Urban and Rural Areas, Pozzi, Balk, Yetman, Nelson, Deichmann, 2003. More detailed discussion is available in The Distribution of People and the Dimension of Place: Methodologies to Improve the Global Estimation of Urban Extents, Balk, Pozzi, Yetman, Deichmann, Nelson, 2005.
Population data were gathered primarily from official statistical offices (census data) and secondarily from other web sources, such as Gazetteer (www.gazetteer.de) and CityPop (www.citypop.de), or from specific individual databases when official statistical databases were not available. Based on the data available and applying UN growth rates, we estimated population in 1990, 1995, and 2000. In some cases, the records for cities and town included latitude and longitude coordinates. For those where coordinates were not available, we matched the settlement name and administrative units with the National Imagery and Mapping Agency (NIMA) database of populated places (gnswww.nima.mil/geonames/GNS/index.jsp).
The resulting database constitutes what we will call “points”.
b Settlements extent
The physical extent of settlements has been derived both from raster and vector datasets, in
• Night-time lights, produced using time series data from the Defense Meteorological Satellite Program (DMSP) Operational Linescan System (OLS) for the period 1 October 1994 to 30 April 1995, where the pixel values are measurements of the frequency with which lights were observed normalized by the total number of cloudfree observations. To delineate the physical extent of human settlements we used the World Stable Lights dataset (“cities” component).
• Digital Chart of the World (DCW)’s Populated Places: an ESRI product originally developed for the US Defense Mapping Agency (DMA) using DMA data and currently available at 1:1,000,000 scale (1993 version). The “populated places” coverage is available for most countries and contains depictions of the urbanized areas (built-up areas) of the world that are represented as polygons at 1:1,000,000 scale.
• Tactical Pilotage Charts (TPC): standard charts produced by the Australian Defense Imagery and Geospatial Organization, at a scale of 1:500,000, originally designed to provide an intermediate scale translation of cultural and terrain features for pilots/navigators flying at very low altitudes. Each chart contains information on cultural, drainage/hydrography, relief, distinctive vegetation, roads, sand ridges, power lines, and topographical features. Settlements are reported both as polygons and points. Polygons and points were digitized for a number of countries,
The GRUMP data was retrieved as an ASCII file from the GPW web site. A simple Perl script was used to loop through the station data in the GHCNv2 file, extract the latitude and longitude, and use them to locate the GRUMP undefined/rural/urban values (0,1,2) in the GRUMP ASCII data file to determine rural/urban extents for each of the stations. A half-gridsize offset was applied to place the GRUMP grid points at the center of each cell. In addition, similar scripts were used to parse through the GRUMP data and extract all the urban values for display.
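The lookup described above can be sketched in Python (the original script was Perl and is not reproduced here). The grid dimensions and the 0/1/2 coding come from the text. Everything else is an assumption made for illustration: that row 0 is the northernmost row at 84N, that column 0 starts at 180W, and that the grid is held in memory as something indexable by `grid[row][col]`. The function names are hypothetical.

```python
import math

# Grid geometry from the description above.
NCOLS = 43200
NROWS = 16800
CELLS_PER_DEGREE = 120        # 30 arc-second cells, about 0.008333 degrees
WEST, NORTH = -180.0, 84.0    # assumed upper-left corner of the grid

def cell_index(lat, lon):
    """Return (row, col) of the cell containing the point, or None if outside."""
    row = math.floor((NORTH - lat) * CELLS_PER_DEGREE)
    col = math.floor((lon - WEST) * CELLS_PER_DEGREE)
    if 0 <= row < NROWS and 0 <= col < NCOLS:
        return row, col
    return None

def cell_center(row, col):
    """Return (lat, lon) of a cell center: the half-gridsize offset."""
    lat = NORTH - (row + 0.5) / CELLS_PER_DEGREE
    lon = WEST + (col + 0.5) / CELLS_PER_DEGREE
    return lat, lon

def lookup(grid, lat, lon):
    """Return the GRUMP value (0 undefined, 1 rural, 2 urban) for a station.

    Points outside the grid are treated as undefined (0) in this sketch.
    """
    idx = cell_index(lat, lon)
    if idx is None:
        return 0
    row, col = idx
    return grid[row][col]
```

Taking the floor of the scaled offset selects the cell whose center is nearest the station, which is the effect of the half-gridsize offset. A station at exactly 180E or exactly 56S falls just outside the index range and comes back as undefined here; whether the original script handled those edges differently is not stated.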
The GRUMP urban extents in CONUS and some surrounding regions are displayed against a black background to create a ‘faux’ brightness map. This is compared to the NOAA/DMSP brightness map for CONUS.
The GRUMP urban extents for the world are displayed against a black background to create a ‘faux’ brightness map. This is compared to the NOAA/DMSP brightness map for the world.
In the GHCNv2 v2.temperature.inv file, there are 7280 stations. Of these, 1959 are marked “Urban”, 1409 marked as “Small Town”, 3912 marked as “Rural”, and 0 with no designation.
Calculating the GRUMP designators for the GHCN v2 stations, 3549 are marked “Urban” and 3249 are marked “Rural”. There are 482 undefined stations.
Comparing the GHCNv2 flags with the GRUMP extents shows a significant mismatch in the Rural/Urban designations.
In the following figure, the stations in which [GHCNv2=rural,GRUMP=urban] are marked red. The stations in which [GHCNv2=urban,GRUMP=rural] are marked blue. Undefined stations are marked in green. All other GHCNv2 stations are marked yellow.
I’m surprised that the GRUMP data set is using 15-year-old DMSP imagery. DMSP OLS night light images are available up to 2008.
Roughly 7% of the GHCNv2 stations (482 of 7280) are undefined in the GRUMP extent data set. Possible sources of error include the latitude/longitude listed in GHCNv2, missing data in the GRUMP data set, resolution errors in the GRUMP data set, and gridding offset errors in the data lookup routines. Many of the undefined stations are located near water features and are likely ‘lost’ because their listed coordinates fall within the water feature.
Excluding the undefined stations, 11% of the GHCNv2 urban designations and 23% of the GHCNv2 rural designations are not confirmed in the GRUMP data set.
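A comparison of this kind can be sketched as a cross-tabulation of the two labels for each station. The code below is a minimal illustration, not the script used for the numbers above: the function names are hypothetical, the input pairs are invented toy values, and since the text does not say how “Small Town” stations were handled, the sketch simply tabulates them as their own category.

```python
from collections import Counter

# GRUMP coding as given in the text: 0 undefined, 1 rural, 2 urban.
GRUMP_LABELS = {0: "undefined", 1: "rural", 2: "urban"}

def cross_tabulate(pairs):
    """Count stations for each (GHCN flag, GRUMP label) combination."""
    return Counter((flag, GRUMP_LABELS[value]) for flag, value in pairs)

def unconfirmed_share(table, ghcn_flag, expected_label):
    """Fraction of stations with ghcn_flag whose GRUMP label differs.

    Stations that are undefined in GRUMP are excluded from the denominator.
    Returns None if no defined stations carry the flag.
    """
    defined = {label: n for (flag, label), n in table.items()
               if flag == ghcn_flag and label != "undefined"}
    total = sum(defined.values())
    if total == 0:
        return None
    return 1.0 - defined.get(expected_label, 0) / total

# Toy input, invented purely to exercise the functions; not real station data.
toy_pairs = [("U", 2), ("U", 2), ("U", 2), ("U", 1),
             ("R", 1), ("R", 1), ("R", 2), ("R", 0), ("S", 2)]
toy_table = cross_tabulate(toy_pairs)
```

With real data, `unconfirmed_share(table, "U", "urban")` and `unconfirmed_share(table, "R", "rural")` would correspond to the two percentages quoted above, under the assumptions just listed.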
Balk, Pozzi, Yetman, Deichmann, Nelson, The Distribution of People and the Dimension of Place: Methodologies to Improve the Global Estimation of Urban Extents, 2005.
Center for International Earth Science Information Network (CIESIN), Columbia University; International Food Policy Research Institute (IFPRI); The World Bank; and Centro Internacional de Agricultura Tropical (CIAT). 2004. Global Rural-Urban Mapping Project (GRUMP), Alpha Version: Urban Extents. Palisades, NY: Socioeconomic Data and Applications Center (SEDAC), Columbia University. Available at http://sedac.ciesin.columbia.edu/gpw (accessed 2010 Mar 13).
Peterson, Vose, An Overview of the Global Historical Climatology Network Temperature Database, 1997.
Pozzi, Balk, Yetman, Nelson, Deichmann, Methodologies to Improve Global Population Estimates in Urban and Rural Areas, 2003.