Deriving a Global Urban Extent Map from ENVISAT ASAR Wide Swath Mode
Schmullius, Christiane1; Pathe, Carsten1; Riedel, Tanja1; Gamba, Paolo2; Lisini, Gianni2; Santoro, Maurizio3
1University of Jena, GERMANY; 2University of Pavia, ITALY; 3GAMMA Remote Sensing, SWITZERLAND

Human activities on Earth are rapidly altering our environment, and ongoing urbanization is a global phenomenon. To monitor such changes and to understand global environmental change, consistent global land cover maps are urgently needed. Such maps can be derived from remote sensing data and methods.
An urban extent map is being derived from multi-temporal ENVISAT ASAR Wide Swath Mode (WSM) data at a final resolution of 300 m. Such a map could serve as a supplemental or alternative information source for global land cover products such as Globcover. Multi-temporal radar backscatter statistics are a helpful tool for analyzing the temporal variation of the backscatter information contained in an image stack. Because man-made urban objects and surfaces are relatively static, urban areas show little temporal change in radar backscatter as long as the viewing geometry is unaltered. Urban areas are therefore characterized by consistently high backscatter values in multi-temporal radar images and can be extracted by appropriate mapping procedures. The global WSM data used in this work were acquired throughout 2010; for areas with low temporal coverage the time frame was expanded to 2009-2012. Pre-processing of the WSM data comprises geocoding, radiometric calibration and normalization to a mean incidence angle of 30°. To reduce speckle in the geocoded, calibrated and normalized ASAR WSM images, a multi-temporal speckle filter was also applied. Finally, the image stack was used to generate a global multi-temporal mean backscatter data set for further analysis and urban extent extraction.
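As a minimal sketch of the last step, the per-pixel temporal mean of a co-registered image stack could be computed as shown below. The function name `temporal_mean`, the use of `None` for pixels without an acquisition, and the toy values are assumptions made for illustration; this is not the processing chain used for the global data set.

```python
from typing import List, Optional

Image = List[List[Optional[float]]]


def temporal_mean(stack: List[Image]) -> Image:
    """Per-pixel mean over time, skipping missing (None) acquisitions.

    Assumes all images in the stack are co-registered and of equal size.
    """
    rows, cols = len(stack[0]), len(stack[0][0])
    mean: Image = [[None] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            values = [img[r][c] for img in stack if img[r][c] is not None]
            if values:  # leave None where no acquisition covers the pixel
                mean[r][c] = sum(values) / len(values)
    return mean


# Toy stack of two 2 x 2 acquisitions (hypothetical backscatter values).
stack = [
    [[0.5, 0.25], [None, 0.0]],
    [[1.0, 0.75], [None, 0.5]],
]
mean_image = temporal_mean(stack)
```

Pixels that are never observed remain `None`, so that areas with low temporal coverage stay identifiable in later steps.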
The final urban extent map is derived from the ENVISAT ASAR WSM data using a suitable combination of two mapping procedures, the Urban Area Detecting Parameter (UADP) and the Urban EXTraction (UEXT) procedures.
The UADP is based on a Neighborhood Greylevel Dependency matrix (NGLD) that models the co-occurrence of a pixel's intensity and the properties of its surrounding pixels under a neighborhood constraint. The relation used to compute the NGLD matrix for urban extent mapping is a difference of 0.1 between neighboring pixels within a distance of one and two pixels (both distances were considered). All non-urban areas in the resulting UADP map have the value 0. To create a simple binary UADP urban mask, all pixels with values > 0 are set to 1, while all non-urban areas keep the value 0. The UADP mapping procedure has some limitations. Larger settlements are often not detected as closed, contiguous areas: the UADP measure responds to differences in grey levels but is not sensitive enough to the relatively homogeneous high backscatter values found in bigger cities, so gaps are likely to occur within larger urban areas. Arid regions, where bare ground and rocks cause high backscatter values, may also lead to false detections. Another problem arises in areas with steep slopes, where terrain facing the ASAR sensor is characterized by high multi-temporal backscatter values and appears as bright as urban areas. Because of these limitations, the UADP map requires a number of post-processing steps. Possible gaps within large city areas are closed by a morphological filtering procedure that combines expand and shrink filters. To correct misclassified pixels in arid areas, a combination of the Globcover bare area class (class value 200) and the version 4 DMSP-OLS Nighttime Lights Time Series product is used. Areas with slopes exceeding 15° are masked out using a slope map derived as a side product of the geocoding process.
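The binarization, gap filling and slope masking described above can be sketched as follows. This is an illustration under stated assumptions only: the 3 x 3 filter size, the function names and the toy grids are hypothetical, the NGLD computation itself is not reproduced, and the arid-area correction with Globcover and DMSP-OLS data is left out.

```python
from typing import List

Grid = List[List[float]]
Mask = List[List[int]]


def binarize(uadp: Grid) -> Mask:
    """Set all pixels with a UADP value > 0 to 1, all others to 0."""
    return [[1 if v > 0 else 0 for v in row] for row in uadp]


def _filter3x3(mask: Mask, expand: bool) -> Mask:
    """3 x 3 expand (maximum) or shrink (minimum) filter.

    The window is clipped at the image border; the 3 x 3 size is an assumption.
    """
    rows, cols = len(mask), len(mask[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = [mask[rr][cc]
                      for rr in range(max(0, r - 1), min(rows, r + 2))
                      for cc in range(max(0, c - 1), min(cols, c + 2))]
            out[r][c] = max(window) if expand else min(window)
    return out


def close_gaps(mask: Mask) -> Mask:
    """Expand followed by shrink (morphological closing) to fill small gaps."""
    return _filter3x3(_filter3x3(mask, expand=True), expand=False)


def mask_steep_slopes(mask: Mask, slope_deg: Grid, max_slope: float = 15.0) -> Mask:
    """Reset pixels to non-urban where the slope exceeds max_slope degrees."""
    return [[0 if s > max_slope else m for m, s in zip(mrow, srow)]
            for mrow, srow in zip(mask, slope_deg)]


# Toy 7 x 7 example: a 3 x 3 city block with a one-pixel gap in its center.
uadp = [[0.0] * 7 for _ in range(7)]
for r in range(2, 5):
    for c in range(2, 5):
        uadp[r][c] = 0.3
uadp[3][3] = 0.0                      # gap inside the city
slope = [[0.0] * 7 for _ in range(7)]
slope[2][2] = 20.0                    # one steep pixel (hypothetical)

closed = close_gaps(binarize(uadp))
final = mask_steep_slopes(closed, slope)
```

In this toy case the closing fills the central gap without enlarging the block, and the single steep pixel is removed from the mask afterwards.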
The UEXT procedure involves three steps: pre-processing, UEXT computation and post-processing. The current version of the procedure does not use any ancillary data, so the computational load is determined solely by the UEXT extraction routines and their iterative approach. Pre-processing in the University of Pavia (UNIPV) processing chain is performed exactly as described above, while the extraction of the urban extent is based on a processing chain summarized by the following steps:
1) Seed extraction: bright pixels with a large backscatter value are selected as seeds of the procedure. The threshold required by this step is quite stable and has been set to 0.65 based on the experience of the UNIPV team. Although this value works for most areas of the world, it may be reduced to values as low as 0.35 in arid areas or increased to as much as 0.85 in highly vegetated areas.
2) Region growing: the second step is a region growing procedure that starts from the seed pixels and, at each iteration, adds to the urban mask those pixels in a window around the seeds that have a backscatter value higher than a pre-defined threshold. With a spatial resolution of 75 m, windows must be kept at the smallest possible size to detect compact areas, and therefore windows of 7 x 7 pixels are considered (with almost no difference in the results, although the smaller one is usually preferable). The threshold value, by contrast, is an important parameter and has to be selected manually in the range between 0.15 and 0.3.
The region growing procedure iterates until no additional pixels are added to the mask.
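The two steps above can be sketched as a simple queue-based region growing. This is one possible reading of the description, not the UNIPV implementation: the function name `extract_urban`, the strict `>` comparisons and the toy values are assumptions, and the units of the thresholds are taken as given since the text does not state them.

```python
from collections import deque
from typing import List


def extract_urban(backscatter: List[List[float]],
                  seed_thr: float = 0.65,
                  grow_thr: float = 0.3,
                  window: int = 7) -> List[List[int]]:
    """Seed extraction followed by window-based region growing."""
    rows, cols = len(backscatter), len(backscatter[0])
    half = window // 2
    mask = [[0] * cols for _ in range(rows)]
    frontier = deque()

    # Step 1: bright pixels above the seed threshold become seeds.
    for r in range(rows):
        for c in range(cols):
            if backscatter[r][c] > seed_thr:
                mask[r][c] = 1
                frontier.append((r, c))

    # Step 2: grow from every mask pixel until no pixel is added any more.
    while frontier:
        r, c = frontier.popleft()
        for rr in range(max(0, r - half), min(rows, r + half + 1)):
            for cc in range(max(0, c - half), min(cols, c + half + 1)):
                if mask[rr][cc] == 0 and backscatter[rr][cc] > grow_thr:
                    mask[rr][cc] = 1
                    frontier.append((rr, cc))
    return mask


# Toy single-row image; a 3-pixel window is used only to keep the example small.
row = [[0.9, 0.4, 0.35, 0.1, 0.4]]
urban = extract_urban(row, window=3)
```

In the toy row the last pixel exceeds the growing threshold but is not reached, because the dark pixel before it separates it from the grown region when a 3-pixel window is used.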
After the urban extent mapping using the UADP and UEXT mapping procedures, the results of both algorithms are combined for areas where they contain complementary information and aggregated to a spatial resolution of 300 m to form the final global urban extent map.
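A minimal sketch of this combination and aggregation step is given below. Several points are assumptions made for illustration, since the text does not specify them: both masks are taken to be on the same 75 m grid, the combination is modeled as a logical OR, and a 300 m cell (4 x 4 pixels) is labeled urban when at least a hypothetical `min_fraction` of its pixels are urban.

```python
from typing import List

Mask = List[List[int]]


def combine(uadp: Mask, uext: Mask) -> Mask:
    """Union of both masks (assumed combination rule)."""
    return [[1 if a or b else 0 for a, b in zip(ra, rb)]
            for ra, rb in zip(uadp, uext)]


def aggregate(mask: Mask, factor: int = 4, min_fraction: float = 0.5) -> Mask:
    """Aggregate factor x factor blocks, e.g. 75 m pixels to 300 m cells."""
    rows, cols = len(mask) // factor, len(mask[0]) // factor
    out = []
    for br in range(rows):
        out_row = []
        for bc in range(cols):
            urban = sum(mask[r][c]
                        for r in range(br * factor, (br + 1) * factor)
                        for c in range(bc * factor, (bc + 1) * factor))
            out_row.append(1 if urban / factor ** 2 >= min_fraction else 0)
        out.append(out_row)
    return out


# Toy 4 x 8 masks covering two 300 m cells (hypothetical values).
uadp_mask = [
    [1, 1, 1, 0, 0, 0, 0, 0],
    [1, 1, 1, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 1],
    [0, 0, 0, 0, 0, 0, 0, 0],
]
uext_mask = [
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 0, 0, 1],
    [1, 1, 0, 0, 0, 0, 1, 0],
]
urban_300m = aggregate(combine(uadp_mask, uext_mask))
```

Here the left cell is urban only because the two masks complement each other (10 of 16 pixels after the union), while the right cell stays non-urban (2 of 16 pixels).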

Finally, the validation of the combined final map follows a scheme already applied to the existing Globcover global land cover product and is based on freely available on-line VHR remote sensing and GIS data (e.g., Google Earth, Bing Maps, and OpenStreetMap) as reference. A sufficient number of validation pixels, each representing an area of 300 m x 300 m, are distributed globally over land areas. These pixels are selected using a stratified random sampling procedure that considers the "urban" and "non-urban" classes. Each pixel is inspected manually and compared to the reference data set. If more than xx% of the validation pixel is covered by urban features according to the reference map, it is assigned the value "urban"; otherwise it is assigned the value "non-urban". These reference pixels are then compared to the final urban extent map described above and the accuracy assessment is performed.
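The comparison between map and reference labels can be sketched with a two-class confusion matrix. The metrics shown (overall, user's and producer's accuracy for the urban class) are commonly used measures and are an assumption here, as the text does not name the ones reported; the function name and the toy labels are hypothetical. With stratified sampling, unweighted counts do not account for unequal stratum sizes; that weighting is not shown.

```python
from typing import Dict, List


def accuracy_report(mapped: List[str], reference: List[str]) -> Dict[str, float]:
    """Two-class accuracy measures from paired map and reference labels."""
    pairs = list(zip(mapped, reference))
    tp = sum(1 for m, r in pairs if m == "urban" and r == "urban")
    fp = sum(1 for m, r in pairs if m == "urban" and r == "non-urban")
    fn = sum(1 for m, r in pairs if m == "non-urban" and r == "urban")
    tn = sum(1 for m, r in pairs if m == "non-urban" and r == "non-urban")
    return {
        "overall": (tp + tn) / len(pairs),
        # Share of mapped urban pixels confirmed by the reference.
        "users_urban": tp / (tp + fp),
        # Share of reference urban pixels found by the map.
        "producers_urban": tp / (tp + fn),
    }


# Toy validation sample of eight pixels (hypothetical labels).
mapped = ["urban", "urban", "urban", "non-urban",
          "non-urban", "non-urban", "non-urban", "non-urban"]
reference = ["urban", "urban", "non-urban", "urban",
             "non-urban", "non-urban", "non-urban", "non-urban"]
report = accuracy_report(mapped, reference)
```

The sketch assumes that each class occurs at least once in both the map and the reference labels; otherwise the user's or producer's accuracy is undefined.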