2.1. Landsat ARD Methodology
The Landsat Analysis Ready Data (ARD) developed by the Global Land Analysis and Discovery team (GLAD) provides globally consistent inputs for land cover mapping and change detection. The entire archive of the Landsat Collection 1 TM/ETM+/OLI data available from the United States Geological Survey National Center for Earth Resources Observation and Science (USGS EROS) was automatically processed to derive ARD time-series. The following section documents the Landsat data processing and temporal aggregation. For all questions and comment contact Peter Potapov (potapov@umd.edu). The following publication summarizes data processing steps: Potapov et al., 2019 (Potapov_RSE_2019.pdf)
The Landsat data processing includes four steps: (1) conversion to radiometric quantity, (2) observation quality assessment, (3) reflectance normalization, and (4) temporal integration into 16-day composites.
1. Conversion to radiometric quantity
In the first step, all data is converted to top-of-atmosphere reflectance (Chander et al., 2009) for reflective bands and brightness temperature for the emissive band. Spectral reflectance (value range from zero to one) are rescaled to the range from 1 to 40,000; temperature is recorded as degrees C multiplied by 100 to preserve measurement precision. Only spectral bands with matching wavelengths between TM, ETM+ and OLI/TIRS sensors are processed (Table 1).
Table 1. Landsat spectral bands used for the ARD and corresponding MODIS spectral bands
Band name |
Wavelength, nm |
|||
Landsat 5 TM |
Landsat 7 ETM+ |
Landsat 8 OLI/TIRS |
MODIS |
|
Blue |
450-520 |
441-514 |
452-512 |
459-479 |
Green |
520-600 |
519-601 |
533-590 |
545-565 |
Red |
630-690 |
631-692 |
636-673 |
620-670 |
NIR |
760-900 |
772-898 |
851-879 |
841-876 |
SWIR1 |
1,550-1,750 |
1,547-1,749 |
1,566-1,651 |
1,628-1,652 |
SWIR2 |
2,080-2,350 |
2,064-2,345 |
2,107-2,294 |
2,105-2,155 |
Thermal |
10,410-12,500 |
10,310-12,360 |
10,600-11,190 |
10,780-11,280 |
The following equations summarize data conversion and scaling. The value of PI defined as 3.1415926535897932384626433832. All other coefficients are from Landsat Collection 1 metadata and lookup tables from Chander et al., 2009. DN stands for the source raster value.
Landsat 8 OLI
From the image metadata:
SUNELEV = "SUN_ELEVATION"
Landsat 8 TIRS
Brightness temperature = (K2/log(K1/(0.0003342*DN+0.1)+1))*100+0.5
From the image metadata:
K1 = “K1_CONSTANT_BAND_10”
K2 = “K2_CONSTANT_BAND_10”
Landsat 4/5 TM and 7 ETM+
TOA reflectance = ((PI*D^2*(G*DN+B))/(ESUN*sin(SUNELEV *PI/180)))*40000
From the image metadata:
SUNELEV = “SUN_ELEVATION”
From the lookup tables (Chander et al., 2009):
G – gain (sensor- and band-specific)
B – bias (sensor- and band-specific)
ESUN – exoatmospheric irradiance (sensor- and band-specific)
D – Sun-Earth distance
Brightness temperature = (K2/log(K1/(G*DN+B)+1))*100+0.5
From the lookup tables (Chander et al., 2009):
K1 = Thermal band calibration constant 1 (sensor-specific)
K2 = Thermal band calibration constant 2 (sensor-specific)
2. Observation quality assessment
During the second step, we determine per-pixel observation quality, i.e. a probability of an image pixel to be collected during clear sky conditions. The GLAD quality assessment model developed by our team represents a set of regionally adapted decision trees to map probability of a pixel to represent cloud, cloud shadow, heavy haze, and, for clear-sky observations, land, water, or snow/ice. The Landsat Collection 1 data include observation quality layers based on the globally consistent CFMask cloud and cloud shadow detection algorithm (Foga et al., 2017). Since our primary goal is to reduce the presence of clouds and shadows in the time-series data, we merge both the CFMask product (high-probability clouds and shadows) with the GLAD algorithm output. Through these outputs, cloud, shadow, haze, water, and land masks are created for each Landsat image. The masks are subsequently aggregated into a single quality flag that highlights cloud/shadow contaminated observations, indicates confidence of cloud/shadow detection, separates topography shadows from likely cloud shadows, and specifies the proximity to high-confidence clouds and cloud shadows. Table 2 summarize possible observation quality flags. For the subsequent analysis, quality flags 1, 2, 5, 6, 11, 12, and 14 are treated as clear-sky observations, while codes 3, 4, 7, 8, 9, and 10 are considered contaminated by clouds and shadows. Codes 5, 6, 11, 12, and 14 are used for 16-day composition to prioritize observations with minimal atmospheric contamination.
Table 2. Per-pixel observation quality flags
Flag |
Quality |
Definition |
1 |
Land |
Clear-sky land observation. |
2 |
Water |
Clear-sky water observation. |
3 |
Cloud |
Cloud detected by either GLAD or CFMask algorithm. |
4 |
Cloud shadow |
Shadow detected by either GLAD or CFMask algorithm. The pixels located within the projection of a detected cloud. Cloud projection defined using the solar elevation and azimuth and limited to 9 km distance from the cloud. |
5 |
Topographic shadow |
Shadow detected by either GLAD or CFMask algorithm. The pixel located outside cloud projections and within estimated topographic shadow (defined using SRTM DEM and the observation solar elevation and azimuth). |
6 |
Snow |
Clear-sky observation classified as snow by the GLAD algorithm |
7 |
Haze |
Classified as haze (dense semi-transparent cloud) by the GLAD algorithm. |
8 |
Cloud proximity |
Aggregation (OR) of two rules: (i) 1-pixel buffer around detected clouds. (ii) Above-zero cloud likelihood (estimated by GLAD cloud detection model) within 3-pixel buffer around detected clouds. |
9 |
Shadow proximity |
Shadow likelihood (estimated by GLAD shadow detection model) above 10% for pixels either (i) located within the projection of a detected cloud; OR (ii) within 3 pixels of a detected cloud or cloud shadow. |
10 |
Other shadows |
Shadow detected by either GLAD or CFMask algorithm. The pixel located outside (a) projection of a detected cloud and (b) estimated topographic shadow. |
11 |
Additional cloud proximity over land |
Clear-sky land pixels located closer than 7 pixels of detected clouds |
12 |
Additional cloud proximity over water |
Clear-sky water pixels located closer than 7 pixels of detected clouds |
14 |
Additional shadow proximity over land |
Clear-sky land pixels located closer than 7 pixels of detected cloud shadows |
3. Reflectance normalization
The third step consists of reflectance and brightness temperature normalization to reduce the effects of atmospheric scattering and surface anisotropy. The purpose of relative normalization is to simplify extrapolation of classification models in space and time by ensuring spectral similarity of the same land-cover types. Relative normalization is not computationally heavy and does not require synchronously collected or historical data on atmospheric properties and land-cover specific anisotropy. The normalized surface reflectance is not equal to surface reflectance and should not be used as a source dataset for the analysis of reflectance properties and dynamic. The purpose of the ARD product is solely to facilitate land cover and land cover change mapping.
The Landsat image normalization consists of four steps: (A) Production of the normalization target dataset; (B) selection of pseudo-invariant target pixels to derive the normalization model; (C) model parametrization; and (D) model application.
A. Normalization target
The normalization target data were collected from the MODIS 44C surface reflectance product. We used MODIS bands with a similar wavelength to the selected Landsat bands (see Table 1). MODIS surface reflectance and brightness temperature were scaled using the same coefficients as for the processed Landsat data. To ensure spatial consistency of reflectance target dataset, we used all 16-day global MODIS observation composites from the year 2000 to 2011. To reduce the effects of atmospheric contamination and surface anisotropy, only near-nadir clear-sky and low aerosol observations were retained in the time-series. From all selected observations, a single reflectance composite was created for each spectral band. We calculated the normalization target composite value as the mean reflectance of observations with normalized difference vegetation index (NDVI) above the 75% percentile.
B. Selection of pseudo-invariant target pixels
We define the pseudo-invariant target pixels as clear-sky land observations that represent the same land cover type and phenology stage in Landsat image and MODIS normalization target composite. To check for the land cover type and condition, we calculate the absolute difference between Landsat and MODIS spectral reflectance for red and shortwave infrared bands and select pixels with difference below 0.1 reflectance value for both spectral bands. Bright objects (with red band reflectance above 0.5) are excluded from the mask. Landsat images with less than 10,000 selected pseudo-invariant target pixels are discarded from the processing chain.
C. Model parametrization
To parametrize the reflectance normalization model, we calculate the median bias between MODIS and Landsat reflectance of pseudo-invariant pixels for each reflective band for each 10 km interval of distance from the Landsat ground track. The set of median values is used to parametrize a per-band linear regression model using least squares fitting method. For each image and each spectral band, we derive Gain and Bias coefficients to predict the reflectance bias as a function of the distance from the ground track:
[Reflectance Bias] = Gain * [Distance from ground track] + Bias
For images where only a small portion (less than 1/16 of the image) is covered by land, a mean reflectance bias is calculated instead (Gain = 0). Such conditions are usually found in coastal regions or over small islands. For the brightness temperature band, we calculate a single mean bias value for all pseudo-invariant target pixels within the image.
D. Model application
The model is then applied to all pixels within the image to estimate the reflectance (brightness temperature) bias. We subtract the estimated reflectance bias from the Landsat top-of-atmosphere reflectance for each spectral band and from brightness temperature values. The model has been shown to effectively normalize land surface reflectance and reduce effects of atmospheric scattering and surface anisotropy (Potapov et al., 2012)
4. Temporal integration into 16-day composites
The final step of Landsat ARD processing is temporal aggregation of individual images into 16-day composites. The range of dates for each composite is provided in the Table 3.
Table 3. 16-day composite intervals
Interval ID |
DOY start |
DOY end |
Date start |
Date end |
1 |
1 |
16 |
1-Jan |
16-Jan |
2 |
17 |
32 |
17-Jan |
1-Feb |
3 |
33 |
48 |
2-Feb |
17-Feb |
4 |
49 |
64 |
18-Feb |
4-Mar |
5 |
65 |
80 |
5-Mar |
20-Mar |
6 |
81 |
96 |
21-Mar |
5-Apr |
7 |
97 |
112 |
6-Apr |
21-Apr |
8 |
113 |
128 |
22-Apr |
7-May |
9 |
129 |
144 |
8-May |
23-May |
10 |
145 |
160 |
24-May |
8-Jun |
11 |
161 |
176 |
9-Jun |
24-Jun |
12 |
177 |
192 |
25-Jun |
10-Jul |
13 |
193 |
208 |
11-Jul |
26-Jul |
14 |
209 |
224 |
27-Jul |
11-Aug |
15 |
225 |
240 |
12-Aug |
27-Aug |
16 |
241 |
256 |
28-Aug |
12-Sep |
17 |
257 |
272 |
13-Sep |
28-Sep |
18 |
273 |
288 |
29-Sep |
14-Oct |
19 |
289 |
304 |
15-Oct |
30-Oct |
20 |
305 |
320 |
31-Oct |
15-Nov |
21 |
321 |
336 |
16-Nov |
1-Dec |
22 |
337 |
352 |
2-Dec |
17-Dec |
23 |
353 |
366 |
18-Dec |
31-Dec |
Temporal compositing is done per-pixel using all overlapping observations within the 16-day interval. From all available observations we retain the one with the highest observation quality. The proximity to clouds/shadows is used as a criterion to select the least contaminated observation. If several clear-sky observations were available, the per-band mean reflectance value is retained in the composite. Each 16-day composite consists of normalized surface reflectance for six spectral bands (blue, green, red, NIR, and two SWIR), normalized brightness temperature, and the quality flag as separate raster layers. The 16-day interval composites are stored in the geographic coordinates (+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs) as raster tiles of 1.0005 by 1.0005 degree size and pixel size of 0.00025 degree. The tile system features a 2-pixel overlap to simplify parallelization of the focal average computation. For detailed information about the Landsat ARD see Landsat ARD Structure section.
Important notes on data limitations:
- Images that were not processed to Tier 1 standard, as well as images with high cloud cover and Landsat 5 TM sensor issues were automatically detected and excluded from processing.
- Due to low sun azimuth and similarity between snow cover and clouds during winter season in temperate and boreal climates, the GLAD Landsat ARD algorithm is not suitable for winter time image processing above 30N and below 45S Latitude. Using the MODIS snow and ice product and NDVI time-series we defined dates with high probability of the seasonal snow cover for each WRS2 path/row. We are not processing images for these dates, and the resulting 16-day intervals have no data. Some of the images (and resulting 16-day composites) may still include snow-covered observations. We suggest to further filter such observations using “snow/ice” quality flag or by removing certain 16-day intervals from data processing.
- The GLAD team is committed to the annual ARD update. Due to the delay of Tier 1 data processing by USGS EROS (between 14 and 26 days after the data acquisition) and the time required to download and process new data by GLAD, the current data processing system is not designed for real-time ARD delivery.
- The users should be aware that the image normalization method is not design to deal with specular reflectance and thus introduces bias over the water surfaces.