Data Description: Extract Surface Water Features from Elevation Data

This challenge focuses on extracting surface water features (hydrography) from elevation data and quantifying the associated uncertainties. Participants will utilize several publicly available U.S. Geological Survey (USGS) and partner datasets, including the National Hydrography Dataset (NHD), 3D Elevation Program (3DEP) products, and high-resolution aerial imagery. These datasets will serve as inputs for both training and inference tasks. Below is an overview of the specific datasets, their sources, coverage areas, and data formats, along with guidance on selection and preparation.

Study Areas

The challenge targets HU12 watersheds in Alaska where prior experiments using U-Net models have demonstrated strong predictive performance. Participants must focus on these areas to ensure reproducibility and to facilitate comparison of results.

Selected HU12 Watersheds (HUC12 Codes):

190504010608, 190504011106, 190504011306, 190504011307, 190504011509, 190504011701,
190504011702, 190504011703, 190504011704, 190504011705, 190504011707, 190504011710  

Shapefiles of the watershed boundaries, flowlines, and waterbodies can be downloaded here.
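
A minimal sketch of restricting downloaded features to the selected watersheds is shown below. It assumes the features have been loaded as GeoJSON-like dictionaries and that the watershed code is stored in a property named `huc12`; that attribute name and the helper names are assumptions and may differ in the actual files.

```python
# HUC12 codes copied from the list above. They are kept as strings so that
# all 12 digits are preserved exactly.
SELECTED_HUC12 = {
    "190504010608", "190504011106", "190504011306", "190504011307",
    "190504011509", "190504011701", "190504011702", "190504011703",
    "190504011704", "190504011705", "190504011707", "190504011710",
}


def in_study_area(feature, key="huc12"):
    """Return True if a GeoJSON-like feature belongs to a selected watershed.

    `key` is the (assumed) name of the attribute holding the HUC12 code.
    """
    return str(feature["properties"].get(key, "")) in SELECTED_HUC12


def select_study_features(features, key="huc12"):
    """Keep only the features that fall in the selected HU12 watersheds."""
    return [f for f in features if in_study_area(f, key)]
```

Comparing the codes as strings avoids losing leading zeros, which occur in HUC codes for other regions.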


Reference Data

National Hydrography Dataset (NHD)

Fig. 1 The steps to select NHD in The National Map Downloader Tool

The National Hydrography Dataset (NHD) provides comprehensive digital spatial data of surface water features, including streams, rivers, lakes, and other waterbodies.

Source and Access: The National Map Downloader

  • From the interface, select the “Hydrography” data theme and specify the HU12 watersheds of interest.

Important Note:
Some areas in the NHD are classified as complex channel features with the feature code FCODE = 53700. These represent intricate channelized systems rather than discrete waterbodies. Participants should exclude these FCODE=53700 features from both training and prediction. The objective is to focus on well-defined hydrography features that can be represented as continuous water surfaces in raster and vector datasets.
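
The exclusion can be sketched as a simple attribute filter. The example below assumes GeoJSON-like feature dictionaries and looks the attribute up case-insensitively, because the exact field spelling (`FCODE`, `FCode`, `fcode`) is an assumption that depends on the download format. The function name is hypothetical.

```python
COMPLEX_CHANNEL_FCODE = 53700  # feature code named in the note above


def drop_complex_channels(features):
    """Return the features whose feature code is not 53700.

    The attribute is matched case-insensitively because its spelling
    varies between data formats (an assumption of this sketch).
    Features without a feature code are kept.
    """
    kept = []
    for feature in features:
        props = {k.lower(): v for k, v in feature["properties"].items()}
        value = props.get("fcode")
        if value is not None and int(value) == COMPLEX_CHANNEL_FCODE:
            continue  # skip complex channel features
        kept.append(feature)
    return kept
```

Applying the same filter to the reference data used for training and to the reference data used for scoring predictions keeps the two consistent.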

Format and Data Model:

  • NHD data are commonly provided as vector datasets (e.g., shapefiles, file geodatabases).
  • Participants may need to convert these vector features into raster masks or line/area representations to align them with the elevation and imagery datasets for training deep learning models.
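
To illustrate what the vector-to-raster conversion involves, the following sketch burns a single polygon ring into a binary mask by testing each cell center with the even-odd rule. It uses only NumPy and handles one ring without holes; the function name and the grid parameters are assumptions for illustration. In practice a library routine such as `rasterio.features.rasterize` would normally be used, and flowlines need a line rasterizer rather than this polygon test.

```python
import numpy as np


def rasterize_polygon(ring, origin_x, origin_y, pixel_size, n_rows, n_cols):
    """Burn one polygon ring into a 0/1 mask (even-odd rule on cell centers).

    ring: sequence of (x, y) vertices in map coordinates.
    origin_x, origin_y: map coordinates of the raster's upper-left corner.
    """
    # Map coordinates of every cell center (row 0 is the northernmost row).
    xs = origin_x + (np.arange(n_cols) + 0.5) * pixel_size
    ys = origin_y - (np.arange(n_rows) + 0.5) * pixel_size
    px, py = np.meshgrid(xs, ys)

    pts = np.asarray(ring, dtype=float)
    x0, y0 = pts[:, 0], pts[:, 1]
    x1, y1 = np.roll(x0, -1), np.roll(y0, -1)  # next vertex, wrapping around

    inside = np.zeros((n_rows, n_cols), dtype=bool)
    for xa, ya, xb, yb in zip(x0, y0, x1, y1):
        if ya == yb:
            continue  # horizontal edges never cross a horizontal ray
        # Cells whose eastward ray crosses this edge toggle their state.
        crosses = (ya > py) != (yb > py)
        x_at_y = xa + (py - ya) * (xb - xa) / (yb - ya)
        inside ^= crosses & (px < x_at_y)
    return inside.astype(np.uint8)
```

The grid origin, pixel size, and shape should be taken from the elevation raster so that the mask lines up with it cell for cell.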

Input Data Sources (Recommended)

1. 3DEP Elevation Data (IfSAR DEM/DSM)

Fig. 2 The steps to select the DEM and the IfSAR-derived ORI and DSM in The National Map Downloader Tool.

High-quality elevation data are fundamental for deriving hydrography. The challenge leverages digital elevation models (DEMs) and digital surface models (DSMs) derived from Interferometric Synthetic Aperture Radar (IfSAR) data to facilitate robust terrain-based hydrography extraction. IfSAR DEMs and DSMs are data products of the USGS 3D Elevation Program (3DEP).

3D Elevation Program (3DEP):

Description: A flagship elevation data source produced by USGS that provides high-resolution topography. It is typically LiDAR-based, with IfSAR used in Alaska.

Interferometric Synthetic Aperture Radar (IfSAR) Data:

Description: The IfSAR elevation data typically include a 5-meter resolution digital elevation model (DEM) and digital surface model (DSM) that cover large regions of Alaska. The bare-earth DEM represents the ground surface elevation by removing above-ground features, while the first-return DSM incorporates elevated features such as vegetation and human-made structures.

  • Source and Access: National Map Downloader
    • Under the elevation theme, select IfSAR DEM, DSM, and associated intensity data as needed.

Format and Data Model:

  • DEM/DSM products are typically delivered as raster (e.g., GeoTIFF).
  • Participants are encouraged to generate derivative products (e.g., slope, aspect, curvature, flow accumulation) to enhance model performance.
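
As one example of a derivative product, slope and aspect can be computed from a DEM array with finite differences. The sketch below assumes a north-up raster with square cells, defaults to the 5-meter spacing mentioned above, and uses a hypothetical function name. It is a simplified illustration rather than a prescribed method.

```python
import numpy as np


def slope_aspect(dem, pixel_size=5.0):
    """Return (slope, aspect) in degrees for a north-up DEM array.

    Slope is the angle of steepest ascent. Aspect is the compass bearing
    of the downslope direction, measured clockwise from north. Flat cells
    have no meaningful aspect; this sketch reports 0 for them.
    """
    dem = np.asarray(dem, dtype=float)
    # np.gradient differentiates along rows (axis 0) and columns (axis 1).
    dz_drow, dz_dcol = np.gradient(dem, pixel_size)
    dz_dx = dz_dcol    # positive toward the east
    dz_dy = -dz_drow   # row index grows southward, so flip for north

    slope_deg = np.degrees(np.arctan(np.hypot(dz_dx, dz_dy)))
    # Downslope vector is the negative gradient; bearing = atan2(east, north).
    aspect_deg = np.degrees(np.arctan2(-dz_dx, -dz_dy)) % 360.0
    return slope_deg, aspect_deg
```

For a surface that rises one meter per meter toward the east, this returns a slope of 45 degrees and an aspect of 270 degrees (facing west).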

 

2. High-Resolution Orthoimagery

Fig. 3 The steps to select the NAIP dataset in the EarthExplorer tool.

High-resolution aerial imagery, such as the National Agriculture Imagery Program (NAIP) dataset, can serve as valuable supplemental data for identifying and refining hydrographic features. These images may help reduce uncertainty by providing additional spectral and textural cues.

Source and Access: USGS EarthExplorer

  • Data Requirements:
    1. Users must create a free account.
    2. Search for available NAIP or similar high-resolution orthoimagery over the target HU12 watersheds.

Format and Data Model:

  • Imagery is typically delivered as georeferenced orthophotos (e.g., GeoTIFF).
  • Participants can select suitable bands (e.g., RGB, NIR) and spatial resolutions for their models.
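
One way the green and NIR bands might be combined into a water-sensitive input layer is the Normalized Difference Water Index, (green - NIR) / (green + NIR). The challenge does not prescribe this index; the sketch below is only an illustration, and it assumes the two bands have already been read into arrays of equal shape.

```python
import numpy as np


def ndwi(green, nir):
    """Normalized Difference Water Index: (green - nir) / (green + nir).

    Values near +1 suggest open water and negative values suggest land or
    vegetation. Cells where both bands are zero are returned as NaN.
    """
    green = np.asarray(green, dtype=float)
    nir = np.asarray(nir, dtype=float)
    total = green + nir
    out = np.full(total.shape, np.nan)
    # Divide only where the denominator is non-zero.
    np.divide(green - nir, total, out=out, where=total != 0)
    return out
```

Which band index holds green or NIR depends on the imagery product, so the band order should be checked in the file metadata before use.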

 

3. Intensity Data (IfSAR ORI)

Ortho-rectified radar intensity (ORI) data from IfSAR acquisitions provide an alternative spectral domain that can complement optical imagery and elevation products. The ORI data from the USGS 3DEP can highlight certain surface features and help distinguish water from non-water surfaces.

Source and Access: National Map Downloader

  • Included as part of IfSAR data downloads from the National Map (Fig. 2).

Format and Data Model:

  • Delivered as raster GeoTIFFs.

 

Use of Additional Publicly Available Data

Participants are permitted, and even encouraged, to use other publicly available datasets that may enhance their hydrography extraction models. Such data might include, for example, global water masks, remote sensing products from other governmental or research agencies, open aerial imagery archives, or ancillary environmental variables. However, any additional datasets must be:

  • Publicly Accessible: Ensure that all utilized datasets are freely available without subscription or licensing barriers.
  • Fully Documented: Provide clear documentation, including data source URLs, search parameters, and acquisition procedures.
  • Reproducible Processing: Describe all data pre-processing steps, including any reprojection, resampling, or format conversions required to align these additional datasets with the primary challenge data sources. This ensures transparency and reproducibility for all participants and evaluators.

Data Preparation Guidance

Derivatives and Pre-processing Steps:
Participants are expected to generate their own raster derivatives from the elevation data. Common steps include creating flow direction and accumulation rasters, slope/curvature layers, and masking out non-relevant areas. Similarly, vector-based hydrography data may need conversion to raster masks to align with elevation/imagery layers.
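
The flow direction and accumulation step can be sketched with the D8 scheme, in which each cell drains to its steepest downslope neighbor. The version below is a simplified illustration with a hypothetical function name: it does not fill depressions or resolve flat areas, which production workflows usually handle first, and its plain Python loops are too slow for full-size rasters.

```python
import numpy as np

# (d_row, d_col) offsets of the eight neighbors of a cell.
NEIGHBORS = [(-1, -1), (-1, 0), (-1, 1), (0, -1),
             (0, 1), (1, -1), (1, 0), (1, 1)]


def d8_flow_accumulation(dem, pixel_size=5.0):
    """Return (receiver, accumulation) for a DEM using D8 routing.

    receiver[r, c] holds the (row, col) of the cell that (r, c) drains to,
    or (-1, -1) if no neighbor is lower. accumulation counts the cells
    draining through each cell, including the cell itself.
    """
    dem = np.asarray(dem, dtype=float)
    n_rows, n_cols = dem.shape
    receiver = np.full((n_rows, n_cols, 2), -1, dtype=int)

    for r in range(n_rows):
        for c in range(n_cols):
            best_drop = 0.0
            for dr, dc in NEIGHBORS:
                rr, cc = r + dr, c + dc
                if 0 <= rr < n_rows and 0 <= cc < n_cols:
                    # Drop per unit distance (diagonals are longer).
                    dist = pixel_size * np.hypot(dr, dc)
                    drop = (dem[r, c] - dem[rr, cc]) / dist
                    if drop > best_drop:
                        best_drop = drop
                        receiver[r, c] = (rr, cc)

    accumulation = np.ones((n_rows, n_cols), dtype=int)
    # Visit cells from highest to lowest so that every donor is finished
    # before its receiver passes the total downstream.
    for flat in np.argsort(dem, axis=None)[::-1]:
        r, c = divmod(int(flat), n_cols)
        rr, cc = receiver[r, c]
        if rr >= 0:
            accumulation[rr, cc] += accumulation[r, c]
    return receiver, accumulation
```

Cells with high accumulation values trace likely channel locations, which is why this layer is a common input for hydrography extraction.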

Quality Control and Alignment:
Ensure proper spatial alignment, consistent coordinate reference systems, and appropriate temporal matches between datasets. Elevation and imagery data should be co-registered before training.
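
A basic alignment check might compare the grid definitions of two rasters before they are stacked. The sketch below uses a hypothetical `GridSpec` record whose fields would be filled from each raster's metadata. It assumes north-up rasters with square cells and covers spatial alignment only, not temporal matching.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GridSpec:
    """Hypothetical summary of a north-up raster grid with square cells."""
    crs: str           # coordinate reference system identifier
    origin_x: float    # x of the upper-left corner
    origin_y: float    # y of the upper-left corner
    pixel_size: float
    n_rows: int
    n_cols: int


def alignment_problems(a, b, tol=1e-6):
    """List the reasons two grids cannot be stacked cell for cell."""
    problems = []
    if a.crs != b.crs:
        problems.append("CRS differs")
    if abs(a.pixel_size - b.pixel_size) > tol:
        problems.append("pixel size differs")
    else:
        # Origins must be offset by a whole number of pixels.
        offsets = (("x", a.origin_x - b.origin_x),
                   ("y", a.origin_y - b.origin_y))
        for name, delta in offsets:
            shift = delta / a.pixel_size
            if abs(shift - round(shift)) > tol:
                problems.append(f"{name} origin is off-grid")
    if (a.n_rows, a.n_cols) != (b.n_rows, b.n_cols):
        problems.append("shape differs")
    return problems
```

An empty list means the two grids share a CRS, cell size, pixel grid, and shape. Any reported problem indicates that reprojection, resampling, or cropping is needed before training.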


Uncertainty Estimation

In addition to feature extraction, the challenge emphasizes quantifying uncertainty. Consider methods such as probabilistic modeling, confidence scoring, or ensemble approaches. Integrate these methods at both the data and model levels, taking into account the data resolution, sensor variability, and quality differences that may affect predictive confidence.
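
As an illustration of the ensemble approach, per-pixel uncertainty can be summarized from the water probabilities predicted by several models. The sketch below assumes each ensemble member outputs a probability map of the same shape. The function name and the choice of summary statistics are assumptions rather than requirements of the challenge.

```python
import numpy as np


def ensemble_uncertainty(member_probs, eps=1e-12):
    """Summarize an ensemble of per-pixel water probabilities.

    member_probs: array-like of shape (n_members, rows, cols), values in [0, 1].
    Returns (mean probability, standard deviation across members,
    binary entropy of the mean in bits).
    """
    probs = np.asarray(member_probs, dtype=float)
    mean_prob = probs.mean(axis=0)
    spread = probs.std(axis=0)  # disagreement between members

    # Entropy of the averaged prediction: 0 when certain, 1 bit at p = 0.5.
    p = np.clip(mean_prob, eps, 1.0 - eps)
    entropy = -(p * np.log2(p) + (1.0 - p) * np.log2(1.0 - p))
    return mean_prob, spread, entropy
```

The standard deviation highlights pixels where the models disagree, while the entropy highlights pixels where the averaged prediction is close to 0.5. The two maps can be reported alongside the extracted features.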
