Summer School 2025 Projects: I-GUIDE

Project Description: India faces escalating risks from hazards such as heat waves, drought, flooding, air pollution, and power outages, threats that disproportionately affect vulnerable communities across its diverse and densely populated landscape. Despite growing exposure, little is known about how these risks are perceived at the individual or district level. This project will use artificial intelligence and machine learning to build the first individual-level models of public risk perceptions in India. These models will integrate national survey data with climate exposure and socio-demographic variables to predict perceptions of various hazard- and weather-related threats. Ultimately, the results will feed into multilevel regression with poststratification (MRP) models to produce district-level maps of risk perception across the country. During the project, students will focus on the first phase: developing and validating machine learning models of individual-level risk perception and quantifying their uncertainty. Their work will lay the foundation for broader efforts to understand spatial variation in concern about these hazards and to inform more targeted communication and policy responses.
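
The poststratification step named in the description can be illustrated with a small arithmetic sketch. It assumes a model has already produced a predicted share of concerned respondents for each demographic cell, and that census counts for the same cells are available for one district; the district estimate is then the population-weighted mean of the cell predictions. The cell definitions, probabilities, and counts below are hypothetical and are not project data.

```python
# Hypothetical model output: predicted share of people in each
# (age group, gender) cell who are concerned about heat waves.
cell_predictions = {
    ("18-34", "female"): 0.62,
    ("18-34", "male"): 0.55,
    ("35+", "female"): 0.70,
    ("35+", "male"): 0.64,
}

# Hypothetical census counts for the same cells in a single district.
cell_counts = {
    ("18-34", "female"): 1000,
    ("18-34", "male"): 1000,
    ("35+", "female"): 1500,
    ("35+", "male"): 1500,
}


def poststratify(predictions, counts):
    """Return the population-weighted mean of the cell-level predictions."""
    total = sum(counts.values())
    weighted = sum(predictions[cell] * counts[cell] for cell in counts)
    return weighted / total


district_estimate = poststratify(cell_predictions, cell_counts)
print(f"District-level estimate: {district_estimate:.3f}")
```

Repeating the same weighting for every district, each with its own census counts, is what would turn individual-level predictions into a district-level map.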

Data Likely to be Used:

Perceived Risk Data: Yale Program on Climate Change Communication (2023-2024), county-level estimates of wildfire concern.
India Meteorological Department (IMD) – Gridded Temperature Data
ERA5 Reanalysis (ECMWF via Copernicus)
MODIS Land Surface Temperature (LST)
Indian Institute of Tropical Meteorology (IITM) – Climate Hazard Indices
Standardized Precipitation Index (SPI) – IMD
CHIRPS (Climate Hazards Group InfraRed Precipitation with Station Data)
MODIS NDVI / Vegetation Health Index (VHI)
Soil Moisture Active Passive (SMAP) or ESA CCI Soil Moisture

Technologies Likely to be Used:

Jupyter Notebooks (via the I-GUIDE Platform)
Python
Pandas, NumPy, SciPy, Scikit-learn, Matplotlib, Seaborn, GeoPandas
QGIS or ArcGIS for advanced spatial analysis and mapping
Statistical software or Python libraries for regression analysis, hypothesis testing, and model validation

Team Leader: Jennifer Marlon, Executive Director of the Yale Center for Geospatial Solutions and Senior Research Scientist, Yale University

Keywords: wildfires, burn areas, GeoAI, Pantanal, remote sensing
Project Description: This project focuses on understanding and predicting wildfire risk in Brazil’s Pantanal wetlands, which in 2024 experienced the most intense early-season fires on record. Students will address two key questions: (1) How do land use changes and environmental extremes (e.g., drought, wind, temperature) influence the spatial extent of burn areas? (2) Can spatiotemporal data integration and GeoAI methods improve early-stage wildfire risk mapping? During the week, students will:

Visualize and quantify recent burn areas using satellite data;
Explore correlations between burn areas and environmental conditions using spatial overlays and summary statistics;
Use Jupyter Notebooks to prototype a spatiotemporal model (e.g., threshold-based or regression-based) linking fire severity to factors like Daily Severity Rating (DSR), drought index, and land cover change;
Generate maps and interpret spatial patterns to support risk analysis.
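
The regression-based option in the list above could be prototyped along the following lines. This is a minimal sketch on synthetic data, assuming one row per grid cell with a Daily Severity Rating, a drought index, and a burned-area value; the variable names, value ranges, and coefficients are illustrative assumptions, and an ordinary least-squares fit stands in for whatever model the team actually chooses.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200  # number of synthetic grid cells

# Synthetic predictors (ranges are assumptions for illustration only).
dsr = rng.uniform(0, 30, n)      # Daily Severity Rating
drought = rng.uniform(-3, 3, n)  # drought index, negative = drier

# Synthetic response: this relationship exists only to generate example data.
burned_area = 2.0 + 0.8 * dsr - 1.5 * drought + rng.normal(0, 1.0, n)

# Design matrix with an intercept column, then an ordinary least-squares fit.
X = np.column_stack([np.ones(n), dsr, drought])
coef, *_ = np.linalg.lstsq(X, burned_area, rcond=None)

# Coefficient of determination on the fitted data.
pred = X @ coef
ss_res = np.sum((burned_area - pred) ** 2)
ss_tot = np.sum((burned_area - burned_area.mean()) ** 2)
r2 = 1 - ss_res / ss_tot

print("intercept, DSR slope, drought slope:", np.round(coef, 2))
print("R^2:", round(float(r2), 3))
```

With real data, the rows would come from overlaying burned-area rasters with the environmental layers, and spatial autocorrelation between neighboring cells would need to be considered before interpreting the coefficients.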

Data Likely to be Used:

MODIS and VIIRS burned area products (NASA FIRMS)
MapBiomas Amazonia Collection 6 (land use/land cover)
Global Forest Watch/Hansen et al. (forest cover change)

Technologies Likely to be Used:

Jupyter Notebooks (via the I-GUIDE Platform)
Python
pandas, geopandas, matplotlib, rioxarray, earthpy
Google Earth Engine (via Python API)

Project Description: Students will develop DeepEarth into a geospatial AI model for fire ecology simulation. The team will build on open-source APIs for self-supervised deep learning on multi-modal ecological datasets across space and time. Their goals will be to (a) statistically and generatively reconstruct landscape wildfires through species-aware and ecology-aware deep neural networks, and/or (b) train AI models to predict live fuel moisture content across many plant species, given diverse geospatial, temporal, and ecological constraints. Following a crash course in AI coding tools and all core techniques, team members will divide and conquer in exploring datasets, engineering training pipelines, and performing machine learning experiments, with the goal of jointly producing a rich table of ablation tests that demonstrate the predictive utility of each data source studied. The team will then seek to fuse their work into one DeepEarth fire ecology simulator to support future use by firewise landscape design and management professionals.

Data Likely to be Used:

Pre/post fire aerial LiDAR and visible imaging (e.g. OpenTopography, LA County Recovery)
Live fuel moisture content measurements of plant species (e.g. Globe-LFMC 2.0)
Hourly weather sensing & modeling (e.g. NASA/NOAA NLDAS)
Soil chemistry laboratory surveys (e.g. USDA NRCS Soil Survey)
Biodiversity species occurrence records (e.g. Global Biodiversity Information Facility)

Technologies Likely to be Used:

Jupyter Notebooks (via the I-GUIDE Platform)
Python
PyTorch
AI coding tools (e.g., Claude Code)

Team Leader: Lance Legel, CEO of ecodash.ai

Project Description: Extreme heat, grimly referred to as ‘the silent killer,’ is the leading cause of weather-related fatalities in the United States, and it is slowly being recognized as a major threat to human life. Hot weather also contributes to unhealthy air quality. Extreme heat events and the associated air pollution episodes are steadily increasing and are expected to continue to rise. Extreme heat not only poses a risk to human life but also threatens to stymie efforts to reduce greenhouse gas (GHG) emissions and manage congestion. Shifting trips away from private vehicles and toward alternative, cleaner modes of transportation can be a key mechanism for reducing emissions and congestion from passenger vehicles. However, during heatwaves, people walking and biking are most at risk because of their high heat exposure, so many opt to travel by car (contributing to more GHG emissions and congestion) or forgo these trips (leading to unfulfilled mobility needs and, consequently, lower quality of life and well-being). Providing safe and accessible ways for residents to use multimodal transportation generates multiple co-benefits, including access to employment and recreation, improved health outcomes, reduced GHG emissions from transportation, and increased transportation resilience in the event of an emergency.
The NO-HEAT project aims to support multimodal transportation by making it safer and more comfortable for residents to walk and bike in their communities. We will work to improve heat resilience capacity and enhance climate justice by creating open-source data and decision-support tools that can be used to identify high-risk areas and communities and enable planners and policy-makers to proactively plan for extreme heat mitigation.
This project brings together state-of-the-art urban microclimate modeling and innovative, link-level active transportation modeling to create high-resolution heat risk datasets and examine disparities in heat exposure that can help inform targeted heat mitigation interventions.

Data Likely to be Used:

Meteorological data from ERA5
LiDAR data from USGS
Regional Land Cover data from NOAA
Building data from Overture Maps
Mobility data from Replica

Technologies Likely to be Used:

Jupyter Notebooks (via the I-GUIDE Platform)
Python
PyQGIS, qgis.core, gdal, geopandas, shapely, laspy, pandas, rasterstats, rasterio, numpy, scipy, pdal, whitebox_workflows, multiprocessing
QGIS and its UMEP plugin

Team Leader: Rounaq Basu, Assistant Professor at the Georgia Institute of Technology

Keywords: flood prediction, geospatial data integration, machine learning, multimodal deep learning, data-driven modeling

Project Description: Flooding is an increasingly destructive hazard globally, driven by the dual pressures of climate change and rapid urbanization. Timely and accurate flood prediction is essential for issuing early warnings, guiding infrastructure planning, and supporting emergency response—ultimately saving lives and reducing economic losses. Traditional flood forecasting models, such as hydrological and hydrodynamic models, are physically interpretable and capable of simulating a variety of scenarios with different spatial and temporal resolutions. However, these models typically require diverse and high-quality input data, are computationally intensive, and lack transferability across regions. In contrast, machine learning offers a data-driven approach that can learn complex spatiotemporal patterns from historical data without the need for explicit physical modeling. It enables fast, flexible, and scalable predictions using trained models and supports integration of various types of input data.

This project will explore the development of an advanced flood prediction framework using machine learning—particularly multimodal deep learning—to fuse multiple publicly available geospatial datasets. Students will work on training models that integrate remote sensing, environmental, and ground-based data. They will conduct ablation studies to assess how each data modality contributes to overall model performance and to better understand inter-modal synergies. Throughout the week, students will:

Preprocess and harmonize different geospatial datasets
Train and test ML models (e.g., CNNs, transformers)
Evaluate model performance through metrics and visualization
Analyze the impact of each modality on prediction accuracy
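
The ablation step in the list above can be sketched as a leave-one-modality-out loop. The example below is a toy illustration under stated assumptions: the three modalities, their feature counts, and the synthetic target are invented, and a linear least-squares model stands in for the CNN or transformer models the project would actually train. Only the structure of the experiment (fit with all modalities, then refit with each one removed and compare held-out error) is the point.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 400  # number of synthetic samples

# Hypothetical modalities, each a block of feature columns.
modalities = {
    "rainfall": rng.normal(size=(n, 2)),
    "elevation": rng.normal(size=(n, 1)),
    "soil_moisture": rng.normal(size=(n, 1)),
}

# Synthetic flood-depth target, used only to generate example data.
y = (
    3.0 * modalities["rainfall"][:, 0]
    + 1.0 * modalities["rainfall"][:, 1]
    - 2.0 * modalities["elevation"][:, 0]
    + 0.5 * modalities["soil_moisture"][:, 0]
    + rng.normal(0, 0.5, n)
)

# Simple holdout split: first 300 samples train, last 100 test.
train = np.arange(n) < 300
test = ~train


def fit_and_score(names):
    """Fit least squares on the named modalities and return the test RMSE."""
    features = np.hstack([modalities[name] for name in names])
    X = np.column_stack([np.ones(n), features])
    coef, *_ = np.linalg.lstsq(X[train], y[train], rcond=None)
    residuals = y[test] - X[test] @ coef
    return float(np.sqrt(np.mean(residuals ** 2)))


all_names = list(modalities)
results = {"all": fit_and_score(all_names)}
for dropped in all_names:
    kept = [name for name in all_names if name != dropped]
    results[f"without {dropped}"] = fit_and_score(kept)

for label, rmse in results.items():
    print(f"{label:>22}: test RMSE = {rmse:.3f}")
```

A larger increase in error when a modality is removed suggests that modality contributes more to this particular model. Leave-one-out ablation does not by itself reveal synergies between modalities; testing pairs or nested subsets would be one way to probe those.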

Data Likely to be Used:

High Water Marks from USGS: Maximum water levels during past flood events
Rainfall Data from the CHIRPS dataset
Groundwater Levels from USGS NWIS
Climate Data from U.S. Hourly Climate Normals (NCEI)
Satellite Imagery: Sentinel-1 (SAR), Sentinel-2 (optical), SMAP (soil moisture)
Digital Elevation Model: Copernicus DEM (GLO-30)

Technologies Likely to be Used:

Jupyter Notebooks (via the I-GUIDE Platform)
PyTorch, torchvision, transformers (deep learning)
OpenCV, Pillow, GDAL, pandas, scikit-learn, NumPy (data processing and analysis)

Team Leader: Wen Zhou, Postdoctoral Research Associate, University of Illinois Urbana-Champaign

Summer School 2025: Spatial AI for Extreme Events and Disaster Resilience

UCAR Campus in Boulder, Colorado · August 4-8, 2025

Bridging the Risk Perception Gaps around Hazards

GeoAI for Burn Area Analysis and Fire Risk Mapping in the Pantanal Wetlands

Geospatial Simulation of Fire Ecology with DeepEarth

Neutralizing Onerous Heat Effects on Active Transportation (NO-HEAT)

Predicting and Mapping Floods through Geospatial Data Fusion and Machine Learning