
Expanded AI Model with Global Data Enhances Earth Science Applications 

A false-color image of the East Peak fire burning in southern Colorado near Trinidad, captured by the Operational Land Imager on the Landsat 8 satellite. Dark green forests and light green grasslands cover most of the image, but a red patch in the middle represents a burn scar, and some orange spots around it represent actively burning areas.
On June 22, 2013, the Operational Land Imager (OLI) on Landsat 8 captured this false-color image of the East Peak fire burning in southern Colorado near Trinidad. Burned areas appear dark red, while actively burning areas look orange. Dark green areas are forests; light green areas are grasslands. Data from Landsat 8 were used to train the Prithvi artificial intelligence model, which can help detect burn scars.
NASA Earth Observatory

NASA, IBM, and Forschungszentrum Jülich have released an expanded version of the open-source Prithvi Geospatial artificial intelligence (AI) foundation model to support a broader range of geographical applications. Now, with the inclusion of global data, the foundation model can support tracking changes in land use, monitoring disasters, and predicting crop yields worldwide. 

The Prithvi Geospatial foundation model, first released in August 2023 by NASA and IBM, is pre-trained on NASA's Harmonized Landsat and Sentinel-2 (HLS) dataset and learns by filling in masked information. The model is available on Hugging Face, a data science platform where machine learning developers openly build, train, deploy, and share models. Because NASA releases data, products, and research in the open, businesses and commercial entities can take these models and transform them into marketable products and services that generate economic value. 
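To make "learns by filling in masked information" concrete, here is a minimal sketch of the general masked-reconstruction idea. It is not the Prithvi training code. The 75 percent mask ratio, the tiny numeric "patches," and the `mean_fill` stand-in for a model are all assumptions chosen for illustration.

```python
import random


def mask_patches(patches, mask_ratio, rng):
    """Hide a random subset of patches; return the visible view and hidden indices."""
    n_hidden = int(len(patches) * mask_ratio)
    hidden = sorted(rng.sample(range(len(patches)), n_hidden))
    hidden_set = set(hidden)
    visible = [None if i in hidden_set else p for i, p in enumerate(patches)]
    return visible, hidden


def masked_mse(original, reconstructed, hidden):
    """Mean squared error computed over the hidden patches only."""
    total, count = 0.0, 0
    for i in hidden:
        for a, b in zip(original[i], reconstructed[i]):
            total += (a - b) ** 2
            count += 1
    return total / count if count else 0.0


def mean_fill(visible, patch_len):
    """Hypothetical stand-in for a model: fill hidden patches with the visible mean."""
    values = [v for p in visible if p is not None for v in p]
    mean = sum(values) / len(values) if values else 0.0
    return [p if p is not None else [mean] * patch_len for p in visible]


rng = random.Random(0)
# Eight toy "patches" of four values each (assumed data, not real HLS pixels).
patches = [[float(i + j) for j in range(4)] for i in range(8)]
visible, hidden = mask_patches(patches, mask_ratio=0.75, rng=rng)  # ratio is an assumption
reconstructed = mean_fill(visible, patch_len=4)
loss = masked_mse(patches, reconstructed, hidden)
print(f"hidden patches: {hidden}, reconstruction error: {loss:.3f}")
```

During pre-training, a real model would replace `mean_fill` and be updated to drive this error down, which requires no human-made labels.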

“We’re excited about the downstream applications that are made possible with the addition of global HLS data to the Prithvi Geospatial foundation model. We’ve embedded NASA’s scientific expertise directly into these foundation models, enabling them to quickly translate petabytes of data into actionable insights,” said Kevin Murphy, NASA chief science data officer. “It’s like having a powerful assistant that leverages NASA’s knowledge to help make faster, more informed decisions, leading to economic and societal benefits.”

AI foundation models are pre-trained on large datasets with self-supervised learning techniques, providing flexible base models that can be fine-tuned for domain-specific downstream tasks.

Three images show the process of crop classification with NASA and IBM’s open-source Prithvi Geospatial artificial intelligence model. The first shows a true color composite image of a cropland area. The second shows a Ground Truth Mask of the types of land cover in the image — natural vegetation is colored red, forest is orange, corn is yellow, soybeans are light green, wetlands are mid-green, developed or barren land is darker green, open water is light blue, winter wheat is mid-blue, alfalfa is dark blue, fallow/idle cropland is dark purple, cotton is pink, and sorghum is dark pink. The third image shows a Predicted Mask that closely matches the Ground Truth Mask.
Crop classification prediction generated by NASA and IBM’s open-source Prithvi Geospatial artificial intelligence model.

To cover diverse land uses and ecosystems, researchers selected HLS satellite images that represented various landscapes while avoiding lower-quality data caused by clouds or gaps. Urban areas were emphasized to ensure better coverage, and strict quality controls were applied to create a large, well-balanced dataset. The final dataset is significantly larger than previous versions, offering improved global representation and reliability for model training and environmental analysis.
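The selection steps described above (screening out cloudy or gappy scenes, then balancing across landscapes with extra weight on urban areas) can be sketched roughly as follows. This is an illustration, not the researchers' actual pipeline. The `Tile` fields, the thresholds, and the per-class quotas are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Tile:
    tile_id: str
    land_cover: str          # hypothetical landscape label
    cloud_fraction: float    # share of pixels flagged as cloud
    missing_fraction: float  # share of pixels with no data


def select_tiles(tiles, max_cloud, max_missing, quotas, default_quota):
    """Drop low-quality tiles, then keep the cleanest few per land-cover class."""
    by_class = defaultdict(list)
    for tile in tiles:
        if tile.cloud_fraction <= max_cloud and tile.missing_fraction <= max_missing:
            by_class[tile.land_cover].append(tile)

    selected = []
    for land_cover in sorted(by_class):
        cleanest = sorted(
            by_class[land_cover],
            key=lambda t: (t.cloud_fraction, t.missing_fraction, t.tile_id),
        )
        selected.extend(cleanest[: quotas.get(land_cover, default_quota)])
    return selected


tiles = [
    Tile("a", "forest", 0.05, 0.00),
    Tile("b", "forest", 0.40, 0.00),    # rejected: too cloudy
    Tile("c", "urban", 0.10, 0.01),
    Tile("d", "urban", 0.02, 0.00),
    Tile("e", "urban", 0.08, 0.00),
    Tile("f", "cropland", 0.01, 0.30),  # rejected: too many gaps
    Tile("g", "forest", 0.01, 0.00),
]
# Assumed thresholds; the larger urban quota mimics the emphasis on urban areas.
selected = select_tiles(
    tiles, max_cloud=0.2, max_missing=0.1, quotas={"urban": 2}, default_quota=1
)
print([t.tile_id for t in selected])  # ['g', 'd', 'e']
```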

The Prithvi Geospatial foundation model has already proven valuable in several applications, including post-disaster flood mapping and detecting burn scars caused by fires.

One application, Multi-Temporal Cloud Gap Imputation, uses the foundation model to reconstruct gaps in satellite imagery caused by cloud cover, enabling a clearer view of Earth's surface over time. This approach supports uses such as environmental monitoring and agricultural planning.

Another application, Multi-Temporal Crop Segmentation, uses satellite imagery to classify and map different crop types and land cover across the United States. By analyzing time-sequenced data and layering the U.S. Department of Agriculture’s Crop Data, Prithvi Geospatial can accurately identify crop patterns, which in turn could improve agricultural monitoring and resource management on a large scale.

The flood mapping dataset labels flood water and permanent water across diverse biomes and ecosystems. Models trained on it learn to detect surface water, which supports flood management.

Wildfire scar mapping combines satellite imagery with wildfire data to capture detailed views of wildfire scars shortly after fires occurred. This approach provides valuable data for training models to map fire-affected areas, aiding in wildfire management and recovery efforts.

Three images show the process of burn scar mapping with NASA and IBM’s open-source Prithvi Geospatial artificial intelligence model. The first shows a true color composite satellite image, which contains shades of green and purple. The second shows a Ground Truth Mask, which shows the true extent of a burn scar on the land by blacking out the area around the burn scar. The third shows a Predicted Mask, which almost exactly matches the Ground Truth Mask.
Burn scar mapping generated by NASA and IBM’s open-source Prithvi Geospatial artificial intelligence model.
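How closely a predicted mask matches a ground truth mask is commonly summarized with intersection over union (IoU). The article does not say which metric was used to evaluate Prithvi Geospatial, so the sketch below only illustrates that common measure on a small made-up grid, where 1 marks a burn scar pixel.

```python
def iou(truth, predicted, label=1):
    """Intersection over union for one class across two equally sized 2D masks."""
    intersection = 0
    union = 0
    for truth_row, predicted_row in zip(truth, predicted):
        for t, p in zip(truth_row, predicted_row):
            in_truth = t == label
            in_predicted = p == label
            intersection += in_truth and in_predicted
            union += in_truth or in_predicted
    # If the class appears in neither mask, treat the masks as in full agreement.
    return intersection / union if union else 1.0


# Toy 3x3 masks (assumed data): the prediction misses one burn scar pixel.
ground_truth = [
    [0, 1, 1],
    [0, 1, 1],
    [0, 0, 0],
]
predicted = [
    [0, 1, 1],
    [0, 1, 0],
    [0, 0, 0],
]
print(iou(ground_truth, predicted))  # 0.75
```

A score of 1.0 means the two masks agree on every pixel of that class, and a score of 0.0 means they do not overlap at all.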

This model has also been tested with additional downstream applications, including gross primary productivity estimation, aboveground biomass estimation, landslide detection, and burn intensity estimation.

“The updates to this Prithvi Geospatial model have been driven by valuable feedback from users of the initial version,” said Rahul Ramachandran, AI foundation model for science lead and senior data science strategist at NASA’s Marshall Space Flight Center in Huntsville, Alabama. “This enhanced model has also undergone rigorous testing across a broader range of downstream use cases, ensuring improved versatility and performance, resulting in a version of the model that will empower diverse environmental monitoring applications, delivering significant societal benefits.”

The Prithvi Geospatial foundation model was developed as part of an initiative of NASA’s Office of the Chief Science Data Officer to unlock the value of NASA’s vast collection of science data using AI. NASA’s Interagency Implementation and Advanced Concepts Team (IMPACT), based at Marshall; IBM Research; and the Jülich Supercomputing Centre at Forschungszentrum Jülich designed the foundation model on the supercomputer Jülich Wizard for European Leadership Science (JUWELS), operated by the Jülich Supercomputing Centre. This collaboration was facilitated by the IEEE Geoscience and Remote Sensing Society.

For more information about NASA’s strategy of developing foundation models for science, visit https://science.nasa.gov/artificial-intelligence-science.