NASA’s Science Discovery Engine uses GenAI to make science data easier to find, simplifying research amid vast data volumes.

NASA has unveiled its Science Discovery Engine (SDE), a GenAI-powered tool designed to streamline the search and discovery of its extensive science data. The initiative marks a significant advance in how researchers access and use data for scientific work.

The SDE, developed in collaboration with New York-based Sinequa, integrates advanced neural networks and GenAI technologies. Sinequa, initially known for its semantic search engine, now employs large language models (LLMs) and Microsoft’s Azure OpenAI Service to enhance its neural search capabilities. This combination allows for more contextual and relevant search results, making it easier for scientists to navigate NASA’s colossal data trove.

Kaylin Bugbee, a NASA data scientist, played a pivotal role in this project. With a background in data stewardship and involvement in significant initiatives like Data.gov and President Obama’s Climate Data Initiative, Bugbee emphasizes the importance of robust curation workflows in managing such vast quantities of data. Her team’s work over the past year to understand the information landscape has been crucial to bringing the project to fruition.

The SDE understands nearly 9,000 scientific terms, a number expected to grow as the GenAI learns and adapts. It supports natural language queries, allowing scientists to refine their searches and delve deeper into their research questions. This feature underscores the engine’s ability to return information quickly and in easily digestible formats, which is crucial for the fast-paced scientific community.