Google unveils AI system that predicts flash floods using historical news reports
Google uses Gemini to analyse news reports and build an AI system designed to improve flash flood prediction worldwide.
Flash floods are among the most difficult natural disasters to forecast, often forming rapidly with little warning. Now, Google says it has developed a new artificial intelligence approach that could improve early predictions by analysing millions of historical news reports.
The technology, called Groundsource, uses the company’s large language model Gemini to extract flood-related information from global news coverage. According to researchers, the system converts unstructured text reports into a structured dataset for training flood prediction models.
Flash flood forecasting traditionally relies on historical data and specialised monitoring equipment. In many parts of the world, however, such records or infrastructure are limited. Google said its new method aims to address this gap by using publicly available information accumulated over decades of reporting.
Flash flood prediction models need historical data and model training that often doesn't exist. Our solution: Groundsource, a new AI-powered methodology that uses Gemini to transform 5M+ global reports into a precise dataset of 2.6M+ flood events.
— Google Research (@GoogleResearch) March 12, 2026
Turning global news coverage into flood data
To build the dataset, Google researchers used Gemini to process more than five million news articles from around the world. The model identified reports related to flooding and extracted key details, including location and timing.
These reports were then transformed into a geo-tagged timeline of events. In total, the system produced a structured dataset containing more than 2.6 million flood incidents. Researchers say this provides a much larger historical record than many existing datasets.
Google described the process as a new AI-powered methodology that converts unstructured written reports into usable scientific data. The company believes the scale of the dataset could significantly improve flood prediction models, particularly in regions where traditional data sources are scarce.
“We’re aggregating millions of reports,” said Juliet Rothenberg, programme manager on Google’s Resilience team. “It enables us to extrapolate to other regions where there isn’t as much information.”
Once the dataset was assembled, researchers trained a predictive model that analysed current weather forecasts alongside historical data. By comparing forecast conditions with patterns from past events, the system estimates the likelihood of flash flooding in a particular area.
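Google has not detailed the predictive model itself, but the basic idea — scoring a forecast against conditions that preceded past flood events in the same region — can be illustrated with a deliberately crude toy. The rainfall figures and the simple threshold-fraction estimate below are invented for illustration; a real system would train a model over many more features:

```python
# Hypothetical rainfall totals (mm) recorded at past flood events in one
# region, as might be derived from a news-based historical dataset.
historical_flood_rain_mm = [80, 95, 110, 120, 150]


def flood_likelihood(forecast_rain_mm: float) -> float:
    """Toy estimate: the fraction of past flood events in this region
    that occurred at or below the forecast rainfall total."""
    triggered = sum(1 for r in historical_flood_rain_mm if r <= forecast_rain_mm)
    return triggered / len(historical_flood_rain_mm)


print(flood_likelihood(100.0))  # 2 of 5 past floods occurred at <= 100 mm
```

The point of the sketch is the shape of the comparison, not the statistic: forecast conditions are judged against the historical event record rather than against live sensor data.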
Early use through Google’s Flood Hub platform
Google has begun applying the Groundsource dataset within its existing flood monitoring platform, Flood Hub. The service currently highlights potential flood risks in urban areas across 150 countries.
Through the platform, emergency response agencies and local authorities can access risk alerts generated by the model. Google says the data is also being shared directly with disaster response organisations in affected regions.
Although the company has not yet published detailed accuracy results, early feedback from trial users has been positive. One organisation involved in testing said the system helped it respond more quickly to localised weather events.
The initiative is the first time Google has used a large language model for weather forecasting, although it is not the company's first application of artificial intelligence to meteorology: earlier systems include WeatherNext 2, developed by Google DeepMind.
WeatherNext 2 focuses on numerical weather prediction and has shown strong forecasting accuracy in earlier evaluations. The new Groundsource project extends Google’s AI efforts into disaster prediction by combining language analysis with environmental modelling.
Limitations and potential future applications
Despite the potential benefits, the Groundsource system currently has several technical limitations. The model can identify flood risk only within an area of around 20 square kilometres, meaning it may not capture highly localised conditions.
It is also less precise than some established forecasting systems. For example, the system used by the US National Weather Service integrates local radar data that tracks rainfall in real time. Google’s approach does not currently include this information.
Local radar networks are often unavailable in developing regions, which is partly why Google designed the system to work without them. By relying on widely available weather forecasts and historical reporting, the company hopes the technology can be deployed in areas that lack advanced meteorological infrastructure.
Researchers say the underlying methodology could also extend beyond flood forecasting. Because the system converts written reports into structured environmental data, it may eventually be adapted to study other natural hazards.
Rothenberg said the approach could help forecast events such as heatwaves or mudslides, both of which are difficult to predict using traditional datasets alone.
The project also suggests how language models can contribute to scientific research beyond text generation: by mining large historical archives, systems like Gemini may offer new ways to build datasets and improve predictive models for natural disasters.