Tech News

Google’s new weather model impressed during its first hurricane season


The Atlantic hurricane season is drawing to a close, and with the tropics quieting down for a winter slumber, the focus of forecasters turns to evaluating what worked and what did not during the preceding season.

This year, the answers are clear. Although Google DeepMind’s Weather Lab only started releasing cyclone track forecasts in June, the company’s AI forecasting service performed exceptionally well. By contrast, the Global Forecast System model, which is operated by the US National Weather Service, is based on traditional physics, and runs on powerful supercomputers, performed abysmally.

The official data comparing forecast model performance will not be published by the National Hurricane Center for a few months. However, Brian McNoldy, a senior researcher at the University of Miami, has already done some preliminary number crunching.

The results are stunning:

2025 Atlantic season hurricane model performance on track accuracy. Credit: Brian McNoldy

A little help in reading the graphic is in order. This chart sums up the track forecast accuracy for all 13 named storms in the Atlantic Basin this season, measuring the mean position error at various hours in the forecast, from 0 to 120 hours (five days). On this chart, the lower a line is, the better a model has performed.
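For readers curious how a figure like "mean position error" can be computed, here is a minimal sketch in Python. It is not McNoldy's code or the National Hurricane Center's verification procedure; it assumes each sample is a forecast position paired with the storm's observed position at the same valid time, measures the great-circle distance between them with the haversine formula, and averages by forecast hour. The function names and the sample layout are hypothetical.

```python
import math
from collections import defaultdict

# Mean Earth radius expressed in nautical miles (about 6371 km / 1.852).
EARTH_RADIUS_NM = 3440.07


def great_circle_nm(lat1, lon1, lat2, lon2):
    """Haversine distance between two lat/lon points (degrees), in nautical miles."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_NM * math.asin(math.sqrt(a))


def mean_track_error(samples):
    """Average position error per forecast hour.

    samples: iterable of (lead_hour, fcst_lat, fcst_lon, obs_lat, obs_lon).
    This tuple layout is an assumption made for illustration.
    """
    errors = defaultdict(list)
    for lead, flat, flon, olat, olon in samples:
        errors[lead].append(great_circle_nm(flat, flon, olat, olon))
    # One mean per lead time, ordered from shortest to longest lead.
    return {lead: sum(errs) / len(errs) for lead, errs in sorted(errors.items())}
```

Running `mean_track_error` separately on each model's samples would give one value per forecast hour for each model, which is the kind of line the chart draws. A real verification would also need to make sure every model is scored on the same set of forecasts, which this sketch does not attempt.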

A new champion

The dotted black line shows the average forecast error for official forecasts from the 2022 to 2024 seasons. What jumps out is that the United States’ premier global model, the GFS (denoted here as AVNI), is by far the worst-performing model. Meanwhile, at the bottom of the chart, in maroon, is the Google DeepMind model (GDMI), performing the best at nearly all forecast hours.

The difference in errors between the US GFS model and Google’s DeepMind model is remarkable. At five days, the Google forecast had an error of 165 nautical miles compared to 360 nautical miles for the GFS model, more than twice as large. This is the kind of error that causes forecasters to completely disregard one model in favor of another.

But there’s more. Google’s model was so good that it regularly beat the official forecast from the National Hurricane Center (OFCL), which is produced by human experts looking at a broad array of model data. The AI-based model also beat highly regarded “consensus models,” including the TVCN and HCCA products. For more information on various models and their designations, see here.