
Author Correction: Foundation model of neural activity predicts response to new stimulus types

Why This Matters

This correction clarifies key architectural details of the neural response models described in the original paper, supporting transparency and reproducibility. Accurate documentation of model configurations matters for anyone reimplementing or building on these architectures, and corrections like this one keep the published record aligned with the code that was actually run.

Key Takeaways

Correction to: Nature https://doi.org/10.1038/s41586-025-08829-y, published online 9 April 2025

To ensure accurate documentation of the implemented models, we clarify several architectural details in the Methods describing the Conv-LSTM and CvT-LSTM architectures. These clarifications are limited to the Methods description and do not affect the results or conclusions.

Perspective module: The Methods state that the pupil-position multilayer perceptron (MLP) uses an 8-dimensional hidden representation; however, in the implemented CvT-LSTM models, this module uses a 16-dimensional hidden representation.
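A minimal sketch of what the clarified configuration implies, assuming a small two-layer MLP; the layer count, nonlinearity, and input/output dimensions are illustrative assumptions, not taken from the paper:

```python
import torch
import torch.nn as nn

class PerspectiveMLP(nn.Module):
    """Hypothetical pupil-position MLP. Per the correction, the implemented
    CvT-LSTM models use a 16-dimensional hidden representation (not 8)."""

    def __init__(self, in_dim: int = 2, hidden_dim: int = 16, out_dim: int = 2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden_dim),  # 16-dim hidden layer (CvT-LSTM)
            nn.Tanh(),                      # assumed nonlinearity
            nn.Linear(hidden_dim, out_dim),
        )

    def forward(self, pupil_xy: torch.Tensor) -> torch.Tensor:
        return self.net(pupil_xy)

mlp = PerspectiveMLP()
out = mlp(torch.zeros(4, 2))  # batch of 4 pupil positions -> shape (4, 2)
```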

Four-head ensemble: The Methods did not specify that the architecture used for the analyses is implemented as a four-head ensemble. In the implemented model, the modulation, core, and readout modules are independently parameterized across four heads (with shared perspective transform and readout grid), and predictions are obtained by averaging standardized log-responses across heads.
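The averaging step can be sketched as follows. This is an illustrative interpretation: the exact standardization statistics (here, per-head mean and standard deviation of the log-responses) are assumptions, not taken from the paper:

```python
import numpy as np

def ensemble_predict(head_responses: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Average standardized log-responses across ensemble heads.

    head_responses: non-negative predictions, shape (n_heads, n_neurons).
    Standardization scheme is a hypothetical choice for illustration.
    """
    logs = np.log(head_responses + eps)            # log-responses per head
    mu = logs.mean(axis=1, keepdims=True)          # per-head mean
    sd = logs.std(axis=1, keepdims=True) + eps     # per-head std
    standardized = (logs - mu) / sd                # standardize within each head
    return standardized.mean(axis=0)               # average across the 4 heads

preds = ensemble_predict(np.ones((4, 10)))  # 4 heads, 10 neurons
```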

Modulation module: The Methods state that the modulation network receives three behavioural inputs (treadmill velocity, pupil radius, and the derivative of pupil radius); however, in the implemented CvT-LSTM models, only treadmill velocity and pupil radius are used. In addition, the Methods describe the LSTM hidden and cell states as 8-dimensional; in the implemented models these states are 6-dimensional in the Conv-LSTM variant and 16-dimensional in the CvT-LSTM variant.
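Putting the clarified input and state sizes together, a sketch of the modulation LSTM might look like the following; treating it as a plain `nn.LSTM` over a behaviour time series is an assumption for illustration:

```python
import torch
import torch.nn as nn

def make_modulation_lstm(variant: str) -> nn.LSTM:
    """Per the correction: two behavioural inputs (treadmill velocity and
    pupil radius), with 6-dim states in the Conv-LSTM variant and 16-dim
    states in the CvT-LSTM variant."""
    hidden = {"conv_lstm": 6, "cvt_lstm": 16}[variant]
    return nn.LSTM(input_size=2, hidden_size=hidden, batch_first=True)

lstm = make_modulation_lstm("cvt_lstm")
behaviour = torch.zeros(1, 100, 2)  # (batch, time, [velocity, pupil radius])
out, (h, c) = lstm(behaviour)       # out: (1, 100, 16)
```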

Core module (feedforward): The Methods state that the feedforward DenseNet blocks use the GELU nonlinearity; however, in the implemented Conv-LSTM models, the feedforward component uses ELU, whereas the CvT-LSTM models use GELU.
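The clarified nonlinearity choice reduces to a per-variant selection, sketched below; how the activation is wired into the DenseNet blocks is not shown and would follow the original architecture:

```python
import torch.nn as nn

def feedforward_nonlinearity(variant: str) -> nn.Module:
    """Per the correction: Conv-LSTM feedforward blocks use ELU,
    CvT-LSTM feedforward blocks use GELU."""
    return {"conv_lstm": nn.ELU(), "cvt_lstm": nn.GELU()}[variant]

act = feedforward_nonlinearity("conv_lstm")  # ELU for the Conv-LSTM variant
```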

Core module (recurrent): In some Conv-LSTM model variants used in this work, the recurrent module additionally receives explicit spatial information about the visual stimulus. To do this, a spatial grid encoding the position of each feature-map element within the visual field is concatenated to the feedforward features and modulatory vector before entering the Conv-LSTM.
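The spatial-grid concatenation described above can be sketched as appending two coordinate channels to the feature maps; the coordinate range and normalization here are assumptions for illustration:

```python
import torch

def append_position_grid(features: torch.Tensor) -> torch.Tensor:
    """Concatenate an (x, y) position grid, channel-wise, to feature maps.

    features: (batch, channels, height, width). Each grid element encodes
    the position of that feature-map location in the visual field.
    """
    b, _, h, w = features.shape
    ys = torch.linspace(-1.0, 1.0, h)
    xs = torch.linspace(-1.0, 1.0, w)
    gy, gx = torch.meshgrid(ys, xs, indexing="ij")   # each (h, w)
    grid = torch.stack([gx, gy]).expand(b, 2, h, w)  # broadcast over batch
    return torch.cat([features, grid], dim=1)        # adds 2 channels

augmented = append_position_grid(torch.zeros(2, 8, 4, 4))  # (2, 10, 4, 4)
```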

Core module (equations): During editing, typographical errors were introduced in the equation blocks: an extraneous curly brace, {, was added before several terms, and in several terms the convolution operator \((W_{k}\ast)\) was incorrectly rendered as a superscript (\(W_{k}^{\ast}\)).