Recently, Eon Systems PBC co-founder and founding advisor Dr. Alex Wissner-Gross shared on X some of the work that we’ve been doing, and we were pleasantly surprised by how much attention it has received. This embodied fly is still very much a work in progress: a first step toward showing how an embodied brain can control a virtual body. We wanted to discuss here how the virtual fly works, and its limitations. This post is necessarily quite technical.
First, we want to acknowledge how much this project depends on the broader neuroscience community. Our work builds directly on the adult fly connectome (Dorkenwald et al., 2024), on connectome-constrained brain models (Lappalainen et al., 2024), on neuromechanical fly body models (Wang-Chen et al., 2024; Ozdil et al., 2024), and on decades of work mapping sensory circuits, descending neurons, and behavior in Drosophila. The current system is an integration effort, combining existing brain models with existing virtual body models. We’d also like to emphasize that this work was a true team effort, conducted by Scott Harris, Aarav Sinha, Viktor Toth, Alexis Pomares, and Philip Shiu.
How does the fly work?
In the video, the fly uses invisible taste cues to navigate the environment towards a food source (stylized as slices of banana). Fictive dust accumulates on the fly, so the fly stops, grooms itself, then continues towards the food, and commences eating.
For the brain, the main starting point is the model from Shiu et al.: a leaky integrate-and-fire (LIF) model built from the adult Drosophila central-brain connectome, with approximately 140,000 neurons and roughly 50 million synaptic connections, using inferred neurotransmitter identities to determine the sign of synapses (Eckstein et al., 2024). That model showed that connectome structure alone can recover substantial sensorimotor structure for behaviors such as feeding and grooming, which is exactly why it is such a useful substrate for embodiment. This model depends on the broader FlyWire effort and the systematically annotated whole-brain resource of 140,000 neurons (Schlegel et al., 2024).
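To make the LIF dynamics concrete, here is a minimal sketch of how a connectome-constrained leaky integrate-and-fire network can be stepped forward in time. The network, parameters, and integration scheme below are illustrative toy values, not the ones used by Shiu et al.; the only structural assumption carried over from the post is that synaptic weights are signed according to inferred neurotransmitter identity.

```python
import numpy as np

def lif_step(v, spiked, W, I_ext, dt=1e-4, tau=0.02, v_rest=0.0, v_thresh=1.0):
    """One Euler step of a toy leaky integrate-and-fire network.
    W[i, j] is the signed synaptic weight from neuron j to neuron i
    (positive = excitatory, negative = inhibitory, as set by the
    inferred neurotransmitter). All parameter values are illustrative."""
    syn_input = W @ spiked.astype(float)       # input from last step's spikes
    dv = (-(v - v_rest) + I_ext + syn_input) / tau
    v = v + dt * dv
    spiked = v >= v_thresh                     # threshold crossing
    v = np.where(spiked, v_rest, v)            # reset neurons that spiked
    return v, spiked

# Toy 3-neuron circuit: neuron 0 excites neuron 1, neuron 1 inhibits neuron 2.
W = np.array([[0.0,  0.0, 0.0],
              [2.0,  0.0, 0.0],
              [0.0, -2.0, 0.0]])
v = np.zeros(3)
spiked = np.zeros(3, dtype=bool)
counts = np.zeros(3)
for _ in range(1000):                          # 100 ms of simulated time
    v, spiked = lif_step(v, spiked, W, I_ext=np.array([1.5, 0.0, 0.5]))
    counts += spiked
```

With these toy parameters, neuron 0 receives suprathreshold drive and spikes periodically, while neuron 2's subthreshold drive never crosses threshold; at the scale of the real model, the same update runs over roughly 140,000 neurons and a sparse 50-million-entry weight matrix.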
We also use the Lappalainen et al. visual model, a model of the fly visual motion pathway. In that work, the authors built a connectome-constrained recurrent network for 64 visual cell types, spanning tens of thousands of neurons across the visual field, and showed that connectivity plus task constraints were sufficient to predict neural activity across the motion system. Combined with the NeuroMechFly virtual body, this allows us to predict the activity of the visual system; we then pipe that information into the FlyWire LIF model.
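One way to picture the "piping" step is as a mapping from the visual model's per-cell-type, per-column activities onto injected currents for the corresponding neurons in the central-brain LIF model. The function below is a hypothetical sketch of that interface; the index mapping, cell-type names, and gain are placeholders, not the actual identifiers used in either model.

```python
import numpy as np

def visual_to_lif_currents(visual_activity, type_to_lif_ids, n_lif, gain=1.0):
    """Convert visual-model outputs into an injected-current vector.
    visual_activity: dict mapping cell-type name -> per-column activities.
    type_to_lif_ids: dict mapping cell-type name -> LIF neuron indices,
    in the same column order. Both mappings here are illustrative."""
    I = np.zeros(n_lif)
    for cell_type, activities in visual_activity.items():
        ids = type_to_lif_ids[cell_type]
        I[ids] += gain * np.asarray(activities)   # one current per column
    return I

# Toy example: two T4a columns and one T5a column feed a 10-neuron model.
activity = {"T4a": [0.2, 0.8], "T5a": [0.5]}
mapping = {"T4a": [3, 4], "T5a": [7]}
I_ext = visual_to_lif_currents(activity, mapping, n_lif=10)
```

In practice the hard part is not this bookkeeping but deciding how graded visual-model activity should be scaled into the spiking model's input units; the gain here stands in for that calibration.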
To embody the brain, we use a published neuromechanical fly body, NeuroMechFly (Wang-Chen et al., 2024), which represents the fly as an anatomically structured articulated body with physically simulated joints, forces, contact, and actuation. It has 87 independent joints embodied in a precise 3D mesh that was created from an X-ray microtomography scan of a biological fruit fly (Wang-Chen et al., 2024). The digital fly runs on the MuJoCo physics engine, which provides high-fidelity, physically-constrained environments for behavioral simulations (Todorov et al., 2012).
NeuroMechFly v2 already implements sensory inputs, including simulated vision and olfaction, which we use. Fly walking was implemented with slight modifications to existing NeuroMechFly controllers, which were trained to imitate the walking behavior of real flies. We also note that the Vaxenburg et al. whole-body model, which we did not use, demonstrated realistic walking and flight using reinforcement-learned controllers and high-level steering signals.
Conceptually, the full loop has four parts. First, sensory events in the virtual world are mapped onto identified sensory neurons or sensory pathways. Second, brain activity is updated in a connectome-constrained neural model. Third, selected descending outputs are translated into low-dimensional motor commands for the body. Fourth, the resulting movement changes the sensory state, which is fed back into the brain. We currently synchronize the brain and body every 15 ms: we compute the brain’s response to sensory input, then simulate the body’s response for 15 ms. We note that this 15 ms time step may be too coarse for some behaviors.
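The four-part loop above can be sketched as a simple synchronization scheme. The four stage functions below are hypothetical stand-ins for the real components (sensory encoding, the connectome LIF model, descending-neuron decoding, and the NeuroMechFly physics step); only the loop structure and the 15 ms sync interval come from the post.

```python
SYNC_DT = 0.015  # 15 ms brain-body synchronization step

def encode_sensors(world):
    # Stage 1: map virtual-world events onto sensory pathways (toy version:
    # a single taste signal that turns on near the food source).
    return {"taste": world["food_distance"] < 1.0}

def step_brain(brain, sensors):
    # Stage 2: advance the brain model for one 15 ms window (toy version:
    # taste input switches on a feeding drive).
    return {"feeding_drive": 1.0 if sensors["taste"] else 0.0}

def decode_descending(brain):
    # Stage 3: translate descending outputs into low-dim motor commands
    # (toy version: walk forward unless feeding).
    return {"forward_speed": 0.0 if brain["feeding_drive"] else 1.0}

def step_body(world, command, dt=SYNC_DT):
    # Stage 4: simulate the body for 15 ms; movement changes the sensory
    # state, closing the loop (toy kinematics, no physics engine here).
    return {"food_distance": world["food_distance"]
            - command["forward_speed"] * dt * 10}

world = {"food_distance": 2.0}
brain = {"feeding_drive": 0.0}
for _ in range(20):  # 20 sync steps = 300 ms of simulated time
    sensors = encode_sensors(world)
    brain = step_brain(brain, sensors)
    command = decode_descending(brain)
    world = step_body(world, command)
```

In the real system, stage 2 is the 140,000-neuron LIF model and stage 4 is a MuJoCo physics rollout, but the control flow is the same: sense, think, act, repeat, with the two simulations exchanging state every 15 ms.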
Sensory input: how the virtual world enters the brain