Un-0: Generating Images with Coupled Oscillators

TL;DR. Executing deep neural networks on GPUs has dominated AI for a decade, but we think the next jump in energy efficiency demands a fundamentally different computer, one where physics does the computing. We built Un-0, an image generator powered by a simulated system of coupled oscillators, an example of an emerging physical computing substrate. On ImageNet 64×64 it reaches FID 6.74, matching the quality of leading conventional image generation methods when they were first published. Weights, training, and ablation code are all open. Join us on an Unconventional journey!

Figure 0: A sample of trajectories of Un-0 generations over time. Each line color has an associated box of similar color that denotes the class and generated images over time.

Un-0

At Unconventional AI, we’re building a new kind of computer, one that harnesses the laws of physics to do the computing. Our goal is to run modern AI on a fraction of the energy today’s machines need, around 1,000x less. As a first step, we ask: can we train a physical dynamical system to generate images at scale?

The best AI models today are conventional deep networks with transformer backbones. However, there is also a long history of alternatives that seek energy efficiency by leveraging the dynamics of a physical system, such as the noisy, time-varying behavior of analog circuits that compute with analog voltage and current instead of conventional digitized numbers. These physics-based alternatives include neuromorphic computing (Mead, 1990), Hopfield networks (Hopfield, 1982), and reservoir computing (Jaeger, 2001; Maass et al., 2002). Recently the community has also developed Hamiltonian (Greydanus et al., 2019) and Liquid (Hasani et al., 2021) networks, Neural Wave Machines (Keller & Welling, 2023), thermodynamic computing (Coles et al., 2023; Jelinčič, 2025), and Kuramoto oscillators (Miyato et al., 2025; Song et al., 2025).
To exploit these alternative computing methods, the AI task needs to be mapped efficiently to the dynamics of the physical system. Un-0 validates that modern AI workloads can run more efficiently on physical substrates than on today’s hardware.

Figure: Data-space trajectories of images forming for the classes Daisy, Lakeside, Agaric, Geyser, Volcano, and Jellyfish.

Among a growing community building AI on physical and unconventional substrates [1–8, and others], Un-0 is, to our knowledge, the most capable image generator to date that uses a simulation of a physical dynamical system. Un-0 reaches FID 6.74 on class-conditional ImageNet 64×64, though there is still room to improve performance as a function of parameter count toward the conventional frontier. While the physical primitive we explore is not new, we scale it to a larger generative benchmark, perform an ablation analysis of the dynamics themselves, and provide an interpretive analysis of the model’s behavior.

We release the model weights together with the training, evaluation, and ablation code to make it easier for anyone to experiment with models grounded in the dynamics of physical systems. We believe it is possible to quickly push beyond Un-0; it is still early in the journey to reseat modern AI on physical dynamics and reach ~1,000x energy-efficiency gains.

How Un-0 works

Figure 1a: Two metronome-like oscillators exhibit three coupling regimes switched across time: 1) drift (no coupling), 2) synchronized (positive coupling), and 3) anti-phase synchronized (negative coupling).

Picture two metronomes ticking side by side (Figure 1a). Each can be described at any moment by its phase, the angle of its arm within the swing. Place two metronomes on the same table and they will interact with each other through the shared surface. Depending on how sensitive they are to each other, i.e., their coupling strength, they fall into lockstep or settle into opposition.
That’s an oscillator: a primitive component with a phase that wants to rotate at its own rate, influenced by the forces of its neighbors.

Figure 1b: Illustration of the evolution of a collection of coupled oscillators.

Now scale that from two oscillators to thousands. A large population of these oscillators, each coupled to every other with its own strength, self-organizes into patterns (Figure 1b). Un-0's compute engine is a large population of oscillators where the coupling strengths between all pairs of oscillators are the primary learnable parameters of the model.

These coupled oscillators are commonly modeled as Kuramoto oscillators. Concretely, each oscillator's motion follows a single rule, applied continuously over time: it rotates at its own natural frequency, nudged by the pull of every other oscillator. The following ordinary differential equation (ODE) describes the evolution of the oscillators over time.

\dot{\theta}_i = \omega_i + \sum_{j=1}^{N} K_{ij}\,\sin(\theta_j - \theta_i), \qquad i = 1, \dots, N

Each oscillator i carries a phase \theta_i \in [0, 2\pi), and \omega_i is its natural frequency. The entry K_{ij} of the coupling matrix sets how strongly oscillator j pulls oscillator i toward alignment (positive coupling) or away from it (negative coupling). The learning problem for this component of Un-0 is to learn the coupling matrix K and the frequencies \omega; these are the parameters of the physical system.
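To make the equation concrete, here is a minimal NumPy sketch that integrates it with a forward-Euler step and reproduces the three two-metronome regimes from Figure 1a. It is an illustration, not the released Un-0 code: the integrator, the step size, the coupling values, the equal natural frequencies, and the helper names (`kuramoto_rhs`, `simulate`, `phase_gap`) are all assumptions chosen for the example.

```python
import numpy as np

def kuramoto_rhs(theta, omega, K):
    """d(theta_i)/dt = omega_i + sum_j K[i, j] * sin(theta_j - theta_i)."""
    # diff[i, j] = theta_j - theta_i
    diff = theta[None, :] - theta[:, None]
    return omega + np.sum(K * np.sin(diff), axis=1)

def simulate(theta0, omega, K, t_end=20.0, dt=0.01):
    """Forward-Euler integration (an assumed scheme); returns phases in [0, 2*pi)."""
    theta = np.array(theta0, dtype=float)
    for _ in range(int(round(t_end / dt))):
        theta = theta + dt * kuramoto_rhs(theta, omega, K)
    return np.mod(theta, 2 * np.pi)

def phase_gap(theta):
    """Absolute phase difference between oscillators 0 and 1, in [0, pi]."""
    d = np.mod(theta[1] - theta[0], 2 * np.pi)
    return min(d, 2 * np.pi - d)

omega = np.array([1.0, 1.0])        # identical natural frequencies (assumption)
theta0 = np.array([0.3, 2.0])       # arbitrary starting phases, gap of 1.7 rad
pair = np.array([[0.0, 1.0],
                 [1.0, 0.0]])       # each oscillator feels only the other one

gap_uncoupled = phase_gap(simulate(theta0, omega, 0.0 * pair))   # no coupling
gap_sync = phase_gap(simulate(theta0, omega, +0.5 * pair))       # positive coupling
gap_anti = phase_gap(simulate(theta0, omega, -0.5 * pair))       # negative coupling

print(gap_uncoupled, gap_sync, gap_anti)
```

With equal natural frequencies the uncoupled pair simply keeps its initial gap (with unequal frequencies the gap would drift), positive coupling closes the gap toward zero, and negative coupling pushes it toward \pi, which is the anti-phase state.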

Why oscillators? In the brain, rhythmic activity and synchronization are pervasive, and have long been hypothesized to do computational work like binding distributed features into coherent percepts, gating communication between regions, and organizing the timing of spikes (Gray et al., 1989; Buzsáki, 2006; Fries, 2015). Coupled oscillators are among the simplest mathematical models of that kind of behavior, which makes them a natural primitive to study for neuro-inspired models of computation (Winfree, 1967; Kuramoto, 1975; Ermentrout, 1996; Ermentrout et al., 2010). Most important for us at Unconventional, an oscillator is a primitive physical circuit. We can implement a coupled-oscillator system directly in CMOS or other physical substrates such that the physics of the system directly computes the dynamics. That is the bet behind Un-0: if the laws of physics can compute AI workloads, then the execution substrate can look very different from today’s.

The model

Figure 2: Coupled oscillators (with a unidirectional, low-rank, class-conditional coupling matrix from the conditioning oscillators to the pool of oscillators) evolve through time under their trained coupling. The phases are read out at time T and passed through a decoder to generate an image. Image distributions are generated by sampling the initial condition many times.

Model Architecture. Inference to generate an image with Un-0 follows five steps:

1) Start from randomness. Set every oscillator's phase to a random angle \theta_i \in [0, 2\pi). This random starting state is the seed, i.e., the counterpart to the noise a diffusion model or GAN samples. A different seed yields a different image.
2) Choose the class. A second, smaller group of oscillators drives the requested class (e.g., "daisy" or "volcano") and is coupled into the main population, biasing the main population toward arrangements associated with that class.
3) Let physics execute. Release the system and let the oscillators pull on one another. The oscillators evolve away from their initial random start and settle toward a state dictated by their coupling.
4) Take a snapshot. At a specified time, which we label T, record the phase of every oscillator. That collection of final phases is a grid of numbers, a latent representation of the image.
5) Render. A conventional decoder (under 13% of the model's parameters) turns that latent representation into finished pixels.
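The five steps can be sketched end to end in a few lines of NumPy. This is a toy under stated assumptions, not the released Un-0 implementation: the sizes `N`, `C`, and `R`, the random stand-ins for trained parameters, the fixed class-specific conditioning phases in `class_phases`, the forward-Euler integrator, and the linear `decoder_W` are all hypothetical. Only the overall structure (random initial phases, a unidirectional low-rank coupling from conditioning oscillators to the pool, evolution to time T, and a decoder readout) comes from the description above.

```python
import numpy as np

N, C, R = 64, 8, 4      # pool size, conditioning oscillators, rank (hypothetical)
H = W = 8               # toy latent grid, with N == H * W

# Random stand-ins for parameters that would be learned in a real model.
param_rng = np.random.default_rng(0)
K = param_rng.normal(scale=1.0 / N, size=(N, N))    # pool-to-pool coupling
omega = param_rng.normal(scale=0.1, size=N)         # natural frequencies
U = param_rng.normal(scale=0.1, size=(N, R))        # low-rank factors of the
V = param_rng.normal(scale=0.1, size=(R, C))        # conditioning-to-pool coupling
decoder_W = param_rng.normal(scale=0.1, size=(2 * N, 3 * H * W))  # toy linear decoder

def class_phases(class_id):
    """Hypothetical class encoding: fixed phases for the C conditioning oscillators."""
    k = np.arange(1, C + 1)
    return np.mod(2 * np.pi * (class_id + 1) * k / (C + 1), 2 * np.pi)

def generate(class_id, seed, T=5.0, dt=0.01):
    rng = np.random.default_rng(seed)
    theta = rng.uniform(0.0, 2 * np.pi, size=N)     # 1) start from randomness
    psi = class_phases(class_id)                    # 2) choose the class
    B = U @ V                                       #    rank-R coupling, shape (N, C)
    for _ in range(int(round(T / dt))):             # 3) let the dynamics run
        pool = np.sum(K * np.sin(theta[None, :] - theta[:, None]), axis=1)
        # Unidirectional: the pool feels psi, but psi never feels the pool.
        cond = np.sum(B * np.sin(psi[None, :] - theta[:, None]), axis=1)
        theta = theta + dt * (omega + pool + cond)
    latent = np.mod(theta, 2 * np.pi)               # 4) snapshot at time T
    feats = np.concatenate([np.cos(latent), np.sin(latent)])
    image = (feats @ decoder_W).reshape(H, W, 3)    # 5) render with the decoder
    return latent.reshape(H, W), image

latent_a, image_a = generate(class_id=0, seed=1)
latent_b, image_b = generate(class_id=0, seed=2)    # new seed, same class
latent_c, image_c = generate(class_id=1, seed=1)    # same seed, new class
```

In this sketch the only randomness at generation time is the initial phase vector, so a fixed seed and class reproduce the same output, while changing either the seed or the class changes the final phases.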

Training changes only three things inside the model: 1) how the oscillators are coupled together (the matrix K), 2) each oscillator's natural frequency (\omega_i), and 3) the weights of the decoder. Together, the oscillators replace what would otherwise be a stack of conventional neural network layers.

Why this model architecture? We chose this model architecture to give the dynamics maximum flexibility to perform the computation. Specifically, the forward pass for training requires only 1) setting the coupling matrix, oscillator frequencies, and initial phases, 2) evolving the dynamics, and 3) reading the final image latents. This contrasts with other flavors of dynamical generation, such as diffusion (Sohl-Dickstein et al., 2015) and flow matching (Lipman et al., 2022), that explicitly guide the dynamics during training. The trade-off is that our approach requires a more complex loss, one that operates on generated samples alone.
