Controlling Language and Diffusion Models by Transporting Activations

Published on: 2025-05-04 01:58:02

Large generative models are increasingly capable and widely deployed in production applications, but getting them to produce exactly what is desired remains challenging. Fine-grained control over their outputs is important both to meet user expectations and to mitigate potential misuse, which in turn supports the models' reliability and safety. To address this, Apple machine learning researchers have developed a technique that is modality-agnostic and provides fine-grained control over a model's behavior with negligible computational overhead and minimal impact on the model's abilities. Activation Transport (AcT) is a general framework for steering activations, guided by optimal transport theory, that generalizes many previous activation-steering methods. The work will be presented as a Spotlight at ICLR 2025, and the code is available. To help generative models produce output that aligns with their users' expectations, researchers often rely on reinf ...
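To make the idea of steering activations with a transport map more concrete, here is a minimal sketch, not the authors' implementation. It assumes each neuron's activations are roughly Gaussian, in which case the one-dimensional optimal transport map between a source and a target distribution is affine. The function names `fit_affine_transport` and `steer`, the `strength` parameter, and the synthetic data are all hypothetical and chosen only for illustration.

```python
import numpy as np


def fit_affine_transport(source, target, eps=1e-8):
    """Fit a per-neuron affine map from source to target activations.

    source, target: arrays of shape (n_samples, n_neurons).
    Assumption: each neuron is treated independently and as Gaussian, so the
    1D optimal transport map is T(a) = mu_t + (sd_t / sd_s) * (a - mu_s).
    """
    mu_s, mu_t = source.mean(axis=0), target.mean(axis=0)
    sd_s, sd_t = source.std(axis=0), target.std(axis=0)
    scale = sd_t / (sd_s + eps)   # eps guards against constant neurons
    shift = mu_t - scale * mu_s
    return scale, shift


def steer(activations, scale, shift, strength=1.0):
    """Move activations toward the target distribution.

    strength=0 leaves activations unchanged; strength=1 applies the full map.
    (The interpolation knob is an illustrative assumption.)
    """
    transported = scale * activations + shift
    return (1.0 - strength) * activations + strength * transported


# Synthetic stand-ins for activations collected on two sets of prompts.
rng = np.random.default_rng(0)
source = rng.normal(0.0, 1.0, size=(1000, 4))
target = rng.normal(2.0, 0.5, size=(1000, 4))

scale, shift = fit_affine_transport(source, target)
steered = steer(source, scale, shift, strength=1.0)
```

With `strength=1.0`, the steered activations match the target's per-neuron mean and standard deviation; intermediate values interpolate between the original and fully transported activations.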