Perturb-MARS: Reading mouse experiments through a human lens

TL;DR: Human data is directly relevant to what drug development demands, but is difficult to generate and perturb. Mouse data allows us to discover causality and is easy to generate, but is only a proxy for what matters. What has never existed is causal data that speaks in human terms. We’ve built the system that creates it by combining a wet-lab innovation with a machine-learning one. The first, Perturb-Map, is a multiplexed in vivo perturbation platform that can test hundreds of genetic knockouts in the same mouse with full spatial resolution. The second, TARIO-2, is a foundation model we’ve trained exclusively on human cancer tissue that can convert H&E to whole-genome spatial transcriptomics. We’ve found that we can apply TARIO-2 directly to H&E derived from Perturb-Map experiments, yielding human-centric tumor microenvironment characterizations from animal data, with no re-training required.

In other words, we can read the results of multiplexed mouse experiments through a lens that only understands human biology.

We call the combination of Perturb-Map and TARIO-2 ‘Perturb-MARS’ (Multi-species Alignment and Reasoning on Spatial data). It allows us to answer not only typical preclinical target discovery questions, but also ones that the current preclinical apparatus is structurally incapable of answering, such as combination therapy exploration (e.g. ‘what is the next PD-1 bispecific target after VEGF, and what population is it active in?’). All the while, the readout stays grounded in human-specific genes. We are actively looking for partners here; reach out to [email protected] to learn more.

Introduction

Every life-saving oncology drug on the market first proved itself in a mouse. And yet, the dominant instinct in the field is to abandon animal models entirely, because 95% of the time, what works in a mouse doesn’t work in a human. Why is this? The standard diagnosis is that mouse biology is a bad model for human biology. This isn’t entirely wrong, but we think the field has it backwards. The problem isn’t the mouse, or at least not entirely. The bigger problem is that we’ve always read mouse experiments with mouse-native readouts (mouse gene expression, mouse protein levels, mouse immune cell counts) and then hoped the translation would sort itself out.

Rather than making the animal more human, what if we changed the lens through which we interpret the animal data?

Last month, we released a post on TARIO-2, an internal foundation model trained exclusively on human cancer tissue. Given an H&E-stained human tumor section, TARIO-2 can predict the spatial gene expression, telling you what genes are expressed across the tissue.

We’ve discovered that TARIO-2 generalizes from human to mouse H&E.

That is, we can feed it the H&E histology of a mouse tumor, and the model predicts what a human tumor with similar morphological features would express at the transcriptomic level. What comes out of the model is not a mouse gene readout, but rather the projection of that mouse experiment into a human biological coordinate system. And when combined with an in vivo genetic perturbation platform called ‘Perturb-Map’, we’ve found that the resulting human-space projections of the mouse H&E are both accurate and have clear use cases for several clinically relevant tasks.

In this essay, we explain this whole process.
