Large language models trained on vast datasets could speed genomics research, streamline clinical documentation, improve real-time diagnostics, support clinical decision-making, accelerate drug discovery, and even generate synthetic data to advance experiments.
But their promise to transform biomedical research often runs into a bottleneck: beyond the structured data healthcare relies on, these models struggle in edge cases like rare diseases and unusual conditions, where reliable, representative data is scarce.
New York-based Mantis Biotech claims it’s developing the solution to fill this data availability gap. The company’s platform integrates disparate sources of data to make synthetic datasets that can be used to build so-called “digital twins” of the human body: physics-based, predictive models of anatomy, physiology, and behavior.
The company is pitching these digital twins for data aggregation and analysis. They could be used to study and test new medical procedures, train surgical robots, and simulate and predict medical issues or even patterns of behavior. For example, a sports team could predict the likelihood of a specific NFL player developing an Achilles tendon injury based on their recent performance, training load, diet, and how long they’ve been active, Mantis’ founder and CEO Georgia Witchel explained to TechCrunch in a recent interview.
To build these twins, Mantis’ platform first takes data from a variety of sources such as textbooks, motion capture cameras, biometric sensors, training logs, and medical imaging. Then, it uses an LLM-based system to route, validate, and synthesize the various data streams, and runs all that information through a physics engine to create high-fidelity renders of that dataset, which can then be used to train predictive models.
“We’re able to take all these disparate data sources and then turn them into predictive models for how people are going to perform. So anytime you want to predict how a human being is going to be performing, that is a really good use case for our technology,” Witchel said.
The physics engine layer is key here, Witchel told TechCrunch, because it helps the platform enhance the available information by grounding the generated synthetic data and realistically modeling the physics of anatomy.