Zero-shot design of drug-binding proteins via neural iterative selection−expansion

Tight coupling between two generative neural networks unlocks the zero-shot design of small-molecule binders, with one network used to sample broadly in sequence space and the other to model coupled changes in protein and ligand coordinates. As a result, NISE can accomplish design tasks that other methods struggled with. First, we used NISE to design helical bundles that bind to exatecan. All four experimentally tested designs bound the drug, contrasting with the state-of-the-art method, COMBS; the tightest COMBS binder was nearly 70-fold weaker. The tightest binder designed by NISE, EPIC, was further improved through neural proofreading and shown to protect the labile drug from hydrolysis for several days. To test NISE on an equal footing with an alternative algorithm using LigandMPNN and Rosetta1,6, we next tackled an identical design challenge: creating binders to apixaban using NTF2 backbones. Here, NISE was implemented with LASErMPNN and Boltz-2, and achieved a success rate (83%) that was more than three orders of magnitude higher than the alternative approach. This result is striking because we used the same previously published backbones as starting points for design. Moreover, tight binding was achieved by NISE using vastly different sequences and binding poses (Extended Data Fig. 9 and Supplementary Table 4). Notably, specialized docking methods were not needed: brute-force rigid-body docking was sufficient to seed NISE. The success rate alone does not fully capture the scale of the advance, as the highest affinity binder from NISE, APEX (Kd = 80 pM), bound to apixaban nearly 10,000-fold more tightly than did the best binder from LigandMPNN6 or COMBS4, rivalling even the native drug target52, factor Xa.

NISE differs from previous protocols in a few important ways. First, it allocates most of its compute to extensively sampling a few designs: a broad search first nominates initial self-consistent designs, and a deep search then optimizes them. Second, NISE uses only neural networks for closed-loop optimization. We investigated the effects of replacing the co-structure predictor with a traditional approach to modelling structure using Rosetta (Fig. 1c, right). In this case, the designs failed to optimize, suggesting that current empirical energy functions cannot fully capture the nuances of productive protein–ligand interactions. Geometrically, the gradient of the Rosetta energy lies partly orthogonal to that of P(structure, sequence, ligand conformation), so following it does not efficiently climb the joint distribution. Further iterations of any Rosetta-based loop would likely not overcome this limitation (Fig. 1c and Supplementary Figs. 9 and 10). By contrast, NISE samples directly from the learned distributions of two reciprocal neural networks, iteratively climbing to a higher probability mode in the joint probability manifold, making the designs look more similar to the training data (structures in the PDB).
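The selection and expansion cycle described above can be illustrated with a minimal sketch. This is not the published implementation: the function names (`selection_expansion`, `toy_expand`, `toy_score`), the choice to carry selected parents forward into the next pool, and the default values of `n_keep` and `n_children` are all assumptions made for illustration. In practice, `expand` would stand in for a sequence-design network followed by co-structure prediction, and `score` for a confidence metric such as ligand pLDDT; here both are replaced by toy string operations so that the loop runs on its own.

```python
import random
from typing import Callable, List, Sequence, Tuple


def selection_expansion(
    seeds: Sequence[str],
    expand: Callable[[str], str],
    score: Callable[[str], float],
    n_iterations: int = 14,   # a typical trajectory length quoted in the text
    n_keep: int = 2,          # hypothetical: designs selected per iteration
    n_children: int = 8,      # hypothetical: samples drawn per selected design
) -> Tuple[str, List[float]]:
    """Toy selection-expansion loop; returns the best design and score history."""
    pool = list(seeds)
    history: List[float] = []
    for _ in range(n_iterations):
        # Selection: keep only the few highest-scoring designs.
        selected = sorted(pool, key=score, reverse=True)[:n_keep]
        history.append(score(selected[0]))
        # Expansion: sample extensively around each selected design.
        children = [expand(parent) for parent in selected for _ in range(n_children)]
        # Assumption: parents are retained, so the best score never decreases.
        pool = selected + children
    return max(pool, key=score), history


# Toy stand-ins (assumptions): a point-mutation sampler and a match-based score.
ALPHABET = "ACDEFGHIKLMNPQRSTVWY"
TARGET = "MKELLKKAEELLKR"
rng = random.Random(0)


def toy_expand(seq: str) -> str:
    """Replace one random position with a random amino-acid letter."""
    pos = rng.randrange(len(seq))
    return seq[:pos] + rng.choice(ALPHABET) + seq[pos + 1:]


def toy_score(seq: str) -> float:
    """Fraction of positions that match a hidden target sequence."""
    return sum(a == b for a, b in zip(seq, TARGET)) / len(TARGET)


seeds = ["".join(rng.choice(ALPHABET) for _ in TARGET) for _ in range(4)]
best, history = selection_expansion(seeds, toy_expand, toy_score)
```

Because the selected designs are carried into the next pool, the recorded best score is non-decreasing across iterations, which mirrors the plateauing trajectories discussed in the text only in a loose, qualitative sense.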

NISE is agnostic to the specific networks used; as these models improve, so will NISE. Indeed, an RFAA-based assessment would have led us to discard our apixaban binders (Supplementary Fig. 36); the use of Boltz-2 was important to select binders for experimental testing. LigandMPNN could be substituted for LASErMPNN to produce similar aggregate metrics (Supplementary Figs. 37 and 38), although we observed that LigandMPNN tends to overpack designs (Extended Data Fig. 1 and Supplementary Figs. 39 and 40). Differences in packing might stem from differences in the sampling algorithm: whereas LigandMPNN predicts rotamers only after the entire sequence is designed, LASErMPNN jointly determines the sequence and rotamers.

We focused on optimizing ligand pLDDT during NISE trajectories. This parameter is highly correlated with other metrics, such as protein Cα pLDDT and interfacial predicted aligned error (iPAE), which are commonly used for protein-binder design15,23 (Extended Data Fig. 8 and Supplementary Fig. 41). Adding P(bind) from Boltz-2 increased shape complementarity moderately. Maximizing model confidence and agreement was necessary but perhaps not sufficient by itself. We found that performing NISE iterations beyond the plateau in ligand pLDDT or score was important for downstream filtering by additional biophysical metrics. Indeed, both EPIC and APEX came from later iterations. Using LASErMPNN and Boltz-2, a typical NISE trajectory of 14 iterations takes only around 5 h on four A6000 graphics processing units (with the majority of compute dedicated to co-structure prediction).

We found that positive design—focusing on moulding the binding pocket only to the target ligand—was sufficient to install a high degree of binding specificity. We showed that both EPIC and APEX were specific: they had the highest affinity for their target ligands and did not bind to dissimilar off-target ligands (Figs. 3g and 6e). For similar on- and off-target molecules, an even wider gap in affinities could be produced explicitly using NISE by selecting representatives from each cycle that maximize both on-target ligand pLDDT (positive design) and the gap in ligand pLDDT (or predicted affinity) of on- and off-target ligands (negative design).
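One way the combined positive and negative design criterion could be expressed is sketched below. The text does not specify how the two terms would be combined, so the weighted sum, the `gap_weight` parameter, the `Candidate` fields and the example values are all hypothetical and serve only to make the idea concrete.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Candidate:
    """Hypothetical record of predicted ligand pLDDT values for one design."""
    name: str
    on_target_plddt: float
    off_target_plddt: float


def specificity_score(c: Candidate, gap_weight: float = 1.0) -> float:
    """On-target pLDDT (positive design) plus a weighted on/off-target gap
    (negative design). The weighted sum is an assumption, not the paper's rule."""
    gap = c.on_target_plddt - c.off_target_plddt
    return c.on_target_plddt + gap_weight * gap


def select_representatives(
    candidates: Sequence[Candidate], n_keep: int = 1, gap_weight: float = 1.0
) -> List[Candidate]:
    """Return the n_keep candidates with the highest combined score."""
    ranked = sorted(
        candidates, key=lambda c: specificity_score(c, gap_weight), reverse=True
    )
    return ranked[:n_keep]


# Made-up values for illustration only.
candidates = [
    Candidate("a", on_target_plddt=0.90, off_target_plddt=0.88),
    Candidate("b", on_target_plddt=0.85, off_target_plddt=0.50),
    Candidate("c", on_target_plddt=0.60, off_target_plddt=0.20),
]
chosen = select_representatives(candidates)
positive_only = select_representatives(candidates, gap_weight=0.0)
```

With these made-up values, setting `gap_weight` to zero recovers pure positive design and favours candidate `a`, whereas the default weighting favours candidate `b`, whose predicted confidence drops sharply for the off-target ligand.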

In this work, we initialized designs from two precomputed sets of designable backbone scaffolds, helical bundles and NTF2 folds. These scaffolds have pockets that can fit a variety of ligands for future design targets. As generative backbone models improve (such as RFdiffusion-All Atom8, BoltzDesign1 (ref. 54) and BoltzGen55), we could also use bespoke backbones conditioned on target ligands as inputs to NISE. Currently, we find that using a small, precomputed set of highly designable scaffolds is a computationally efficient approach to achieving high-quality poses. Using structures that can be encoded by many sequences, such as helical bundles and NTF2 folds, enables the efficient sampling of sequences that support both folding and binding. Success with four-helix bundles and mixed α/β NTF2 folds shows that NISE is a general algorithm that does not rely on any specific protein fold, and we expect NISE to work with any fold that can be modelled accurately by co-structure predictors.

The design of small-molecule binding proteins is now approaching the hit rates of the design of PCR primers, yet the rules of protein–ligand molecular recognition are vastly more complex than Watson–Crick base pairing. We have shown that modern deep neural networks, coupled with appropriate training data, can capture and distil this complexity, which can ultimately be extracted using design algorithms such as NISE. We anticipate that these capabilities will catalyse the use of bespoke proteins as rapidly generated reagents to manipulate biology at the single-molecule level.