
AlphaFold database hits ‘next level’: the AI system now includes protein pairing

Why This Matters

The expansion of the AlphaFold database to include protein complexes, such as homodimers, marks a significant advancement in molecular biology. This development enhances researchers' ability to understand protein interactions crucial for biological functions and disease mechanisms, accelerating drug discovery and biomedical research. By providing comprehensive structural data, the database empowers the global scientific community to explore complex biological processes more effectively.


AlphaFold is now capable of predicting homodimeric complexes, including those formed by the transcription elongation factor Eaf, the N‑terminal region of which is shown here. Credit: Google DeepMind/EMBL-EBI (CC-BY-4.0)

A database containing the predicted structures of nearly every known protein on Earth has grown even larger and become more useful for understanding how the building blocks of life work together.

For the first time, the AlphaFold protein-structure database will include predictions of complexes of proteins, with the addition of 1.7 million ‘homodimers’, each comprising two interacting copies of the same protein chain.

The freely available database, maintained by the European Molecular Biology Laboratory’s European Bioinformatics Institute (EMBL-EBI) in Hinxton, UK, currently holds around 200 million predictions of individual protein structures, made using the AlphaFold2 artificial-intelligence tool, developed by London-based firm Google DeepMind.

Since its release in 2021, this repository has become a bedrock in discovery and a first port of call for research projects that try to understand life at the molecular level. But previous iterations of the database lacked predictions of how proteins form complexes, which can be indispensable for their function. For instance, HIV-1 protease — a viral protein that is a key drug target — works only when two copies of the same protein form a working enzyme.


Such proteins were already included in the database as individual ‘monomers’ but their entries tell only part of their story. “We thought, ‘can we bring the AlphaFold database to the next level, where we can include a lot of complex predictions across the tree of life?’” says Martin Steinegger, a computational biologist at Seoul National University in South Korea, who was part of the effort.

Complex interactions

To make predictions for even small complexes of two proteins was a crucial challenge, says Steinegger. “It is quite a different beast than monomer predictions.” Protein-complex predictions are exceedingly intensive computationally, so a consortium — including Steinegger’s laboratory, EMBL-EBI, Google DeepMind and chipmaker NVIDIA in Santa Clara, California — was formed to take on the challenge.

The consortium focused on protein complexes from 20 of the most studied species, including humans, mice, yeast and bacteria that cause disease in humans, such as Mycobacterium tuberculosis.