How does AlphaFold work?
Last updated: April 1, 2026
Key Facts
- AlphaFold uses deep neural networks trained on approximately 170,000 known protein structures from the Protein Data Bank
- The system analyzes multiple sequence alignments and evolutionary information to infer structural constraints on how proteins can fold
- AlphaFold's architecture includes the Evoformer (processes evolutionary data) and Structure Module (predicts atomic coordinates)
- AlphaFold reports a per-residue confidence score (pLDDT, on a scale from 0 to 100) that indicates how reliable each part of the prediction is
- DeepMind released AlphaFold2 as open-source software, enabling researchers worldwide to predict millions of protein structures for drug discovery and biological research
What is AlphaFold?
AlphaFold is an artificial intelligence system developed by DeepMind (a subsidiary of Alphabet, Google's parent company) that predicts the three-dimensional structures of proteins from their amino acid sequences. The first version competed in the CASP13 structure prediction assessment in 2018. Its successor, AlphaFold2, was presented to widespread acclaim at CASP14 in 2020 and was widely described as a breakthrough on the "protein folding problem", a challenge that had stumped scientists for over 50 years. The system predicts how proteins fold into their functional three-dimensional shapes with accuracy that is often comparable to experimental methods, providing crucial insights for drug discovery, disease understanding, and biological research. DeepMind released the AlphaFold2 source code as open-source software in 2021, accelerating biological research globally.
Understanding the Protein Folding Problem
Proteins are long chains of amino acids that fold into precise three-dimensional structures essential for their biological function. Understanding these structures is critical for medicine: misfolded proteins are implicated in diseases such as Alzheimer's and Parkinson's, and accurate protein structures help researchers design effective drugs. Although scientists could determine protein sequences (the order of amino acids) relatively easily using DNA sequencing, predicting how a protein would physically fold into its 3D shape proved extraordinarily difficult. The number of possible conformations grows exponentially with the length of the chain, so exhaustively testing every possibility is computationally infeasible. This unsolved challenge remained one of biology's grand problems until deep learning provided a breakthrough.
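To make the scale of the search space concrete, here is a back-of-the-envelope estimate in the spirit of Levinthal's paradox. The numbers are illustrative assumptions, not measurements: 3 possible states per residue, a 100-residue chain, and a hypothetical sampling rate of 10^13 conformations per second.

```python
# Rough estimate of how long brute-force enumeration of conformations would take.
# All parameters are illustrative assumptions, not values from the article.

SECONDS_PER_YEAR = 365.25 * 24 * 3600


def count_conformations(n_residues: int, states_per_residue: int = 3) -> int:
    """Number of conformations if every residue independently picks one state."""
    return states_per_residue ** n_residues


def years_to_enumerate(
    n_residues: int,
    states_per_residue: int = 3,
    samples_per_second: float = 1e13,
) -> float:
    """Years needed to visit every conformation once at the given rate.

    Note: the int-to-float conversion overflows for very long chains;
    it is fine for the short chain used in this illustration.
    """
    total = count_conformations(n_residues, states_per_residue)
    return total / samples_per_second / SECONDS_PER_YEAR


if __name__ == "__main__":
    n = 100
    print(f"{count_conformations(n):.3e} conformations")
    print(f"{years_to_enumerate(n):.3e} years to enumerate")
```

Under these assumptions the estimate comes out on the order of 10^27 years, far longer than the age of the universe, which is why exhaustive search was never a realistic route to structure prediction.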
How AlphaFold Uses Deep Learning
AlphaFold employs deep neural networks, a form of machine learning loosely inspired by the brain, to predict protein structures. The system was trained on approximately 170,000 known protein structures from the Protein Data Bank, learning patterns about how amino acids interact and which configurations are likely to be stable and functional. Rather than following explicit rules programmed by humans, AlphaFold learns implicit patterns from data. The neural network consists of multiple layers of interconnected nodes that process information about the protein sequence, gradually building up a representation of how that particular protein is likely to fold.
Evolutionary Insights and Sequence Alignment
A crucial innovation in AlphaFold is its use of evolutionary information. The system analyzes multiple sequence alignments, which compare the target protein sequence with similar proteins from other organisms across evolutionary history. These alignments reveal which positions are conserved (unchanged) across evolution and which vary. Conserved positions typically have structural or functional importance. Pairs of positions that tend to mutate together are also informative: when a change at one position is regularly accompanied by a change at another, the two residues are often close together in the folded structure. By analyzing these evolutionary patterns, AlphaFold infers constraints on how the protein can fold. Proteins that perform similar functions across different organisms often maintain similar structures, and this evolutionary relationship provides powerful guidance for structure prediction.
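The kind of signal an alignment carries can be illustrated with two classical statistics: per-column entropy (low entropy means a conserved position) and mutual information between columns (high values mean the two positions vary together). This is a minimal sketch on a hypothetical toy alignment. It is not how AlphaFold itself processes alignments, since the network learns from the alignment directly rather than from hand-computed statistics.

```python
import math
from collections import Counter

# Hypothetical toy alignment: each row is one homologous sequence,
# each column is one aligned position.
TOY_MSA = [
    "MKAVL",
    "MRAIL",
    "MKGVL",
    "MRGIL",
]


def column(msa, j):
    """Return the residues found at aligned position j."""
    return [seq[j] for seq in msa]


def entropy(symbols):
    """Shannon entropy in bits; 0 means the position is fully conserved."""
    counts = Counter(symbols)
    n = len(symbols)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def mutual_information(msa, i, j):
    """Mutual information between columns i and j, in bits.

    Computed as H(i) + H(j) - H(i, j); higher values mean the two
    positions tend to change together across the alignment.
    """
    a, b = column(msa, i), column(msa, j)
    return entropy(a) + entropy(b) - entropy(list(zip(a, b)))


if __name__ == "__main__":
    for j in range(len(TOY_MSA[0])):
        print(f"position {j}: entropy = {entropy(column(TOY_MSA, j)):.2f} bits")
    print(f"MI(1, 3) = {mutual_information(TOY_MSA, 1, 3):.2f} bits")
    print(f"MI(1, 2) = {mutual_information(TOY_MSA, 1, 2):.2f} bits")
```

In this toy alignment, positions 0 and 4 never change (conserved), positions 1 and 3 always change together (co-varying), and positions 1 and 2 vary independently of each other.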
The Folding Algorithm and Architecture
AlphaFold2's architecture has two main components. The Evoformer processes the evolutionary information and builds representations of how pairs of amino acids relate to each other. The Structure Module then turns those representations into predicted atomic coordinates in three dimensions. Rather than predicting individual atoms one at a time, the system works holistically, considering how different protein regions interact and influence each other. The network also refines its predictions iteratively, feeding its own output back in as input for further passes (a procedure the AlphaFold2 authors call recycling). This end-to-end learning approach allows the system to capture complex relationships about protein folding that would be difficult to encode manually.
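The data flow described above can be sketched as a toy pipeline. Everything here is a stand-in chosen for illustration: the function names, the one-number-per-residue embedding, and the straight-line "structure" are assumptions, and nothing in it is learned. The only parts meant to mirror the real design are the overall shape of the computation (a per-sequence representation, a per-residue-pair representation, a coordinate output) and the loop that feeds the previous output back in, which the AlphaFold2 paper calls recycling.

```python
import math
from typing import List, Optional, Tuple

Matrix = List[List[float]]
Coords = List[Tuple[float, float, float]]

CA_SPACING = 3.8  # typical distance between consecutive alpha carbons, in angstroms


def embed_msa(msa_rows: List[str]) -> Matrix:
    """Toy embedding: one number per (sequence, residue). Real models use learned vectors."""
    return [[(ord(ch) - ord("A")) / 25.0 for ch in row] for row in msa_rows]


def outer_product_mean(msa_rep: Matrix) -> Matrix:
    """Turn per-sequence features into per-pair features by averaging products over sequences."""
    n_seq, n_res = len(msa_rep), len(msa_rep[0])
    return [
        [sum(row[i] * row[j] for row in msa_rep) / n_seq for j in range(n_res)]
        for i in range(n_res)
    ]


def add(a: Matrix, b: Matrix) -> Matrix:
    """Element-wise sum of two equally sized matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def evoformer_stand_in(msa_rep: Matrix, pair_rep: Matrix) -> Tuple[Matrix, Matrix]:
    """Placeholder for the Evoformer: updates the pair representation from the alignment."""
    return msa_rep, add(pair_rep, outer_product_mean(msa_rep))


def structure_module_stand_in(pair_rep: Matrix) -> Coords:
    """Placeholder for the Structure Module: emits one (x, y, z) per residue.

    The geometry is a straight line, not a prediction.
    """
    return [(CA_SPACING * i, 0.0, 0.0) for i in range(len(pair_rep))]


def pairwise_distances(coords: Coords) -> Matrix:
    """Distance between every pair of residues in the current structure."""
    return [[math.dist(a, b) for b in coords] for a in coords]


def predict(msa_rows: List[str], num_recycles: int = 3) -> Coords:
    """Run the toy pipeline, recycling the previous structure into the pair features."""
    n_res = len(msa_rows[0])
    msa_rep = embed_msa(msa_rows)
    pair_rep = [[0.0] * n_res for _ in range(n_res)]
    coords: Optional[Coords] = None
    for _ in range(num_recycles + 1):
        if coords is not None:
            # Recycling: the previous output becomes extra input for this pass.
            pair_rep = add(pair_rep, pairwise_distances(coords))
        msa_rep, pair_rep = evoformer_stand_in(msa_rep, pair_rep)
        coords = structure_module_stand_in(pair_rep)
    return coords


if __name__ == "__main__":
    for xyz in predict(["MKAVL", "MRAIL"]):
        print(xyz)
```

The point of the sketch is the control flow: two representations are updated together, coordinates are produced from the pair representation, and the result is fed back for another pass.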
Confidence Scores and Practical Application
AlphaFold provides more than just structure predictions: it also reports a confidence score called pLDDT (predicted local distance difference test) for each residue, on a scale from 0 to 100. These scores help researchers understand which parts of the prediction are highly trustworthy and which are more uncertain. This transparency is crucial for practical applications. Researchers can rely on high-confidence regions for drug design or functional analysis while treating low-confidence regions with appropriate caution. The system achieved remarkable accuracy on blind test datasets, often coming close to experimental results from expensive and time-consuming laboratory techniques like X-ray crystallography.
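As a small illustration of how per-residue confidence scores might be used in practice, the sketch below labels each score and finds contiguous stretches that fall below a cutoff. The band boundaries (90, 70, 50) follow the ranges commonly shown alongside AlphaFold predictions. The function names, the example scores, and the default cutoff of 70 are assumptions made for this example.

```python
from typing import List, Tuple


def confidence_band(plddt: float) -> str:
    """Map a pLDDT score (0 to 100) to a coarse confidence label."""
    if plddt >= 90:
        return "very high"
    if plddt >= 70:
        return "confident"
    if plddt >= 50:
        return "low"
    return "very low"


def low_confidence_regions(
    scores: List[float], threshold: float = 70.0
) -> List[Tuple[int, int]]:
    """Return 1-based, inclusive (start, end) residue ranges scoring below the threshold."""
    regions = []
    start = None
    for idx, score in enumerate(scores, start=1):
        if score < threshold:
            if start is None:
                start = idx  # a low-confidence stretch begins here
        elif start is not None:
            regions.append((start, idx - 1))  # the stretch ended at the previous residue
            start = None
    if start is not None:
        regions.append((start, len(scores)))  # stretch runs to the end of the chain
    return regions


if __name__ == "__main__":
    example_scores = [95.0, 92.0, 65.0, 40.0, 88.0, 30.0]  # made-up values
    for residue, score in enumerate(example_scores, start=1):
        print(residue, score, confidence_band(score))
    print(low_confidence_regions(example_scores))
```

A filter like this lets downstream analysis focus on the well-predicted parts of a structure and flag the rest for caution or experimental follow-up.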
Impact on Biology and Medicine
AlphaFold's impact has been transformative. DeepMind released AlphaFold2 as open-source software and, together with EMBL-EBI, created the AlphaFold Protein Structure Database, which gives the scientific community free access to predicted structures. This gave researchers worldwide rapid access to predicted structures for millions of proteins, accelerating drug development, the understanding of disease mechanisms, and the engineering of new proteins with desired functions. The breakthrough demonstrated that deep learning could make decisive progress on longstanding scientific problems, opening the possibility of applying AI to other major research challenges.
Related Questions
Why is protein folding important?
A protein's structure determines its function in biological processes. Misfolded proteins are implicated in diseases like Alzheimer's, and understanding correct structures is essential for drug design. Predicting structures helps researchers develop treatments and engineer new proteins for medicine and biotechnology.
What is deep learning and how does it differ from traditional AI?
Deep learning uses neural networks with multiple layers to learn patterns from data rather than following explicitly programmed rules. Unlike traditional rule-based AI, which requires humans to encode domain knowledge by hand, deep learning discovers implicit patterns automatically, making it powerful for complex problems like protein structure prediction.
How accurate are AlphaFold's protein structure predictions?
In the blind CASP14 assessment, AlphaFold2 reached a median GDT score of about 92 out of 100 across targets, a level that is often comparable to experimental structures obtained through X-ray crystallography and cryo-EM. Accuracy varies between proteins and between regions of the same protein, so the system provides a confidence score for each part of a prediction, allowing researchers to assess reliability before using a structure for drug design and biological research.
Sources
- Wikipedia - AlphaFold CC-BY-SA-4.0
- DeepMind - AlphaFold Research Fair Use
- AlphaFold Server - EBI Fair Use