DeepMind has predicted the structure of almost every protein so far catalogued by science, cracking one of the grand challenges of biology in just 18 months thanks to an artificial intelligence called AlphaFold. Researchers say that the work has already led to advances in combating malaria, antibiotic resistance and plastic waste, and could speed up the discovery of new drugs.
Determining the crumpled shapes of proteins based on their sequences of constituent amino acids has been a persistent problem for decades in biology. Some of these amino acids are attracted to others, some are repelled by water, and the chains form intricate shapes that are hard to accurately determine.
UK-based AI company DeepMind first announced in late 2020 that it had developed a method to accurately predict the structure of folded proteins, and by the middle of 2021 it had revealed that it had mapped 98.5 per cent of the proteins used within the human body.
Today, the company announced that it is publishing the structures of more than 200 million proteins – nearly all of those catalogued on the globally recognised repository of protein research, UniProt.
DeepMind has worked with the European Molecular Biology Laboratory’s European Bioinformatics Institute (EMBL-EBI) to create a searchable store of all this information that can be easily and freely accessed by researchers around the world. Ewan Birney at EMBL-EBI calls the AlphaFold Protein Structure Database “a gift to humanity”.
“As someone who’s been in genomics and computational biology since the 1990s, I’ve seen many of these moments come where you can sense the landscape shifting under you and the provision of new resources, and this has been one of the fastest,” he says. “I mean, two years ago, we just simply did not realise that this was feasible.”
Already delivering results
Demis Hassabis, CEO of DeepMind, says that the database makes finding a protein structure – which previously often took years – “almost as easy as doing a Google search”. DeepMind is owned by Alphabet, Google’s parent company.
The archive has already been used by scientists to advance research in a number of areas. Matt Higgins at the University of Oxford and his colleagues were researching a protein that they believed was key to interrupting the lifecycle of the malaria parasite, but were struggling to map its structure.
“One of the experimental methods that we use is X-ray crystallography,” says Higgins. “We cause the proteins to form into lattices, fire X-rays at them and get information from those X-ray diffraction patterns to see what the molecule looks like. But we were never able, despite many years of work, to see in sufficient detail what this molecule looks like.”
But when AlphaFold was released, it gave a clear prediction of the structure of the protein that matched the information the researchers had been able to glean. They have now been able to design new proteins that they hope could serve as an effective malaria vaccine.
Birney says that using X-ray crystallography to map the structure of a protein is expensive and time-consuming. “That means that experimentalists have to make choices about what they do, and AlphaFold hasn’t had to make choices,” he says. “I think we can be confident that there are new experiments and new insights coming through due to AlphaFold, which will impact ‘how does this particular parasite work’ or ‘why does this particular disease happen in humans’, for example.”
Researchers have also used AlphaFold to engineer new enzymes to break down plastic waste and to learn more about the proteins that make bacteria resistant to antibiotics.
Work still to be done
Keith Willison at Imperial College London says that AlphaFold has unarguably “changed the world” of biological research, but that there are still problems to be solved in protein folding.
“As soon as AlphaFold came out it was wonderful. You just take your favourite proteins and look them up now rather than having to make crystals,” he says. “I did the crystallographic structure of a protein complex, it took me about eight years. People are joking that crystallographers are going to be unemployed.”
But Willison points out that AlphaFold isn’t able to take any arbitrary string of amino acids and model exactly how they fold. Instead, it is only able to use parts of proteins and their structures that have been experimentally determined to predict how a new protein will fold.
While the tool is often, even usually, extremely accurate, its structures are always predictions rather than explicitly calculated results. Nor has AlphaFold yet solved the complex interactions between proteins, or even made a dent in a small subset of structures, known as intrinsically disordered proteins, that seem to have unstable and unpredictable folding patterns.
“Once you discover one thing, then there are more problems thrown up,” says Willison. “It’s quite terrifying actually, how complicated biology is.”
Tomek Wlodarski at University College London says that AlphaFold has had an enormous impact on many areas of biology, but that there are improvements to be made on accuracy, and that developing a model of how proteins fold – not just predicting their final structure – is a problem that DeepMind is yet to tackle.
Wlodarski says AlphaFold isn’t perfect, although it does indicate which parts of a prediction have a high accuracy and which it is less confident in.
“We introduced a mutation, which we know experimentally completely unfolds the protein, but AlphaFold gave me the same structure as it gave without this mutation,” he says. “I did another test: I was removing residues from one end of the protein, because we know that with our protein, if you chop nine residues from one of the ends it will completely unfold the protein. And I managed to chop half of the protein sequence, and the algorithm still predicted it as a completely folded protein with exactly the same structure. So there are these problems.”
Pushmeet Kohli, who leads DeepMind’s scientific team, says that the company isn’t done with proteins yet and is working to improve the accuracy and capabilities of AlphaFold.
“We know the static structure of proteins, but that’s not where the game ends,” he says. “We want to understand how these proteins behave, what their dynamics are, how they interact with other proteins. Then there’s the other area of genomics where we want to understand how the recipe of life translates into which proteins are created, when are they created and the working of a cell.”