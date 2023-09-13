The American architect Irving Geis received an unusual assignment in 1961: to draw by hand the first protein structure revealed by X-rays. It was myoglobin, responsible for oxygenating muscles and for the red color of meat. It is a kind of necklace of 153 pearls, folded into eight tangled helices. It took Geis six months to draw it, but his effort awakened worldwide fascination with that invisible inner world. Now, science has accelerated. The artificial intelligence company DeepMind, owned by Google, managed last year to accurately predict the structure of more than 200 million proteins, almost all the known ones. A Spanish bioinformatician, Iñigo Barrio, has helped organize that chaos by grouping the proteins according to their shape. His work reveals surprising data. Human beings have 13 exclusive structures, which do not appear in any other living being. A bacterium ubiquitous in soil, Acidobacteria bacterium, has almost 1,900 unique shapes.

The DNA of a living being is a recipe book for making proteins, the basic building blocks of life. Human beings have about 30,000 different types, which handle essential functions such as generating energy, supporting the body and defending it against viruses. They are large and complex molecules, some of them with simple shapes (spheres, cylinders, rings, stars, spirals and even swastikas) and others with unimaginable structures, such as hemoglobin, which transports oxygen through the blood from the lungs to the rest of the body. It has thousands of atoms of carbon, hydrogen, nitrogen, oxygen, sulfur and iron. Its formula is C₂₉₅₂H₄₆₆₄N₈₁₂O₈₃₂S₈Fe₄.

Barrio, born in Pamplona 36 years ago, has faced this tidal wave at the European Bioinformatics Institute, in Hinxton (United Kingdom). The researcher and his colleagues have developed a new algorithm, called Foldseek Cluster, capable of identifying similar patterns in this vast disorder. Barrio applied the tool to the AlphaFold database, a jungle of 215 million proteins. The team has identified 2.3 million types of structures, more than 700,000 of them previously unknown. Understanding the structure of a protein is essential to understanding its function and, potentially, to designing new drugs, as the researchers explain in their study, published this Wednesday in the journal Nature, a showcase of the world's best science.

“There is almost always a relationship between the structure of a protein and its function. Almost always. In biology you should never say always,” says Barrio, who recently joined the Wellcome Sanger Institute, also in Hinxton, very close to Cambridge. His work has managed to link proteins of known function with others that remain unexplored. “If proteins A and B have a very similar structure, you can infer that they will have a similar function,” explains the researcher. His work is reminiscent of that of an archaeologist who unearths mysterious prehistoric tools. “If you discover something shaped like a pick, you may guess that it is used for digging, but there are exceptions. A fork and a comb look very similar, but they are not used for the same thing,” he warns.

The AlphaFold database includes predictions made by DeepMind and the European Bioinformatics Institute, part of the European Molecular Biology Laboratory, an organization with more than 1,800 staff at sites in Spain, France, Germany, Italy and the United Kingdom. Analysis of the 215 million proteins suggests that most of the structures appeared very early in the evolution of living beings, in the common ancestor of animals and plants or even earlier. Only 4% of the configurations appear to be specific to a single species.

“Humans have 13 groups of proteins with unique structures,” emphasizes Barrio. The figure contrasts with those of the five organisms that present the most unique three-dimensional shapes: the bacteria Acidobacteria bacterium, Escherichia coli and Chloroflexi bacterium, the Asian spider Araneus ventricosus and the pharaoh cuttlefish, with between 1,400 and 1,900 exclusive structures each. “We tend to see evolution as a linear process, but it is more of a tree. We are at the end of a branch, but bacteria have continued to evolve on their own branches. There are bacteria newer than us,” explains the bioinformatician. “Also, developing a new framework for a new problem is not always the best way to evolve. Often, structures are recycled. There are proteins of the human species that possibly have a different function from the one they had in our ancestors,” argues Barrio.

Bioinformatician Iñigo Barrio, photographed this Wednesday at the Wellcome Sanger Institute, in Hinxton (United Kingdom). Wellcome Sanger Institute

The British company DeepMind boasts that its artificial intelligence system reaches 95% accuracy. However, nine of the 13 uniquely human structures are based on predictions with high uncertainty, possibly because they are especially disordered conformations, according to Barrio. The remaining four are VPS53, involved in transport within cells; U54, a herpesvirus protein integrated into the human genome; annexins, which are involved in trafficking across cell membranes; and a fourth, little-studied protein that may be no more than a fragment. The 30,000 types of human proteins are grouped into about 9,000 structures.

Another of the main authors of the study, the Portuguese bioinformatician Pedro Beltrao, highlights the discovery of human proteins that are involved in the immune system and are very similar to bacterial proteins of unknown function. “This suggests that the proteins involved in the immune system could have an ancient evolutionary origin, which we share with species of bacteria. If true, this could transform what we know about immunity,” said Beltrao, of the Swiss Federal Institute of Technology in Zurich (Switzerland), in a statement.

The biologist Julia Domingo considers the new work, in which she did not participate, “very necessary.” “We are entering a new era of big data, and we need new tools to process, analyze and use it at high speed,” she reflects. Domingo developed, together with colleagues from the Center for Genomic Regulation (CRG), in Barcelona, a method to identify a kind of hidden button that changes the function of proteins. Domingo warns that structure alone is not enough to determine a protein's function. “Other layers of functionality are involved, such as energies and affinity for other proteins,” she points out.

It took the architect Irving Geis six months to draw myoglobin in 1961. The British chemist who provided him with the data, John Kendrew, won the 1962 Nobel Prize in Chemistry for reading that first structure with X-rays. The possibilities now opening up with artificial intelligence and new algorithms are unimaginable, according to Iñigo Barrio. “With previous methods, it would have taken us 10 years to do this work. It took us five days,” he says.

