The biochemist David Baker (Seattle, United States, 61 years old) leads a technological revolution that could change science and medicine forever. To understand its full potential, you have to travel to the depths of a living being: its DNA. This molecule stores all the instructions needed to make proteins, written in combinations of four letters: A, C, G and T. Proteins are responsible for almost any biological process you can imagine, from a tree growing to a firefly glowing in the dark to a person thinking, breathing, digesting food and everything in between, thanks to at least 20,000 different proteins.

Understanding how proteins take their shape has been one of the most daunting problems in biology for more than half a century. The transcription and translation of DNA within cells produce a linear sequence of amino acids arranged in single file. But in a fraction of a second that string folds in on itself to form precise three-dimensional structures capable of embracing, cutting, gluing, absorbing, transmitting and producing. Nothing in the DNA or amino acid sequence makes it obvious what shape the final molecule will have, and it is the shape that determines the function. Calculating all the possible shapes of a single protein with conventional methods could take longer than 13.7 billion years, the age of the universe. And in nature there are hundreds of millions of different proteins.
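To get a feel for why trying every shape is hopeless, here is a rough back-of-the-envelope sketch. The chain length, the number of shapes per amino acid and the testing speed are illustrative assumptions, not figures from this article; only the age of the universe comes from the text above.

```python
# Rough brute-force folding estimate. Every input except the age of the
# universe is an illustrative assumption, not a measured value.
STATES_PER_RESIDUE = 3          # assumed shapes each amino acid can adopt
CHAIN_LENGTH = 100              # assumed length of a small protein
SHAPES_PER_SECOND = 1e13        # assumed rate at which shapes could be tested
SECONDS_PER_YEAR = 365.25 * 24 * 3600
AGE_OF_UNIVERSE_YEARS = 13.7e9  # figure quoted in the article


def brute_force_years(chain_length=CHAIN_LENGTH,
                      states=STATES_PER_RESIDUE,
                      rate=SHAPES_PER_SECOND):
    """Years needed to try every possible shape once at the assumed rate."""
    total_shapes = states ** chain_length
    return total_shapes / rate / SECONDS_PER_YEAR


if __name__ == "__main__":
    years = brute_force_years()
    print(f"Years needed: {years:.2e}")
    print(f"Ages of the universe: {years / AGE_OF_UNIVERSE_YEARS:.2e}")
```

Even with these generous assumptions, the number of years dwarfs the age of the universe, which is why pattern-finding methods rather than exhaustive search were needed.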

Last summer, AlphaFold, an artificial intelligence system developed by DeepMind, a company owned by Google, predicted the shape of almost all known proteins: more than 200 million. The historic achievement was made possible by deep learning systems. These sets of algorithms mimic the functioning of neurons in the human brain. Although they are still far from matching the capacity of our brain, they are very efficient at finding patterns in huge databases. Thanks to these systems, working out the shape of a protein now takes minutes instead of years.

David Baker’s laboratory at the University of Washington (United States) goes a step further. It has developed several open artificial intelligence systems that create proteins that have never existed in nature. RoseTTAFold and its successors make it possible to design, with unprecedented ease, new proteins with remarkable functions, such as blocking all variants of the covid virus or fighting diseases with no known cause, such as Crohn’s disease or idiopathic pulmonary fibrosis. His team is perfecting a set of tools to “speak a protein”, that is, to describe its function aloud and have the computer provide its complete sequence. The team also wants users to be able to supply only part of a protein and have the system autocomplete it, as in a Google search.

Baker has just won the BBVA Foundation Frontiers Award for Biomedicine together with Demis Hassabis and John Jumper of DeepMind. In this interview, conducted via videoconference, he talks about the enormous potential of this technology. One of his most achievable goals is to create a nasal spray that uses artificially engineered proteins to block the entry of influenza, respiratory syncytial virus, coronavirus and other winter respiratory pathogens.

Q. You say that this technology will change the world more than the Stone Age or the Industrial Revolution. Why?

A. Until recently, all the proteins we knew of were those created by nature over the course of evolution. They were like an elvish language that was given to us. Until now, what we did was take those old proteins and make small modifications to them to get new functions. In the same way, humans picked up stones and struck them until they were sharp; this is how the first tools of the Stone Age were made. Now, for the first time, we can create new proteins from scratch that do exactly what we want them to do. It is a human technology that takes us beyond the possibilities of biology.

Q. What applications will it have?

A. The first thing we are going to see is an impact on medicine, with better and cheaper drugs. About seven years ago we began to develop an icosahedron-shaped protein [a polyhedron with 20 faces]. It looked very similar to the envelope of many viruses, but it was totally artificial. My colleague Neil King added the coronavirus receptor-binding domain to it, and it turned out that the molecule elicited strong immunity against the actual virus. A few years later, one of our first proteins has already been approved as a covid vaccine and is used in humans in Korea, for example. We are also looking for proteins that improve cancer treatments and others that are capable of generating solar energy or serving as new materials. The possibilities are almost endless.

Q. What is the limit of this technology?

A. One way to gauge the limit is to consider evolution. Everything that living things on this planet are capable of is due to proteins. And all those proteins were created by pure chance, in a random process of mutation and selection, with no set plan. Now think about the fact that, for the first time, people can design new proteins to solve problems at will. The possibilities go far beyond what we can imagine.

Q. Your new system can design proteins on demand by talking to the computer. Could you, for example, ask for a protein that cures Alzheimer’s?

A. We can give a simple description of a problem, and deep learning systems will provide the sequence of a protein with those properties. But the system is still not perfect. Once a new molecule has been designed by computer, it must be made in the laboratory using conventional methods and checked to confirm that it has the desired properties. The novelty is that we can now use nature to speed up this step. Once we have the amino acid sequence of our protein, we encode it into a DNA sequence, a synthetic gene, which we then introduce into a bacterium. That bacterium basically becomes our protein factory. Can we design a protein that cures Alzheimer’s? The problem is that we do not fully understand the cause of this disease. We have created molecules that bind to the pathological protein fibrils that characterize it, but we do not know whether those fibrils are the cause. So there is still a long process of trial and error ahead, and that’s the tricky part. The problem of designing proteins is solved. The challenge is knowing how to formulate the problem we want to solve. We need a molecular hypothesis. And for that you have to understand the origin of the disease.
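The step Baker describes, turning an amino acid sequence into a synthetic gene, can be illustrated with a minimal sketch. It is not his laboratory’s software: the function names are hypothetical, and the choice of a single codon per amino acid is an arbitrary simplification (real gene design weighs many more factors). The codons themselves come from the standard genetic code.

```python
# One valid DNA codon per amino acid (standard genetic code). Which codon
# is used for each amino acid is an arbitrary choice made for this sketch.
CODON_FOR = {
    "A": "GCG", "R": "CGT", "N": "AAC", "D": "GAT", "C": "TGC",
    "Q": "CAG", "E": "GAA", "G": "GGC", "H": "CAT", "I": "ATT",
    "L": "CTG", "K": "AAA", "M": "ATG", "F": "TTT", "P": "CCG",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAT", "V": "GTG",
}
STOP_CODON = "TAA"


def reverse_translate(protein):
    """Return a DNA sequence encoding a protein given in one-letter code."""
    try:
        codons = [CODON_FOR[aa] for aa in protein.upper()]
    except KeyError as err:
        raise ValueError(f"Unknown amino acid: {err.args[0]}") from None
    return "".join(codons) + STOP_CODON


def translate(dna):
    """Read DNA three letters at a time back into a protein string.

    Only understands the codons listed in CODON_FOR, so it is meant for
    checking the output of reverse_translate, not arbitrary genes.
    """
    amino_acid_for = {codon: aa for aa, codon in CODON_FOR.items()}
    protein = []
    for i in range(0, len(dna), 3):
        codon = dna[i:i + 3]
        if codon == STOP_CODON:
            break
        protein.append(amino_acid_for[codon])
    return "".join(protein)


if __name__ == "__main__":
    gene = reverse_translate("MKT")  # a made-up three-residue example
    print(gene)
    print(translate(gene))
```

Reading the gene back with translate is a simple sanity check that the encoding round-trips to the original protein.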

Q. How reliable is this technique?

A. It depends on the problem. For a simple question, the success rate is 75%. It is so new that we are still learning. In the case of influenza, for example, we have been able to design proteins and test them in a matter of weeks. This can be very useful if you have to react to a new pandemic. But with more complex problems it is still very difficult. Degrading plastic, for example, is such a broad problem that we still do not know how to address it well.

Q. One of your goals is to develop a nasal spray that we could use to protect ourselves from many respiratory viruses at once. When do you think it might be available?

A. It depends, because economic factors come into play. This type of drug would not be very profitable for pharmaceutical companies, so we would have to see whether any company, government or non-profit organization wanted to develop it. It is a very common problem in the field of infectious diseases. But, from a technological point of view, I think that this year we will know if these sprays work against covid. And if they do, it’s reasonable to expect that they will work against other respiratory viruses as well.

Q. Do you see any danger in this technology?

A. Nature has already perfected systems that cause death and destruction on a scale far beyond anything human. Think of the 1918 flu, which was tremendously lethal and spread rapidly. If you are a criminal, you don’t really need to design proteins with artificial intelligence, because the genetic sequences of that virus, or of Ebola, for example, are already available to you. As with any technology this powerful, we’ll have to make sure it’s not misused, but I think the risk is small.

Q. Does it concern you that DeepMind is owned by Google and that it is so secretive about the work it does?

A. There is a big difference between my laboratory, which is totally open, receives visitors from all over the world and shares information, and a company like DeepMind, which is totally closed. When DeepMind publicized one of its great advances in this field two years ago, there were many apocalyptic comments from the scientific community, warning that at this rate the big technology companies would be the only ones to master this technology. I think the fact that we created RoseTTAFold, a system open to everyone, helped DeepMind eventually open its systems to the public as well, because I’m sure there were people within the company who preferred to keep them secret and make money from them. DeepMind is still very secretive, and I think this asymmetry between them and us will come at a cost. In the last few days alone we have had visits here from leading researchers in Alzheimer’s, in solar energy systems and in the development of new cancer drugs. Being an open system gives you many more ideas. The free exchange of information benefits the advancement of science.

