Date: 2024-09-06

In 1970, British biologist John Maynard Smith came up with a simple word game that captures one of the most complex problems in existence. The evolution of proteins, the molecular machines that perform all the functions of life, would be comparable to travelling between two four-letter words by changing just one letter at a time. For example, in Spanish: to get from mito (myth) to rayo (lightning) in the minimum number of steps, you go through rito (rite) and rato (a while), leaving aside non-words like raho, which evolution would discard.
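
The word game can be sketched in a few lines of Python. This is only an illustration, not Maynard Smith's own formulation: it uses the Spanish words behind the example (mito, rito, rato, rayo), and the mini-dictionary `VIABLE_WORDS` and the function names are hypothetical. A breadth-first search looks for the shortest chain of real words in which each step changes a single letter.

```python
from collections import deque

# Hypothetical mini-dictionary of "viable" words.
# The non-word "raho" is deliberately absent, so no path may pass through it.
VIABLE_WORDS = {"mito", "rito", "rato", "rayo", "reto", "mano"}


def differs_by_one(a, b):
    """True if the two words have equal length and differ in exactly one letter."""
    return len(a) == len(b) and sum(x != y for x, y in zip(a, b)) == 1


def shortest_path(start, goal, words=VIABLE_WORDS):
    """Breadth-first search for the shortest chain of single-letter changes."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for word in sorted(words):
            if word not in seen and differs_by_one(path[-1], word):
                seen.add(word)
                queue.append(path + [word])
    return None  # the goal cannot be reached through viable words


print(shortest_path("mito", "rayo"))
```

With this toy dictionary, the search returns the same route the example describes. A word with no one-letter neighbours in the dictionary, such as "mano" here, cannot be reached at all, which is the analogue of a protein that evolution cannot get to step by step.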

In this way, all the proteins of interest would be connected, at a greater or lesser distance, in a continuous space. The Spanish alphabet has 27 letters; proteins have 20 basic units, the amino acids. In 2010, the American engineer Frances Arnold realized that this simple game summarized an unfathomable space: the number of possible variants of a single small protein of 100 units is 20 to the power of 100, far more than the number of atoms in the universe.
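
The size of that space is easy to check. The short Python sketch below computes 20 to the power of 100 exactly and compares it with an estimate of the number of atoms in the observable universe. The figure of 10 to the power of 80 is an assumption, a commonly quoted order of magnitude that does not come from the article, and the variable names are illustrative.

```python
AMINO_ACIDS = 20       # basic units available at each position
PROTEIN_LENGTH = 100   # a small protein of 100 units

# Python integers have arbitrary precision, so this is exact.
sequence_space = AMINO_ACIDS ** PROTEIN_LENGTH

# Assumption: commonly quoted order-of-magnitude estimate, not from the article.
ATOMS_IN_UNIVERSE_ESTIMATE = 10 ** 80

print(f"Possible sequences: about {sequence_space:.2e}")
print(f"Digits in that number: {len(str(sequence_space))}")
print(f"More than the atom estimate: {sequence_space > ATOMS_IN_UNIVERSE_ESTIMATE}")
```

Under that assumption, the number of possible sequences exceeds the atom count by roughly fifty orders of magnitude.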

Nature has explored only a tiny fraction of the entire universe of possible proteins over billions of years of evolution. There are countless more to discover, and “we can only dream of the immensity of their capabilities,” reasoned Arnold, who won the Nobel Prize in Chemistry in 2018 for her work on the directed evolution of these molecules.

For the first time, artificial intelligence (AI) is making it possible to explore that entire universe. Its growing computing power has already described the three-dimensional shape of all the proteins that nature has invented, some 200 million, and even predicted their interactions, an achievement announced just three months ago by a company linked to Google.

It is disturbing that the reasoning of this artificial intelligence – or that of any other – is inaccessible to humans. The machine gives the correct answer, but does not detail how it got there. Its creators are also unable to find out, and other scientists are shut out, since the technology company does not release the source code of its machine. AI is a black box.

This is not just the case with the huge questions of science, but also with a self-driving car that runs a red light and kills a pedestrian, or a financial AI that denies a mortgage to a promising candidate because of the colour of his skin.

Noelia Ferruz, a chemist specialising in bioinformatics born in Zaragoza 36 years ago, has just been chosen by the prestigious European Research Council (ERC) to solve this problem by creating a public, open and self-explanatory artificial intelligence. It will be called Athena.

“Today, artificial intelligence is already at the level of a PhD in chemistry,” explains the scientist, who heads her own research group at the Center for Genomic Regulation (CRG), in Barcelona. But thanks to its ability to devise and study compounds that nature has not invented, it will soon have a “supernatural” power, because it will go beyond the limits of nature.

The researcher is the daughter of a housewife and a mechanic who had to go to work without finishing high school. Ferruz has just received 1.5 million euros to develop this system over five years. It will be an “intelligent agent”, a new type of AI capable of analysing different kinds of information: the text of a protein sequence, the three-dimensional image of the resulting molecule, the video of its different moving parts. Alongside Ferruz, a team of three bioinformaticians will train the AI and two molecular biologists will test the new molecules it designs in the laboratory. The researchers will inform the system whether the experiments have worked, so that it will learn from its mistakes. It is one more step towards a more human-like intelligence.

The aim is to design new proteins with a specific function, above all ones that can save lives when used as drugs, such as more effective antibodies against cancer. Ferruz wants to focus on enzymes, small proteins that speed up biochemical reactions. “25% of current medicines have carbon-fluorine bonds, the synthesis of which is expensive and polluting, which contributes to some costing 60,000 euros per dose,” the researcher explains. “Enzymes have the capacity to make this bond as well, but they are not used for this purpose yet.” Another possibility is to create an enzyme that binds to bisphenol A, a compound in plastic that interferes with human and animal reproduction, and neutralises it.

The most difficult part of the project will not be creating this agent, but understanding it. Current language models such as ChatGPT are made up of layers of artificial neurons. “Each one performs an operation and passes it on to the next. In the end we have up to a billion neurons in the latest versions. We see a constellation of neurons being activated, but we don’t understand why; and the model doesn’t know how to explain it to us,” explains Ferruz.

Her goal is to understand how it arrives at its result, which in the field is called explainable AI. “It is quite laborious, but we can open it layer by layer, see which neurons are being activated in response to certain stimuli, show us what it has learned and understand how it generates protein sequences better than a human,” she explains. The research has a surprising parallel with the study of the brain (100 billion neurons that establish 100 trillion connections), with the added paradox that, in this case, we have created this intelligence.
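
The idea of opening a model layer by layer can be illustrated with a toy network. The Python sketch below is not the method of Ferruz's group and bears no relation to Athena's design: the weights in `LAYERS`, the ReLU activation and the function names are all assumptions chosen for illustration. Each artificial neuron performs one operation and passes the result to the next layer, and the code records which neurons fire along the way.

```python
def relu(x):
    """A common activation: negative signals are silenced, positive ones pass."""
    return max(0.0, x)


# Hypothetical toy weights: two layers of two neurons each.
# Each inner list holds one neuron's weights for the previous layer's outputs.
LAYERS = [
    [[1.0, -1.0], [0.5, 0.5]],   # layer 0
    [[1.0, 1.0], [-1.0, 2.0]],   # layer 1
]


def forward_with_trace(inputs, layers=LAYERS):
    """Run the inputs through every layer, keeping each layer's activations."""
    trace = []
    activations = list(inputs)
    for weights in layers:
        activations = [
            relu(sum(w * a for w, a in zip(neuron, activations)))
            for neuron in weights
        ]
        trace.append(activations)
    return activations, trace


def active_neurons(trace):
    """For each layer, list the indices of the neurons that fired (output > 0)."""
    return [[i for i, value in enumerate(layer) if value > 0] for layer in trace]


output, trace = forward_with_trace([1.0, 2.0])
print(active_neurons(trace))
```

In a real language model the same bookkeeping spans an enormous number of neurons, which is why reading off the active ones is only the first step towards an explanation.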

Sitting in a glass-walled office, just a few steps from the shore of the Mediterranean, Ferruz says with a smile that she wanted to call this system Pandora. A colleague reminded her that all the evils that plague humanity came from that box, according to the myth. “I asked ChatGPT and it suggested Athena. She is the Greek goddess of wisdom, so it is quite fitting, although she is also a goddess of war.”

The ERC is the most prestigious and demanding scientific body in the European Union. Every year, it selects emerging projects from young researchers, who can then continue their careers with new rounds of funding. Ferruz’s project is one of 33 selected in Spain, out of a total of 494.

