A man opened a newspaper on March 23, 1997, and changed the world forever. On one of the pages of the American newspaper The Buffalo News was an eye-catching ad: “20 volunteers are being sought to participate in the Human Genome Project. […] The result will have a huge impact on the future progress of medicine.” That reader answered the call, donated a few milliliters of blood and joined a 3-billion-dollar project that led in 2003 to the so-called human reference genome, 70% of which consists of that man’s DNA, with remnants from another two dozen people. That genetic information did indeed change the history of humanity, but it has proved insufficient because it excludes the diversity of the human species. This Wednesday an international consortium publishes a more sophisticated alternative, built from the genetic sequences of 47 people from different regions of the planet. It is the first draft of the so-called human pangenome.

A person’s genome, their DNA, is the instruction manual present in each of their cells. It is a text of about 3,055 million letters (ATGGCGAGT…), in which each letter is simply the initial of a chemical compound made of different amounts of carbon, hydrogen, oxygen and nitrogen. G, for example, is guanine: C₅H₅N₅O. The genomes of any two people are 99.9% identical, but the remaining 0.1% amounts to millions of letters that make each human being unique and may hold the keys to their diseases. If the 2003 reference genome is a linear sequence, the new human pangenome can be imagined as a road map in which an individual genome is a particular path, in the words of Benedict Paten, a computational biologist at the University of California at Santa Cruz (United States) and one of the leaders of the research.
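The road-map image can be made concrete with a toy example. The sketch below is only an illustration, not the consortium's actual data model: the segment names, the sample names and the single variant are invented, and the only figures taken from this article are the letters ATGGCGAGT, the genome length of about 3,055 million letters and the 0.1% difference between two people.

```python
# Toy pangenome graph: nodes hold short DNA segments, and each genome is a path.
# All segment contents and sample names are invented for illustration.
segments = {
    "s1": "ATGGC",
    "s2": "G",    # variant carried by some genomes
    "s3": "T",    # alternative variant at the same position
    "s4": "AGT",
}

# A linear reference is one fixed path; the pangenome keeps several paths.
paths = {
    "linear_reference": ["s1", "s2", "s4"],
    "sample_a": ["s1", "s2", "s4"],
    "sample_b": ["s1", "s3", "s4"],
}


def spell(path):
    """Concatenate the segments along a path to recover that genome's sequence."""
    return "".join(segments[node] for node in path)


def percent_identity(a, b):
    """Share of matching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100 * matches / len(a)


# Rough count of letters that differ between two people, using the article's figures.
GENOME_LETTERS = 3_055_000_000
differing_letters = GENOME_LETTERS // 1000  # 0.1% of the genome
```

In this toy graph, `sample_b` takes a different road at one junction, which a single linear sequence has no way to represent. With the article's figures, 0.1% of the genome works out to roughly three million letters.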


The draft adds 119 million letters to the model used so far. The authors of the work, grouped in the Human Pangenome Reference Consortium, explain that the low diversity of the current reference genome has caused “a streetlight effect”, a phenomenon named after the joke about the drunk who searches the ground for his house keys at night, under a lit streetlight. A policeman tries to help him and, after a few minutes of unsuccessful searching, asks the drunk man whether he is sure that he lost the keys there. “No, I dropped them in the park, but this is where there is light,” the man replies. Scientists have spent two decades looking for possible genetic variants where it was easiest to look: within the limits of the reference genome, which in addition to ignoring human diversity was full of holes because the technology lacked precision.

Benedict Paten and his colleagues have worked for years to develop new tools capable of reading DNA with unprecedented accuracy, with just one error per 200,000 letters. Several members of the team have also participated in the T2T Consortium, which achieved the first truly complete sequence of a human genome a year ago. Until then, only 92% had been read. The remaining 8% was like the blue-sky pieces of a jigsaw puzzle: too repetitive to place easily.

A “fairer” medicine

The geneticist Karen Miga, from the University of California at Santa Cruz, declared at a press conference on Tuesday that the diversity of the pangenome ushers in a new, “more just” era in medicine. The 47 genomes incorporated so far come mainly from Africa (24) and the Americas (16), including four Peruvians from Lima, four Colombians from Medellín, and eight Puerto Ricans. Six genomes are Asian and only one is from Europe, a continent that is already overrepresented in genetic databases. The team’s goal is to reach 350 complete genomes in a single pangenome, to be published in mid-2024. The first draft is presented this Wednesday in the journal Nature.


The Spanish scientist Santiago Marco, who has developed algorithms and software tools for the pangenome, explains the magnitude of the technical challenge. Today’s machines cannot read a genome in one go; instead they read billions of tiny fragments, randomly and repeatedly. “Assembling a person’s genome is like reconstructing a great book, with 3,000 million letters, putting jumbled paragraphs and pages together, as if it were a great puzzle,” says Marco, from the National Supercomputing Center in Barcelona. “Building a reference pangenome may require processing 100 times more information,” he warns.
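The puzzle Marco describes can be illustrated at miniature scale. The sketch below is not the software used by the consortium. It is a deliberately simple greedy method that repeatedly glues together the two fragments sharing the longest overlap, and the three fragments are invented from the letters ATGGCGAGT quoted earlier. Real assemblers must also cope with reading errors, long repeats and billions of fragments, which this toy ignores.

```python
def overlap(a, b):
    """Length of the longest suffix of a that is also a prefix of b."""
    for size in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:size]):
            return size
    return 0


def greedy_assemble(fragments):
    """Repeatedly merge the two fragments with the largest overlap."""
    pieces = list(dict.fromkeys(fragments))  # drop exact duplicates, keep order
    while len(pieces) > 1:
        best_size, best_i, best_j = 0, None, None
        for i, a in enumerate(pieces):
            for j, b in enumerate(pieces):
                if i != j:
                    size = overlap(a, b)
                    if size > best_size:
                        best_size, best_i, best_j = size, i, j
        if best_size == 0:
            break  # no overlaps left; the remaining pieces cannot be joined
        merged = pieces[best_i] + pieces[best_j][best_size:]
        pieces = [p for k, p in enumerate(pieces) if k not in (best_i, best_j)]
        pieces.append(merged)
    return pieces


# Three jumbled, overlapping fragments (hypothetical reads of one short sequence).
reads = ["CGAGT", "ATGGC", "GGCGA"]
assembled = greedy_assemble(reads)
```

Here the three reads collapse back into the single sequence ATGGCGAGT. When a text contains long repeated stretches, several fragments overlap equally well and a greedy choice can go wrong, which is one reason highly repetitive regions of the genome resisted assembly for so long.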

The bioinformatician Francisco Martínez Jiménez uses the reference genome every day as a model to search for specific alterations in the tumors of patients at the Vall d’Hebron Institute of Oncology, in Barcelona. The specialist explains that if the patient’s ancestors are, for example, from South America, Africa or Southeast Asia, it is “much more difficult” to detect these alterations, because the current reference genome is made mainly from the DNA of people of European origin. “That there is genetic diversity in the pangenome is very relevant, particularly in cancer,” he applauds.

Martínez Jiménez has led the analysis of the complete genomes of more than 7,000 primary and metastatic tumors, across 71 types of cancer. His results, also published this Wednesday in the journal Nature, show that in certain types of tumors, such as those of the prostate, the thyroid and some of the breast, the genetic differences between the primary cancer and the metastasis are “very important”, while in others, such as those of the pancreas, they are subtle. “Metastases per se do not seem to be explained, in general, by a specific genomic alteration, but possibly by changes in the microenvironment of the tumor, such as a weakening of the immune system in certain locations or a greater blood supply, with more nutrients,” emphasizes the bioinformatician, who carried out the work at the University of Utrecht, in the Netherlands.

The biologist Benedict Paten stresses that the human pangenome is still a draft and asks for patience until a real impact on medicine is seen. “There are assembly mistakes (not too many, but some) that we knew we were going to make and that we want to correct,” he admits. Another co-author of the study, Erik Garrison, from the University of Tennessee, expressed his enthusiasm in a statement. In his opinion, the first draft of the human pangenome “is as exciting and unexpected as the first observations of unknown regions of our own planet or of the solar system, but in this case they are so close that they literally define our physical nature.”
