The human genome is like an instruction manual for the body. A scientific milestone was reached in 2003, when the $3 billion Human Genome Project concluded with a map containing the key sequence of our DNA, which was not fully completed until 2022. That reference was based on a small group of people of European origin and did not capture human diversity. Now researchers have presented the first draft of the human pangenome, a term that refers to the set of genes of our entire species, built from the genetic sequences of 47 individuals from different parts of the world, covering every continent except Antarctica.
This first draft is described in the journal Nature in a series of three articles, with a complementary one published in Nature Biotechnology, although the final objective of the Human Pangenome Reference Consortium is to include genetic material from 350 people in 2024. All of us carry in our DNA two copies of our genes, one inherited from the mother and one from the father, so this reference pangenome, built from 47 people, contains information from 94 individual genomes of different ethnic origins.
The individual genomes in the pangenome reference contain haplotype-resolved information, meaning that the two parental sets of chromosomes can be accurately distinguished, a major scientific feat. Having this information will help scientists better understand how various genes and certain diseases are inherited.
“We are introducing more diversity and equity into the reference by sampling diverse human beings and including them in this framework that everyone can use,” said Benedict Paten, associate director of the Genomics Institute at the University of California, Santa Cruz, and one of the project leaders. “One genome is not enough to represent everyone; the pangenome will ultimately be something inclusive and representative,” he added, because it provides a more complete picture and will allow more precise analyses when characterizing the genetic variability of the population, regardless of origin.
Pangenome, a big step towards personalized medicine
Building the pangenome was made possible by advanced computational techniques that align the multiple genome sequences into a single usable reference, in a structure called a pangenome graph. The methods used have ensured that all the genomes in the pangenome reference are of very high quality, covering more than 99% of each human genome with more than 99% accuracy.
The new pangenome has revealed 119 million new bases (the letters that make up the genome) and has identified new alleles in structurally complex regions that until now were not included in the reference genome. These new data will help researchers study regions of the genome for which there was previously no reference and, potentially, associate structural variants with diseases in future studies.
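The idea of a pangenome graph, and of counting bases that a single linear reference lacks, can be illustrated with a toy sketch. This is not the consortium's method or software: the node sequences, sample names and the helper functions `spell` and `novel_bases` are all hypothetical, chosen only to show how shared segments and variants can live in one structure while each haplotype remains a path through it.

```python
# Toy pangenome graph (illustrative only; all sequences and names are made up).
# Each node holds a stretch of DNA; each haplotype is a path through the nodes.
nodes = {
    1: "ACGT",    # segment shared by every haplotype
    2: "G",       # one allele at a variable site
    3: "T",       # alternative allele at the same site
    4: "CCA",     # shared segment
    5: "TTTTTT",  # insertion (a small "structural variant") in some haplotypes
    6: "GA",      # shared segment
}

# Haplotype-resolved paths: two per person, one per parental chromosome set.
paths = {
    "sample1_maternal": [1, 2, 4, 6],
    "sample1_paternal": [1, 3, 4, 5, 6],
    "sample2_maternal": [1, 3, 4, 6],
}


def spell(path):
    """Reconstruct the linear sequence of one haplotype from its path."""
    return "".join(nodes[node_id] for node_id in path)


def novel_bases(reference_path, all_paths):
    """Count bases in the graph that the chosen reference path does not visit."""
    reference_nodes = set(reference_path)
    visited = set()
    for path in all_paths.values():
        visited.update(path)
    return sum(len(nodes[node_id]) for node_id in visited - reference_nodes)


print(spell(paths["sample1_maternal"]))
print(novel_bases(paths["sample1_maternal"], paths))
```

If `sample1_maternal` plays the role of the old single reference, the bases held in nodes 3 and 5 are the ones this toy graph adds, which is the same kind of bookkeeping, at a vastly smaller scale, as counting the new bases contributed by additional genomes.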
Gemma Marfany, Professor of Genetics at the University of Barcelona (UB) and head of a CIBERER group, told SMC Spain: “It was already known that a large part of human genetic diversity resides in structural variants (large duplications and deletions), more than in point variants. This new pangenome has uncovered up to 1,115 new duplications, adding close to 119 million bases to the human genome (which contains 3,300 million bases), which represents a substantial improvement in quantity and quality.”
The researcher adds, however, that there are certain limitations to take into account: “They have only been able to sequence the complete genomes of 47 people of very different origins: 51% Africans (who are the most genetically diverse), 34% Americans, 13% Asians and only 2% Europeans, the smallest share because Europeans are already the most represented in genomic data. Although it is a great advance, they cannot represent all human genetic variability. The researchers propose to sequence 350 more genomes to increase representativeness and genetic diversity. On the other hand, the sequencing error rate is still relatively high. Although this error is one base in every 200,000, since billions of bases are sequenced per individual, there is still an associated error that must be taken into account and improved in the future.”
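The scale of that residual error can be estimated with simple arithmetic. The sketch below is a back-of-the-envelope calculation, not a figure reported by the consortium: it takes the two numbers quoted by Marfany (one error per 200,000 bases and a genome of about 3,300 million bases) and assumes, for simplicity, that errors are spread evenly along a single copy of the genome. The function name `expected_errors` is hypothetical.

```python
# Back-of-the-envelope estimate; assumes errors are spread evenly (a simplification).
BASES_PER_ERROR = 200_000      # quoted figure: one erroneous base per 200,000 bases
GENOME_SIZE = 3_300_000_000    # quoted figure: about 3,300 million bases


def expected_errors(genome_size, bases_per_error):
    """Expected number of erroneous bases in one copy of the genome."""
    return genome_size / bases_per_error


print(expected_errors(GENOME_SIZE, BASES_PER_ERROR))
```

Under these assumptions, a seemingly tiny error rate still leaves on the order of tens of thousands of uncertain bases per genome copy, which is why the error is described as something that still needs to be improved.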
“This new pangenome is going to become the new reference human genome that researchers will use in our daily work. It will substantially improve genetic diagnosis, both for rare diseases and, especially, for complex diseases, in which structural variants have been a persistent challenge, since they were not easily detectable with the current reference genome and the techniques used up to now. New algorithms will be developed that will allow greater accuracy in diagnosis.”
Speaking to Science, Heidi Rehm, a geneticist at Massachusetts General Hospital, said the new pangenome could also be a breakthrough for rare genetic diseases. These conditions are difficult to study because the mutations that cause them might not appear in the latest edition of the human reference genome, made in 2017 (known as GRCh38). The pangenome, she says, might be a better tool for identifying such gene mutations and diagnosing patients. “That’s significant.”
“This is not a conceptual leap, but a substantial advance in precision. The more knowledge we have about our genome, the greater the precision of the genetic inferences derived from analyzing it, particularly with regard to precision medicine, also called personalized medicine,” concludes Marfany.