Among the findings from new genome assemblies, bats have lost genes related to immunity, which may play a role in how they transmit viruses to humans and other animals. (Shutterstock)

Vertebrate Genomes Project Publishes Lessons Learned From 16 Reference Genome Assemblies

In April of this year, the Vertebrate Genomes Project (VGP) announced its flagship study, which includes high-quality, near error-free, and near-complete reference genome assemblies for 16 species representing six major lineages of vertebrates, including mammals, reptiles, monotremes, amphibians and fish. The lessons learned from these first 16 genome assemblies will help guide the project toward its goal of producing reference genomes for the approximately 70,000 living vertebrate species. The availability of these genomic data would have wide applications, from basic research to conservation.

Harris Lewin, distinguished professor of evolution and ecology in the UC Davis College of Biological Sciences, who serves on the VGP’s leadership council, and postdoctoral researcher Joana Damas are among the coauthors on the Nature paper.

Taking advantage of dramatic improvements in sequencing technology, the study details numerous technological improvements based on these 16 genome assemblies including novel algorithms that put the pieces of the genome puzzle together.

“Completing the first vertebrate reference genome, human, took over 10 years and $3 billion dollars. Thanks to continued research and investment in DNA sequencing technology over the past 20 years, we can now repeat this amazing feat multiple times per day for just a few thousand dollars per genome,” said Adam Phillippy, chair of the VGP genome assembly and informatics working group of over 100 members and head of the Genome Informatics Section of the National Human Genome Research Institute at the NIH in Bethesda, Md.

Comparing bat genomes

The excellent quality of these genome assemblies enables novel discoveries with implications for characterizing biodiversity, conservation, and human health and disease.

Lewin and Damas contributed a comparison of six bat species’ genomes, with Canada lynx, platypus and chicken genomes as outgroups and the human genome as a reference. They were able to define conserved blocks of DNA and evolutionary breakpoints, showing that the rate of evolution accelerated in bats after the last mass extinction 66 million years ago.

“The completeness of these genomes allowed us to look very thoroughly at the breakpoint regions, which flank chromosomal rearrangements,” Damas said.

By doing that, they also found some chromosomal rearrangements associated with gene loss in bats, including one associated with loss of genes related to the immune system.

“We found bat species to have lost from two to the twelve genes present in this locus in humans,” Damas said.

These gene losses could be related to bat-specific differences in immunity to infectious agents including SARS-CoV-2, the virus that causes COVID-19, Lewin said.

Data from the project could also play a role in protecting rare and endangered species. Genomic analysis of the vaquita, a small porpoise and the most endangered marine mammal, carried out in collaboration with officials in Mexico, suggests that harmful mutations are purged from these small populations in the wild, giving hope for the species’ survival.

Thousands of species, hundreds of scientists

The VGP involves hundreds of international scientists working together from more than 50 institutions in 12 different countries. As the first large-scale eukaryotic genomes project to produce reference genome assemblies meeting a specific minimum quality standard, the VGP has thus become a working model for other large consortia, including the Bat 1K, Pan Human Genome Project, Earth BioGenome Project, Darwin Tree of Life, and European Reference Genome Atlas, among others. The Earth BioGenome Project is chaired by Lewin, and the secretariat of the project is located on the UC Davis campus.

As a next step, the VGP will work to complete phase one of the project, which covers approximately one representative species from each of 260 vertebrate orders separated by a minimum of 50 million years from a common ancestor.

Phase two will focus on representative species from each vertebrate family and is currently in the process of sample identification and fundraising.

The project also collaborated with DNAnexus and Amazon to generate a publicly available VGP assembly pipeline and host the genomic data in the Genome Ark database. The genomes, annotations and alignments are also available in international public genome browsing and analysis databases. All data are open source and publicly available under the G10K data use policies.
