The tools available for capturing genomic information have evolved dramatically over the past decade, yet plant and animal reference genomes remain haploid. Through advances in Dovetail® proximity ligation (Hi-C) technology, it is now possible to capture genetic variation along with ultra-long-range genomic sequence information in a single assay. A notable benefit of this enhanced data type is the ability to phase SNPs over extremely large distances.
The newest Dovetail® de novo assembly workflow utilizes a unique combination of:
- PacBio HiFi sequencing for best-in-class base calling accuracy
- Dovetail® Omni-C® scaffolding for unprecedented SNP coverage and long-range information
Unlike traditional Hi-C approaches that digest chromatin with sequence-specific and biased restriction enzymes, Dovetail® Omni-C® technology utilizes DNase I to randomly fragment chromatin. This generates unbiased coverage of the genome, with no blind spots, enabling SNP detection rates approaching those achieved with standard shotgun libraries. As a result, assemblies produced from this data display extremely long haplotype blocks when phased, up to full chromosome length.
What is the benefit of a chromosome-scale reference genome with phased SNPs? While haploid reference genomes have proven highly informative, they do not tell the full story. Having access to SNP phasing information at chromosome scale opens up new opportunities to better understand:
- distribution of heterozygosity
- allele-specific epigenetic events
- cis versus trans mutations
- and more.
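As a rough illustration of the cis versus trans question, the sketch below classifies two heterozygous variants from their phased genotypes. It relies only on the VCF convention that `|` separates phased alleles and that the `PS` field names a phase set. The function names are hypothetical, and the sketch assumes both variants sit on the same chromosome in a diploid sample.

```python
from typing import Optional, Tuple


def parse_phased_gt(gt: str) -> Optional[Tuple[int, int]]:
    """Return (haplotype-1 allele, haplotype-2 allele) for a phased
    diploid genotype such as '0|1'; return None for anything else."""
    if "|" not in gt:
        return None  # unphased ('0/1') or missing
    parts = gt.split("|")
    if len(parts) != 2 or not all(p.isdigit() for p in parts):
        return None
    return int(parts[0]), int(parts[1])


def alt_haplotype(gt: str) -> Optional[int]:
    """Index (0 or 1) of the haplotype carrying the non-reference allele,
    or None unless exactly one haplotype carries it."""
    parsed = parse_phased_gt(gt)
    if parsed is None:
        return None
    carriers = [i for i, allele in enumerate(parsed) if allele != 0]
    return carriers[0] if len(carriers) == 1 else None


def cis_or_trans(gt_a: str, gt_b: str, ps_a=None, ps_b=None) -> str:
    """Classify two heterozygous variants as 'cis', 'trans' or 'unknown'.

    Assumption: when no PS values are supplied, both calls are treated
    as belonging to one phase set.
    """
    hap_a, hap_b = alt_haplotype(gt_a), alt_haplotype(gt_b)
    if hap_a is None or hap_b is None:
        return "unknown"
    if ps_a != ps_b:
        return "unknown"  # phase is not comparable across phase sets
    return "cis" if hap_a == hap_b else "trans"


if __name__ == "__main__":
    print(cis_or_trans("0|1", "0|1", ps_a=100, ps_b=100))
    print(cis_or_trans("0|1", "1|0", ps_a=100, ps_b=100))
```

The longer the phase blocks, the more variant pairs fall inside a single phase set and can be classified this way rather than reported as unknown.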
As just one example of how this new approach is being applied, Dovetail Genomics® and Dr. Philip Lavretsky of the University of Texas at El Paso recently began a collaborative project aimed at leveraging Variant Call Format (VCF) genome assemblies in a population genetics (popgen) study design. “My interest lies in bridging the gap between evolutionary and wildlife genetics as an informative means for conservation and management efforts,” states Dr. Lavretsky. “Specifically, I believe that our ability to identify and understand what species, population, etc. are in regard to adaptive and non-adaptive traits is essential when attempting to establish potential plans. In addition to conservation implications, I am interested in understanding the underlying evolutionary drivers impacting genomes as a means to understand the primary drivers of speciation, particularly at the earliest stages of divergence.”
Dr. Lavretsky’s primary organism of study is the mallard. While most mallards exist in their wild-type state, ongoing mallard domestication is increasing wild/domestic hybridization within the species, with potentially harmful effects on wild populations. In this study with Dovetail, supported in part by Dr. Lavretsky’s recent NSF-DEB award (award number 2010704), Dr. Lavretsky is re-sequencing 121 birds with various states of wild/domestic hybridization. Of these 121 birds, 15 representative individuals (e.g., wild-type parents, domesticated parents, and F1, F2 and F3 individuals) will be sub-sampled and re-sequenced using the Omni-C technology. Dr. Lavretsky states, “I am excited to build very large haplotype blocks for the 15 sub-sampled birds using this unique datatype. In addition, the Omni-C data is expected to reveal large structural variations, indels and TADs that will further empower the study.”
Dr. Lavretsky’s presentation at Dovetail’s Genomes of Animals & Plants Virtual Conference (GAP2021).
We at Dovetail hope this collaboration will provide a recipe and framework that will enable others to similarly utilize Omni-C data. To bring this paradigm shift to the research community, Dovetail has recently begun enrolling applicable projects in an Early Access program. Through this unique program, customers will now receive their de novo assembly in VCF, with heterozygous SNPs called and phased in blocks potentially spanning entire chromosomes.
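For readers wondering how block length might be checked in a phased VCF, here is a minimal sketch that groups phased heterozygous calls by the standard `PS` (phase set) field and reports the span of each block. It is not Dovetail's pipeline. The function name and the toy records are hypothetical, and the sketch assumes uncompressed, tab-delimited VCF lines with `GT` and `PS` present in the FORMAT column.

```python
from collections import defaultdict


def phase_block_spans(vcf_lines, sample_index=0):
    """Summarize phase blocks for one sample.

    Returns {(chrom, phase_set): (start, end, span_bp, n_het_snps)}.
    """
    positions = defaultdict(list)
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        chrom, pos = fields[0], int(fields[1])
        keys = fields[8].split(":")
        values = fields[9 + sample_index].split(":")
        call = dict(zip(keys, values))
        gt, ps = call.get("GT", ""), call.get("PS", ".")
        if "|" not in gt or ps == ".":
            continue  # unphased call or no phase set assigned
        if len(set(gt.split("|"))) < 2:
            continue  # homozygous calls do not inform phase
        positions[(chrom, ps)].append(pos)
    return {
        key: (min(p), max(p), max(p) - min(p) + 1, len(p))
        for key, p in positions.items()
    }


# Toy records (hypothetical data), built with explicit tabs.
TOY_VCF = [
    "\t".join(r)
    for r in [
        ["chr1", "100", ".", "A", "G", ".", "PASS", ".", "GT:PS", "0|1:100"],
        ["chr1", "5000", ".", "C", "T", ".", "PASS", ".", "GT:PS", "1|0:100"],
        ["chr1", "9000", ".", "G", "A", ".", "PASS", ".", "GT:PS", "0/1:."],
        ["chr1", "20000", ".", "T", "C", ".", "PASS", ".", "GT:PS", "0|1:20000"],
        ["chr1", "25000", ".", "T", "C", ".", "PASS", ".", "GT:PS", "1|1:20000"],
    ]
]

if __name__ == "__main__":
    for key, summary in phase_block_spans(TOY_VCF).items():
        print(key, summary)
```

In a chromosome-scale phased assembly, one would expect a single phase set per chromosome whose span approaches the chromosome length, whereas fragmented phasing shows up as many short blocks.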