Dovetail® Scaffolding

Catalog # 20011

A highly contiguous genome assembly is critical for efficient and accurate downstream analyses. Improve the contiguity of any existing genome assembly with a scaffold N50 > 100 kb using Dovetail® Omni-C® scaffolding technology and the HiRise® software algorithm. Unlike traditional Hi-C methods, which rely on one or more restriction enzymes, Omni-C digests chromatin with a sequence-independent endonuclease for even, unbiased whole-genome coverage. HiRise is our industry-leading scaffolding software, which we have used to assemble the genomes of >1,600 species. Software accuracy coupled with manual correction by a qualified bioinformatician will get you a chromosome-scale assembly you can trust.

Why Add a Scaffolding Project to your De Novo Genome Assembly Service?

For the vast majority of genomes, long-read sequencing is insufficient to yield a chromosome-scale assembly. A Dovetail® Scaffolding Project will improve contiguity for the highest quality assembly and will enable you to:

  • Enjoy full service – from sample to publishable assembly under one roof.
  • Get access to the latest cutting-edge technologies: PacBio HiFi, Dovetail Omni-C, HiRise software and the Dovetail® Annotation Service Suite.
  • Receive a full genome rather than settling for half of one: chromosome-scale phased SNP calling is new for 2021 assemblies!

How It Works

Ship us a flash-frozen tissue sample and upload your existing assembly with a scaffold N50 > 100 kb. A qualified bioinformatician will check the assembly for accuracy and completeness. If your assembly passes our QC thresholds, a dedicated project manager will set up your scaffolding project.
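As a rough pre-check before uploading, the N50 requirement can be estimated from the assembly FASTA. The sketch below is illustrative only and is not the QC we run: the function names are hypothetical, and reading "most contigs over 1 kb" as "more than half of the sequences" is an assumption.

```python
from typing import Iterable, List


def read_fasta_lengths(lines: Iterable[str]) -> List[int]:
    """Return the length of each sequence in FASTA-formatted lines."""
    lengths: List[int] = []
    current = None  # length of the record being read, None before the first header
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if current is not None:
                lengths.append(current)
            current = 0
        elif current is not None:
            current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def n50(lengths: List[int]) -> int:
    """Length L such that sequences of length >= L hold at least half of all bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0  # empty input


def meets_input_requirements(lengths: List[int],
                             min_n50: int = 100_000,
                             min_len: int = 1_000) -> bool:
    """Assumed reading of the sample requirement: N50 > 100 kb and
    more than half of the sequences longer than 1 kb."""
    if not lengths:
        return False
    longer = sum(1 for n in lengths if n > min_len)
    return n50(lengths) > min_n50 and longer > len(lengths) / 2
```

To run it on a file, pass an open handle, for example `read_fasta_lengths(open("assembly.fasta"))`, where the file name is a placeholder.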

We will construct Omni-C proximity ligation libraries, QC the libraries, sequence them deeply with Illumina PE 150, and scaffold your assembly using the HiRise® algorithm. Topologically associated domains (TADs) will be reported if present. A detailed project report will accompany your high-quality, chromosome-scale assembly. A post-project discussion with your project manager is available upon request.

A. ipaensis & A. duranensis

Dovetail Hi-C produced incredible scaffold ordering for our assemblies of the diploid ancestral species of peanut, A. duranensis and A. ipaensis. The new orderings made even clearer the incredibly close relationships between the genomes of these diploid species and the tetraploid crop.

David Bertioli, University of Georgia


Delivery Time: Inquire
Sample Requirement: An input draft assembly with > 100 kb N50 and most contigs over 1 kb, a tissue or DNA sample of sufficient quality, and any species-specific information (e.g., predicted genome size).
Library: Dovetail Omni-C Library
Sequencing Platform: Illumina
Analysis Platform: HiRise (scaffolding)
Potential Project Deliverables:

Final Dovetail Genome Assembly

  • A manifest detailing the contents of each file included in the delivery package.
  • The HiRise assembly in FASTA format.
  • A report with summary results for the assembly.
  • A table detailing the breaks made to the input scaffolds.
  • A table describing the position of the input assembly scaffolds within the HiRise scaffolds.
  • BAM files.
  • VCF files containing SNP phase information.

Chromatin Topology TAD analysis

  • Manifest – A manifest detailing the contents of each file included in the delivery package
  • Report.html – Summary statistics of the analysis, data processing information, and instructions on HiGlass browser visualization
  • alignment.bam – File containing sequence alignment data
  • X.mcool – Multi-resolution cooler file containing the Dovetail® Hi-C proximity contact matrix
  • X.hic – Hi-C contact matrix at multiple resolutions in .hic format
  • X_isochores.bedpe – Output of program which calls isochores – regions of characteristic GC content within a genome
  • X.multires – Files that can be loaded into the HiGlass viewer
  • Chr_sizes.txt – Chromosome size file – first column is chromosome name and the second the size of that chromosome
  • X_AB_compartments.bedpe – A/B compartments from the first eigenvector of the contact matrix
  • X_CTCF_sites.bed – Predicted CTCF binding sites using Cread
  • X_TADs_10000.bedpe – Topologically associated domain (TAD) calls using Arrowhead at 10,000 bp resolution
  • X_TADs_25000.bedpe – Topologically associated domain (TAD) calls using Arrowhead at 25,000 bp resolution
  • X_TADs_50000.bedpe – Topologically associated domain (TAD) calls using Arrowhead at 50,000 bp resolution
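The text deliverables above can be read with a few lines of Python. This is a minimal sketch under stated assumptions: the chromosome size file is whitespace-delimited with the name in the first column and the size in the second, as described above, and the TAD files are assumed to follow the common BEDPE convention in which the first three columns are chromosome, start, and end. The function names are hypothetical, and the actual delivered files may carry additional columns or header lines.

```python
from typing import Dict, Iterable, List, Tuple


def parse_chrom_sizes(lines: Iterable[str]) -> Dict[str, int]:
    """Map chromosome name (column 1) to chromosome size (column 2)."""
    sizes: Dict[str, int] = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        sizes[fields[0]] = int(fields[1])
    return sizes


def parse_tad_intervals(lines: Iterable[str]) -> List[Tuple[str, int, int]]:
    """Return (chromosome, start, end) for each TAD call.

    Assumes BEDPE-style columns; comment lines and any header line
    whose start column is not an integer are skipped.
    """
    tads: List[Tuple[str, int, int]] = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 3 or not fields[1].isdigit() or not fields[2].isdigit():
            continue
        tads.append((fields[0], int(fields[1]), int(fields[2])))
    return tads
```

For example, `parse_chrom_sizes(open("Chr_sizes.txt"))` returns a dictionary of sizes, and the TAD tuples from `parse_tad_intervals` can be checked against it to confirm that every call lies within its chromosome.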

