De Novo Genome Assembly Services

Catalog # 20012

Sometimes the fastest way to get something done is to have an expert do it for you. We’re here to help you build a full de novo assembly from scratch.

We first build a draft assembly using PacBio HiFi, the most accurate long-read sequencing platform available today. This assembly is then scaffolded to chromosome scale using Omni-C® proximity ligation technology and HiRise® scaffolding software. Unlike traditional Hi-C methods, which rely on one or more restriction enzymes, Omni-C digests chromatin with a sequence-independent endonuclease for even, unbiased whole-genome coverage. This combination of PacBio HiFi (accuracy) and Omni-C (coverage) enables chromosome-scale phased SNP calling, giving you an enriched assembly dataset. Finally, we annotate your assembly to call and label as many genes as possible.

Why Use Our Service

Trust your project to the most experienced de novo assembly service provider in the world, with over 1,600 assemblies and counting.

  • Enjoy full service – from sample to publishable assembly under one roof.
  • Get access to the latest cutting-edge technologies: PacBio HiFi, Dovetail® Omni-C, HiRise software algorithm, and the Dovetail® Annotation Service Suite.
  • Receive a full genome: don’t settle for half a genome. Chromosome-scale phased SNP calling is new for 2021 assemblies!

How It Works

Just send us your flash-frozen tissue sample and we will deliver a complete, highly accurate, and contiguous annotated genome assembly. A dedicated project manager will track your project at all times and be available for pre- and post-project discussions. Quality control checks at every step ensure assembly accuracy.

Experience sets Dovetail apart from the rest. Every species has a unique genome, so each one is a novel assembly project. However, having assembled the genomes of more than 1,600 species, we typically know what to expect going into a project. We can also refer to similar species we have assembled to ensure that optimal taxon-specific protocols are followed.


Dovetail took what seemed like an impossible project — getting an assembly for a plant with a large, repetitive genome — and made it a reality. They collaborated with different bioinformatics groups & did additional analyses to make sure we got the best results possible.

Sonal Singhal, California State University, Dominguez Hills


Delivery Time: Inquire
Sample Requirement: A tissue or DNA sample of sufficient quality, and any species-specific information (e.g. predicted genome size).
Library: Dovetail Omni-C, Dovetail Hi-C and/or Chicago® Library
Sequencing Platform: PacBio and Illumina
Analysis Platform: Falcon (PacBio) and HiRise (scaffolding)
Potential Project Deliverables: Final Dovetail Genome Assembly
Genome Assembly

  • A manifest detailing the contents of each file included in the delivery package
  • The HiRise assembly in FASTA format
  • A report with summary results for the assembly
  • A table detailing the breaks made to the input scaffolds
  • A table describing the position of the input assembly scaffolds within the HiRise scaffolds
  • BAM files
  • VCF files containing SNP phase information
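
As a quick sanity check on a delivered FASTA file, scaffold lengths and a contiguity figure such as N50 can be computed in a few lines of Python. The sketch below is illustrative only: the function names are hypothetical, and N50 is a widely used community metric, not something the delivery report is stated to contain.

```python
from typing import Dict, Iterable, List


def scaffold_lengths(fasta_lines: Iterable[str]) -> Dict[str, int]:
    """Return {scaffold_name: length} from the lines of a FASTA file."""
    lengths: Dict[str, int] = {}
    name = None
    for line in fasta_lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Header line: use the first whitespace-delimited token as the name.
            parts = line[1:].split()
            name = parts[0] if parts else ""
            lengths[name] = 0
        elif name is not None:
            # Sequence line: add its length to the current scaffold.
            lengths[name] += len(line)
    return lengths


def n50(lengths: List[int]) -> int:
    """Length L such that scaffolds of length >= L cover at least half the total."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0  # empty input
```

To use it on a real file, pass an open file handle, for example `scaffold_lengths(open("assembly.fasta"))`, where `assembly.fasta` is a placeholder for the FASTA file named in the delivery manifest.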

Chromatin Topology TAD analysis

  • Manifest – A manifest detailing the contents of each file included in the delivery package
  • Report.html – Summary statistics of the analysis, data processing information, and instructions on HiGlass browser visualization
  • alignment.bam – File containing sequence alignment data
  • X.mcool – Multi-resolution cooler file containing the Dovetail™ Hi-C proximity contact matrix
  • X.hic – HiC contact matrix at multiple resolutions in .hic format
  • X_isochores.bedpe – Output of the isochore-calling program (isochores are regions of characteristic GC content within a genome)
  • X.multires – Files that can be loaded into the HiGlass viewer
  • Chr_sizes.txt – Chromosome size file: the first column is the chromosome name and the second is the size of that chromosome
  • X_AB_compartments.bedpe – A/B compartments from the first eigenvector of the contact matrix
  • X_CTCF_sites.bed – Predicted CTCF binding sites using Cread
  • X_TADs_10000.bedpe – Topologically associated domain (TAD) calls using arrowhead at 10,000 bp resolution
  • X_TADs_25000.bedpe – Topologically associated domain (TAD) calls using arrowhead at 25,000 bp resolution
  • X_TADs_50000.bedpe – Topologically associated domain (TAD) calls using arrowhead at 50,000 bp resolution
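
The chromosome size file and the TAD call files are plain text, so they are easy to cross-check. The sketch below reads a two-column size file as described above and a BEDPE file, then keeps only the TAD intervals that fall inside their chromosome. It assumes the standard BEDPE layout (chromosome, start, and end in the first three columns) and that header or comment lines begin with `#`; neither detail is confirmed for the delivered files, and the function names are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple

Interval = Tuple[str, int, int]


def parse_chrom_sizes(lines: Iterable[str]) -> Dict[str, int]:
    """Read 'name size' pairs: first column is the name, second the size."""
    sizes: Dict[str, int] = {}
    for line in lines:
        fields = line.split()
        if line.startswith("#") or len(fields) < 2:
            continue  # skip comments and blank or malformed lines
        sizes[fields[0]] = int(fields[1])
    return sizes


def parse_bedpe_intervals(lines: Iterable[str]) -> List[Interval]:
    """Return (chrom, start, end) from the first three BEDPE columns."""
    intervals: List[Interval] = []
    for line in lines:
        fields = line.split()
        if line.startswith("#") or len(fields) < 6:
            continue  # assumed comment/header convention; BEDPE needs 6+ columns
        try:
            start, end = int(fields[1]), int(fields[2])
        except ValueError:
            continue  # tolerate a header line that lacks a leading '#'
        intervals.append((fields[0], start, end))
    return intervals


def intervals_within_bounds(intervals: List[Interval],
                            sizes: Dict[str, int]) -> List[Interval]:
    """Keep intervals on known chromosomes with 0 <= start < end <= size."""
    return [(c, s, e) for c, s, e in intervals
            if c in sizes and 0 <= s < e <= sizes[c]]
```

Each parser accepts any iterable of lines, so an open file handle for `Chr_sizes.txt` or one of the `X_TADs_*.bedpe` files can be passed in directly.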

