Dovetail™ De Novo Genome Assembly Project

Catalogue #: 20012


Specifications

Delivery Time

18 weeks

Library

Dovetail™ Hi-C and/or Chicago® Library

Sequencing Platform

PacBio or 10X Genomics (de novo) and Illumina

Analysis Platform

Falcon (PacBio), Supernova (10XG), and HiRise™ (scaffolding)

Service Description

Dovetail™ de novo Assembly Projects are foundational to any type of genomic research. For any plant or animal species, let Dovetail Genomics produce a full-length genome assembly for you in our state-of-the-art facility. With over 1,000 assemblies completed to date, our genome assembly experts can spare you the frustrations and challenges of doing it yourself.

  • Tailored approach to assembly using proven methodologies
  • Designated Scientific Project Manager to answer questions and provide project guidance
  • Usable genome assembly within 18 weeks of successful DNA extraction
  • Highly accurate results that have been validated by multiple technology types
  • Access to Dovetail’s proprietary service offerings and expertise
  • High molecular weight DNA extraction
  • Initial draft assembly construction (includes de novo library preparation, sequencing, and assembly performed using technologies from Pacific Biosciences or 10X Genomics)
  • Dovetail™ library construction and sequencing (includes specialized combinations of our proprietary Chicago™ or Dovetail™ Hi-C library preparation methods, with sequencing on an Illumina system)
  • Final full-length genome assembly including informatics (scaffolding with our HiRise™ Assembly Pipeline and genome TAD analysis, when applicable)

PacBio de novo assemblies are built from PacBio SMRTbell libraries prepared in accordance with the manufacturer's recommendations and sequenced on the Sequel II System.

10x Chromium genome libraries are created and sequenced on an Illumina system. The data generated is assembled using Supernova 2 to create the de novo assembly.

Chicago™ Libraries use a combination of in vitro chromatin fixation, digestion and crosslink reversal to create a library type unique to Dovetail Genomics. This library type has been shown to improve ordering, orientation and contiguity even in highly accurate assemblies.

Dovetail™ Hi-C Libraries use a single restriction enzyme (DpnII) for chromatin digestion prior to proximity ligation. This library uses proven Hi-C chemistry well accepted in the genome assembly and chromatin conformation research fields.
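To illustrate what single-enzyme digestion means in practice, the minimal sketch below locates DpnII recognition sites (GATC) in a DNA string and reports the resulting fragment lengths. It is an in silico illustration only, not part of the Dovetail protocol. The function names `dpnii_cut_sites` and `fragment_lengths` are hypothetical, and the cut is assumed to fall immediately 5' of each GATC site.

```python
from __future__ import annotations


def dpnii_cut_sites(seq: str) -> list[int]:
    """Return 0-based positions of DpnII cuts (assumed just before each GATC)."""
    seq = seq.upper()
    sites = []
    pos = seq.find("GATC")
    while pos != -1:
        sites.append(pos)
        pos = seq.find("GATC", pos + 1)
    return sites


def fragment_lengths(seq: str) -> list[int]:
    """Lengths of the fragments produced by cutting at every DpnII site."""
    bounds = [0] + dpnii_cut_sites(seq) + [len(seq)]
    # Skip zero-length fragments, e.g. when a site sits at position 0.
    return [end - start for start, end in zip(bounds, bounds[1:]) if end > start]


example = "AAGATCTTTTGATCCC"
print(dpnii_cut_sites(example))   # cut positions
print(fragment_lengths(example))  # fragment sizes, which sum to len(example)
```

Because GATC is a four-base site, it occurs frequently in most genomes, which is one reason a single enzyme can yield dense proximity-ligation coverage.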

The HiRise™ Scaffolding Pipeline uses the de novo assembly as an input. It runs multiple iterations, layering in the library sequencing data at each pass, to produce a contiguous, high-quality assembly.
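Contiguity is commonly summarized with the N50 statistic: the scaffold length at which scaffolds of that length or longer contain at least half of the assembled bases. The sketch below computes it from FASTA-formatted lines using only the Python standard library. Whether the Dovetail report uses exactly this metric is an assumption, and the function names are hypothetical.

```python
from __future__ import annotations


def fasta_lengths(lines) -> list[int]:
    """Return the length of every sequence in an iterable of FASTA lines."""
    lengths = []
    current = None  # stays None until the first '>' header is seen
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if current is not None:
                lengths.append(current)
            current = 0
        elif line and current is not None:
            current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def n50(lengths: list[int]) -> int:
    """Largest length L such that sequences of length >= L hold half the bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0  # empty input


demo = [">s1", "ACGTACGT", ">s2", "ACGT", "AC", ">s3", "AC"]
print(n50(fasta_lengths(demo)))
```

For a real assembly, pass an open file handle instead of the demo list, for example `with open(path) as fh: n50(fasta_lengths(fh))`, where `path` points to the delivered FASTA file.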

When available, TAD analysis uses the data collected from the assembly to provide a view into chromatin 3-D architecture, including topologically associated domains (TADs).

Materials & Deliverables

A tissue or DNA sample of sufficient quality and any species-specific information (e.g. predicted genome size)

  • Final Dovetail™ Genome Assembly
    • A manifest detailing the contents of each file included in the delivery package.
    • The HiRise assembly in FASTA format.
    • A report with summary results for the assembly
    • A table detailing the breaks made to the input scaffolds.
    • A table describing the position of the input assembly scaffolds within the HiRise scaffolds.
    • BAM files
  • TAD analysis
    • Manifest – A manifest detailing the contents of each file included in the delivery package
    • Report.html – Summary statistics of the analysis, data processing information, and instructions on HiGlass browser visualization
    • alignment.bam – File containing sequence alignment data
    • X.mcool – Multi-resolution cooler file containing the Dovetail™ Hi-C proximity (contact) matrix
    • X.hic – Hi-C contact matrix at multiple resolutions in .hic format
    • X_isochores.bedpe – Isochore calls, i.e. regions of characteristic GC content within a genome
    • X.multires – Files which can be ingested in HiGlass viewer
    • Chr_sizes.txt – Chromosome size file – first column is chromosome name and the second the size of that chromosome
    • X_AB_compartments.bedpe – A/B compartments from the first eigenvector of the contact matrix
    • X_CTCF_sites.bed – Predicted CTCF binding sites using Cread
    • X_TADs_10000.bedpe – Topologically associated domain (TAD) calls using Arrowhead at 10,000 bp resolution
    • X_TADs_25000.bedpe – Topologically associated domain (TAD) calls using Arrowhead at 25,000 bp resolution
    • X_TADs_50000.bedpe – Topologically associated domain (TAD) calls using Arrowhead at 50,000 bp resolution
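
As a rough sketch of how some of these files might be consumed downstream, the following reads a two-column chromosome size file and summarizes TAD calls from a BEDPE file. It assumes whitespace-delimited columns, that the first three BEDPE columns are chromosome, start, and end, and that header or comment lines either start with `#` or have non-numeric coordinates. The delivered files may differ, so check the manifest first. All function names are hypothetical.

```python
from __future__ import annotations

from collections import defaultdict
from statistics import median


def read_chrom_sizes(lines) -> dict[str, int]:
    """Parse 'name<whitespace>size' lines into a {chromosome: size} mapping."""
    sizes = {}
    for line in lines:
        fields = line.split()
        if len(fields) >= 2 and not line.startswith("#") and fields[1].isdigit():
            sizes[fields[0]] = int(fields[1])
    return sizes


def tad_summary(bedpe_lines) -> dict[str, tuple[int, float]]:
    """Map chromosome -> (number of TAD calls, median TAD length in bp)."""
    lengths = defaultdict(list)
    for line in bedpe_lines:
        fields = line.split()
        if len(fields) < 3 or line.startswith("#"):
            continue
        chrom, start, end = fields[:3]
        if not (start.isdigit() and end.isdigit()):
            continue  # skip a header row that lacks a leading '#'
        lengths[chrom].append(int(end) - int(start))
    return {chrom: (len(v), median(v)) for chrom, v in lengths.items()}


# Tiny in-memory stand-ins for Chr_sizes.txt and X_TADs_10000.bedpe
sizes_demo = ["chr1\t1000000", "chr2\t500000"]
tads_demo = [
    "#chr1\tx1\tx2\tchr2\ty1\ty2",
    "chr1\t0\t100000\tchr1\t0\t100000",
    "chr1\t100000\t250000\tchr1\t100000\t250000",
    "chr2\t50000\t100000\tchr2\t50000\t100000",
]
print(read_chrom_sizes(sizes_demo))
print(tad_summary(tads_demo))
```

The summary reports counts and median lengths rather than total coverage because domain calls can be nested, so summing their lengths could overcount.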
