Verkko v1.4
These are the release notes for Verkko version 1.4, which was released on June 27, 2023. Verkko is a hybrid genome assembly pipeline developed for telomere-to-telomere assembly of accurate long reads (PacBio HiFi or Oxford Nanopore Duplex) and Oxford Nanopore ultra-long reads.
The source code distribution contains everything you need to create a binary distribution for your own specific OS. Please report any issues you encounter.
Citation
- Rautiainen M, Nurk S, Walenz BP, Logsdon GA, Porubsky D, Rhie A, Eichler EE, Phillippy AM, Koren S. Telomere-to-telomere assembly of diploid chromosomes with Verkko. Nat Biotechnol. (2023). doi:10.1038/s41587-023-01662-6
Minimum Requirements
- 8GB minimum memory; 16GB strongly suggested
- GCC 9 or newer (for compilation only)
- Rust 1.58 or newer (for compilation only)
- Python 3.5 or newer, with parasail module
- Snakemake 7.0 or newer
- MashMap 2.0 or newer (for filtering known sequences and Hi-C)
- GraphAligner v1.0.17 or newer
- Winnowmap
- MBG v1.0.14 or newer
Installation
Users can download Verkko as source code or install it through a package manager such as conda. The source code package needs to be compiled and installed before it can be used. Do NOT download the .zip source code. It is missing files and will not compile. This is a known flaw with git itself.
Run either:
conda install -c conda-forge -c bioconda -c defaults verkko
or build from source
curl -L https://github.com/marbl/verkko/releases/download/v1.4/verkko-v1.4.tar.gz --output verkko-v1.4.tar.gz
tar -xzf verkko-v1.4.tar.gz
cd verkko-v1.4/src
make -j 8
cd ..
Confirm that the MD5 checksum of the tar.gz matches the expected value:
6cc9374d06c5150bf055c5667d81592e verkko-v1.4.tar.gz
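The checksum comparison can be scripted. The following is a minimal sketch, assuming GNU coreutils `md5sum` is available (macOS ships `md5` instead); `check_md5` is a hypothetical helper name and not part of Verkko.

```shell
# check_md5 FILE EXPECTED: succeed only if FILE's MD5 equals EXPECTED.
# check_md5 is a hypothetical helper, not something shipped with Verkko.
check_md5() {
    # md5sum prints "<hash>  <file>"; keep only the hash field.
    actual=$(md5sum "$1" | awk '{print $1}')
    if [ "$actual" = "$2" ]; then
        echo "OK: $1"
    else
        echo "MISMATCH: $1 (got $actual)" >&2
        return 1
    fi
}

# Run from the directory that holds the downloaded tarball.
if [ -f verkko-v1.4.tar.gz ]; then
    check_md5 verkko-v1.4.tar.gz 6cc9374d06c5150bf055c5667d81592e
fi
```

A mismatch usually means an incomplete download, so fetching the tarball again is a reasonable first step.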
Verkko will be installed in verkko-v1.4/bin. You can move the contents of verkko-v1.4/bin/* and verkko-v1.4/lib/* to a central location if you would like. If GraphAligner, MBG, Winnowmap, or MashMap are not available in your PATH, you may also symlink them under verkko/lib/verkko/bin/
See the README for more details.
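To see which of the external tools still need a symlink, a quick PATH check can help. This is a rough sketch: `missing_tools` is a hypothetical helper, and the executable names passed to it are assumptions that may differ on your system.

```shell
# Print every tool name from the argument list that is not found in PATH.
# missing_tools is a hypothetical helper, not part of Verkko.
missing_tools() {
    for tool in "$@"; do
        # command -v exits non-zero when the name cannot be resolved.
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Executable names are assumptions; adjust them to match your installation.
missing_tools GraphAligner MBG winnowmap mashmap
```

Any name it prints is a candidate for a symlink under verkko/lib/verkko/bin/, as described above.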
Changes
- Beta support for Hi-C phasing integration, see the --hic1 and --hic2 options in the README and command line help for details.
- Improved performance on low-coverage (<50x HiFi/ONT) samples (lower switch/hamming error rates, more T2T contigs and scaffolds)
- Added --haploid option to more aggressively pop bubbles on samples which are expected to be homozygous.
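As an illustration of the new options listed above, the sketch below prints a Hi-C command line instead of running it. The `hic_cmd` helper and all file names are hypothetical placeholders, and the option layout is an assumption based on the README usage; check `verkko --help` for the exact syntax.

```shell
# Print (rather than run) a Verkko command line that uses the Hi-C options.
# hic_cmd is a hypothetical helper; it only echoes the command for review.
hic_cmd() {
    echo verkko -d "$1" --hifi "$2" --nano "$3" --hic1 "$4" --hic2 "$5"
}

# Placeholder work directory and read files, not data shipped with Verkko.
hic_cmd asm hifi.fastq.gz ont.fastq.gz hic_R1.fastq.gz hic_R2.fastq.gz
```

For a sample expected to be homozygous, --haploid would be added to a regular run in the same way.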
Bug Fixes
- Avoid losing low-coverage long nodes when simplifying the graph
- Various MBG crashes (#126, #145, #130, #131, #146)
- Various pipeline errors (#147, #141)
Known Issues
See the issues page for up-to-date open issues, or to report a problem.
- Long runtime of MBG with very high HiFi coverage (>200x). We recommend downsampling to 100x.
- Lost heterozygosity in simple-sequence repeats in low-heterozygosity samples. When there is no other variation within at most 1 HiFi read length away, the simple sequence repeat difference will be ignored and a consensus of both haplotypes is produced. This will be addressed in a future release.
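For the downsampling recommendation above, the fraction of reads to keep is the target coverage divided by the current coverage. A small sketch of that arithmetic follows; `keep_fraction` is a hypothetical helper, the coverage values are illustrative, and the choice of subsampling tool is left to you.

```shell
# Fraction of reads to keep when going from CURRENT-x to TARGET-x coverage.
# keep_fraction is a hypothetical helper; usage: keep_fraction CURRENT TARGET
keep_fraction() {
    awk -v cur="$1" -v tgt="$2" 'BEGIN {
        f = (cur > tgt) ? tgt / cur : 1   # never keep more than all reads
        printf "%.2f\n", f
    }'
}

keep_fraction 250 100   # prints 0.40
```

The printed value can be passed as the sampling fraction to whichever read subsampling tool you use.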
Legal
See the README.licenses file and individual source code files for details.