otb `results` structure

David Molik edited this page Nov 4, 2022 · 9 revisions

Getting to Know Your Results

When otb is run, a subdirectory called results is created in the main directory. It has the following directory structure:

results/
├── 00_ordination
│   ├── genomescope
│   └── log
│       ├── filtering
│       └── genomescope
├── 01_hifiasm
│   ├── busco
│   └── log
├── 02_hicstuff
│   ├── hicstuff_out
│   │   └── plots
│   └── log
├── 03_polish
│   └── log
├── 04_yahs
│   └── log
├── 05_yahs_on_polish
│   └── log
└── software_versions

Each numeric directory corresponds to a step in the otb pipeline:

  • 00_ordination holds logs and outputs from getting the data ready; importantly, the genomescope outputs are in this directory
  • 01_hifiasm holds outputs from running hifiasm, as well as busco for hifiasm
  • 02_hicstuff holds outputs from running hicstuff
  • 03_polish holds outputs from any polishing, normally variant reduction and scaffold combinations where possible
  • 04_yahs holds outputs from yahs run on the outputs of 01 and 02, the unpolished versions
  • 05_yahs_on_polish holds outputs from yahs on polished outputs
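
To see which of these steps produced a directory in a given run, a small check like the following may help. It is only a sketch: the function name present_steps is made up for illustration, and it assumes the results directory uses exactly the step names listed above.

```python
from pathlib import Path

# Step directories as listed on this page.
EXPECTED_STEPS = [
    "00_ordination",
    "01_hifiasm",
    "02_hicstuff",
    "03_polish",
    "04_yahs",
    "05_yahs_on_polish",
]


def present_steps(results_dir):
    """Map each expected step directory name to True if it exists.

    present_steps is a hypothetical helper, not part of otb.
    """
    root = Path(results_dir)
    return {name: (root / name).is_dir() for name in EXPECTED_STEPS}
```

Calling present_steps("results") from the main directory returns a dictionary such as {"00_ordination": True, ...}, which makes it easy to spot a step that did not run.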

Many of the results are links rather than copies; this is done to save space.
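
Because a link stops working if the file it points to is moved or deleted, it can be useful to list the links before copying or archiving results. The sketch below assumes the links are symbolic links (this page does not say which kind is used); find_links is a hypothetical helper name.

```python
from pathlib import Path


def find_links(results_dir):
    """Map each symbolic link under results_dir to the path it resolves to.

    Assumption: the links in results are symbolic links. Hard links
    would not be reported by this function.
    """
    root = Path(results_dir)
    return {p: p.resolve() for p in root.rglob("*") if p.is_symlink()}
```

For example, iterating over find_links("results").items() and printing each pair shows where the real files live, so they can be copied along with the results directory if needed.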

In software_versions, each tool has its own version file containing information on the version of that tool used.

Neodiprion_virginianus_male.bp.p_ctg.gfa.fasta is the final genome in this case. When polishing is run, the prefix 'polished' is used and the genome will be in genome.out.fasta; otherwise the workflow finishes at p_ctg.gfa.fasta.
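
The rule for locating the final genome can be sketched as a small search. This is an illustration under assumptions: find_final_genome is a hypothetical helper, and it assumes the file names end in genome.out.fasta (polished) or p_ctg.gfa.fasta (unpolished) and may sit anywhere under results.

```python
from pathlib import Path


def find_final_genome(results_dir):
    """Return candidate final-genome files, preferring polished output.

    Assumption: polished genomes have names ending in 'genome.out.fasta'
    and unpolished ones end in 'p_ctg.gfa.fasta'.
    """
    root = Path(results_dir)
    # is_file() follows links, so linked results are included.
    files = [p for p in root.rglob("*") if p.is_file()]
    polished = sorted(p for p in files if p.name.endswith("genome.out.fasta"))
    if polished:
        return polished
    # No polished genome found: fall back to the hifiasm contig output.
    return sorted(p for p in files if p.name.endswith("p_ctg.gfa.fasta"))
```

The function returns a list because more than one file may match; an empty list means neither kind of file was found.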