
otb `results` structure

David Molik edited this page Nov 18, 2021 · 9 revisions

Knowing Your Results

When otb is run, a sub-directory called results is created in the main directory. It has the following directory structure:

results/
├── busco_no_polish
├── busco_polish
├── filtering
├── genome
│   └── log
├── genomescope
│   └── Genome Name
└── software_versions

In these directories you will find the following:

results/
├── busco_no_polish <- busco results from before any polishing
├── busco_polish <- busco results from after any polishing
├── filtering <- results of bam filtering
├── genome <- where the genome will be
│   └── log <- the log of any genome tools
├── genomescope <- genomescope results
│   └── Genome Name <- subdirectory in genomescope holding plots and tables
└── software_versions <- the versions of all the software used

An example of such output might look like the following:

results/
├── busco_no_polish
├── busco_polish
├── filtering
│   ├── bam_check.log.txt
│   ├── fastq_check.log.txt
│   └── filtering_information.log.txt
├── genome
│   ├── left.fastq.gz.stats
│   ├── log
│   │   ├── gfa2fasta.log.txt
│   │   └── HiFiASM.log.txt
│   ├── Neodiprion_virginianus_male.bp.hap1.p_ctg.gfa.fasta -> ../../work/91/ecebcbc6fa09b44db3c3c923b363b4/Neodiprion_virginianus_male.bp.hap1.p_ctg.gfa.fasta
│   ├── Neodiprion_virginianus_male.bp.hap1.p_ctg.gfa.fasta.stats
│   ├── Neodiprion_virginianus_male.bp.hap2.p_ctg.gfa.fasta -> ../../work/4e/d0823000e32395544085af3bb64474/Neodiprion_virginianus_male.bp.hap2.p_ctg.gfa.fasta
│   ├── Neodiprion_virginianus_male.bp.p_ctg.gfa.fasta -> ../../work/2a/6ab4e7a893388680985f4bffb527d3/Neodiprion_virginianus_male.bp.p_ctg.gfa.fasta
│   └── right.fastq.gz.stats
├── genomescope
│   ├── genomescope2.log.txt
│   ├── jellyfish.log.txt
│   ├── kcov.txt -> ../../work/02/e0b42ea3bfe40a8fc06a9d55d14f17/kcov.txt
│   ├── Neodiprion_virginianus_male
│   │   ├── fitted_hist.png -> ../../../work/02/e0b42ea3bfe40a8fc06a9d55d14f17/Neodiprion_virginianus_male/fitted_hist.png
│   │   ├── linear_plot.png -> ../../../work/02/e0b42ea3bfe40a8fc06a9d55d14f17/Neodiprion_virginianus_male/linear_plot.png
│   │   ├── log_plot.png -> ../../../work/02/e0b42ea3bfe40a8fc06a9d55d14f17/Neodiprion_virginianus_male/log_plot.png
│   │   ├── lookup_table.txt -> ../../../work/02/e0b42ea3bfe40a8fc06a9d55d14f17/Neodiprion_virginianus_male/lookup_table.txt
│   │   ├── model.txt -> ../../../work/02/e0b42ea3bfe40a8fc06a9d55d14f17/Neodiprion_virginianus_male/model.txt
│   │   ├── progress.txt -> ../../../work/02/e0b42ea3bfe40a8fc06a9d55d14f17/Neodiprion_virginianus_male/progress.txt
│   │   ├── summary.txt -> ../../../work/02/e0b42ea3bfe40a8fc06a9d55d14f17/Neodiprion_virginianus_male/summary.txt
│   │   ├── transformed_linear_plot.png -> ../../../work/02/e0b42ea3bfe40a8fc06a9d55d14f17/Neodiprion_virginianus_male/transformed_linear_plot.png
│   │   └── transformed_log_plot.png -> ../../../work/02/e0b42ea3bfe40a8fc06a9d55d14f17/Neodiprion_virginianus_male/transformed_log_plot.png
│   └── version.txt -> ../../work/02/e0b42ea3bfe40a8fc06a9d55d14f17/version.txt
└── software_versions
    ├── any2fasta_version.txt
    ├── bbtools_version.txt
    ├── bcftools_version.txt
    ├── busco_version.txt
    ├── genomescope_version.txt
    ├── hicstuff_version.txt
    ├── hifiasm_version.txt
    ├── jellyfish_version.txt
    ├── pbadapterfilt_version.txt
    ├── ragtag_version.txt
    ├── samtools_version.txt
    └── shhquis_version.txt

Note that many of the results are actually symbolic links into the work directory. This is done to save space.
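Because these links point into work, removing or moving that directory would leave them dangling. As a minimal sketch (not part of otb itself), the following Python snippet lists every symbolic link under results together with its target and whether the target still exists. The function name list_result_links and the default path "results" are assumptions made for illustration.

```python
import os
from pathlib import Path


def list_result_links(results_dir):
    """Return (link, target, exists) for every symlink under results_dir."""
    links = []
    # os.walk does not descend into symlinked directories by default,
    # but it still reports them, so both dirs and files are checked.
    for root, dirs, files in os.walk(results_dir):
        for name in dirs + files:
            path = Path(root) / name
            if path.is_symlink():
                target = os.readlink(path)
                # Path.exists() follows the link; False means it is dangling.
                links.append((path, target, path.exists()))
    return sorted(links)


if __name__ == "__main__":
    for link, target, ok in list_result_links("results"):
        status = "ok" if ok else "BROKEN"
        print(f"{status}\t{link} -> {target}")
```

Running this before cleaning up work is one way to see which results would need to be copied rather than left as links.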

In software_versions, each tool has its own version file containing information on the version of that tool used. In this case, Neodiprion_virginianus_male.bp.p_ctg.gfa.fasta is the final genome. If polishing is run, the prefix 'polished' is used and the genome will be in genome.out.fasta; otherwise, the workflow finishes at p_ctg.gfa.fasta.
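The layout described above can be read programmatically. The sketch below is not part of otb: it collects the contents of each version file into a dictionary and looks for the final genome, preferring a file ending in genome.out.fasta and falling back to the primary contigs file. The function names, the assumption that both genome files sit somewhere under results/genome, and the glob patterns (taken from the example listing) are illustrative guesses.

```python
from pathlib import Path


def read_versions(results_dir):
    """Map tool name -> contents of its <tool>_version.txt file."""
    versions = {}
    version_dir = Path(results_dir) / "software_versions"
    for path in sorted(version_dir.glob("*_version.txt")):
        tool = path.name[: -len("_version.txt")]
        versions[tool] = path.read_text().strip()
    return versions


def find_final_genome(results_dir):
    """Return the polished genome if one is found, else the primary contigs.

    Assumption: both files live somewhere under results/genome.
    """
    genome_dir = Path(results_dir) / "genome"
    # Patterns are tried in order: polished output first, then p_ctg.
    for pattern in ("*genome.out.fasta", "*.bp.p_ctg.gfa.fasta"):
        matches = sorted(genome_dir.rglob(pattern))
        if matches:
            return matches[0]
    return None


if __name__ == "__main__":
    for tool, version in read_versions("results").items():
        print(f"{tool}: {version}")
    print("final genome:", find_final_genome("results"))
```

The second pattern deliberately excludes the hap1 and hap2 files and the .stats files, so only the primary assembly is returned when no polished genome is present.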