Updated usage doc

Plant-Food-Research-Open · Mar 3, 2024 · 36f102f
1 parent a4c6490
Showing 4 changed files with 26 additions and 13 deletions.
diff --git a/bin/report_modules/templates/kraken2/kraken2.html b/bin/report_modules/templates/kraken2/kraken2.html
@@ -1,8 +1,7 @@
 <div id="KRAKEN2" class="tabcontent" style="display: none">
     <div class="section-para-wrapper">
         <p class="section-para">
-            Kraken2 assigns taxonomic labels to sequencing reads for metagenomics projects. It can also be used to
-            detect contamination in genome assemblies.
+            Kraken2 assigns taxonomic labels to sequencing reads for metagenomics projects.
         </p>
         <p class="section-para"><b>Reference:</b></p>
         <p class="section-para">

diff --git a/docs/images/kraken2.jpg b/docs/images/kraken2.jpg
diff --git a/docs/output.md b/docs/output.md
@@ -2,7 +2,7 @@
 
 ## Introduction
 
-This document describes the output produced by the pipeline. Most of the plots are taken from the AssemblyQC report, which summarises results at the end of the pipeline.
+This document describes the output produced by the pipeline. Most of the plots are taken from the AssemblyQC report which summarises results at the end of the pipeline.
 
 The directories listed below will be created in the results directory after the pipeline has finished. All paths are relative to the top-level results directory.
 
@@ -79,8 +79,8 @@ GenomeTools `gt stat` tool calculates a basic set of statistics about features c
   - `*.taxonomy.rpt`: [Taxonomy report](https://github.com/ncbi/fcs/wiki/FCS-GX-taxonomy-report#taxonomy-report-output-).
   - `*.fcs_gx_report.txt`: A final report of [recommended actions](https://github.com/ncbi/fcs/wiki/FCS-GX#outputs).
   - `*.inter.tax.rpt.tsv`: [Select columns](../modules/local/ncbi_fcs_gx_krona_plot.nf) from `*.taxonomy.rpt` used for generation of a Krona taxonomy plot.
-  - `*.fcs.gx.krona.cut`: Krona taxonomy file [created](../modules/local/ncbi_fcs_gx_krona_plot.nf) from `*.inter.tax.rpt.tsv`.
-  - `*.fcs.gx.krona.html`: Krona taxonomy plot.
+  - `*.fcs.gx.krona.cut`: Taxonomy file for Krona plot [created](../modules/local/ncbi_fcs_gx_krona_plot.nf) from `*.inter.tax.rpt.tsv`.
+  - `*.fcs.gx.krona.html`: Interactive Krona taxonomy plot.
 
 </details>
 
@@ -139,6 +139,21 @@ LTR Assembly Index (LAI) is a reference-free genome metric that [evaluates assem
 
 ### Kraken2
 
+<details markdown="1">
+<summary>Output files</summary>
+
+- `kraken2/`
+  - `*.kraken2.report`: [Kraken2 report](https://github.com/DerrickWood/kraken2/wiki/Manual#output-formats).
+  - `*.kraken2.cut`: [Kraken2 output](https://github.com/DerrickWood/kraken2/wiki/Manual#output-formats).
+  - `*.kraken2.krona.cut`: [Select columns](../modules/local/kraken2_krona_plot.nf) from `*.kraken2.cut` used for generation of a Krona taxonomy plot.
+  - `*.kraken2.krona.html`: Interactive Krona taxonomy plot.
+
+</details>
+
+Kraken2 [assigns taxonomic labels](https://ccb.jhu.edu/software/kraken2/) to sequencing reads for metagenomics projects.
+
+<div align="center"><img src="images/kraken2.jpg" alt="AssemblyQC - Interactive Krona plot from Kraken2 taxonomy" width="50%"><hr><em>AssemblyQC - Interactive Krona plot from Kraken2 taxonomy</em></div>
+
 ### HiC contact map
 
 <details markdown="1">

diff --git a/docs/usage.md b/docs/usage.md
@@ -8,7 +8,7 @@ You will need to create an assemblysheet with information about the assemblies y
 - `fasta:` FASTA file
 - `gff3 [Optional]:` GFF3 annotation file if available
 - `monoploid_ids [Optional]:` A txt file listing the IDs used to calculate LAI in monoploid mode if necessary
-- `synteny_labels [Optional]:` A two column tsv file listing fasta sequence ids (first column) and labels for the synteny plots (second column) when performing synteny analysis
+- `synteny_labels [Optional]:` A two column tsv file listing fasta sequence ids (first column) and their labels for the synteny plots (second column) when performing synteny analysis
 
 ## External databases
 
@@ -40,7 +40,7 @@ BUSCO lineage databases are downloaded and updated by the BUSCO tool itself. A p
 
 ### Assemblathon stats
 
-`assemblathon_stats_n_limit` is the number of 'N's for the unknown gap size. This number is used to split the scaffolds into contigs to compute contig-related stats. NCBI's recommendation for unknown gap size is 100 <https://www.ncbi.nlm.nih.gov/genbank/>.
+`assemblathon_stats_n_limit` is the number of 'N's for the unknown gap size. This number is used to split the scaffolds into contigs to compute contig-related stats. NCBI's recommendation for unknown gap size is 100 <https://www.ncbi.nlm.nih.gov/genbank/wgs_gapped/>.
 
 ### NCBI FCS adaptor
 
@@ -64,8 +64,8 @@ BUSCO lineage databases are downloaded and updated by the BUSCO tool itself. A p
 ### HiC
 
 - `hic`: Path to reads provided as a SRA ID or as a path to paired reads with pattern '\*{1,2}.(fastq|fq).gz'
-- `hic_skip_fastp`: Skips fastp trimming
-- `hic_skip_fastqc`: Skips QC by fastqc
+- `hic_skip_fastp`: Skip fastp trimming
+- `hic_skip_fastqc`: Skip QC by fastqc
 - `hic_fastp_ext_args`: Additional arguments for fastp (default: '--qualified_quality_phred 20 --length_required 50')
 
 ### Synteny analysis
@@ -79,7 +79,7 @@ BUSCO lineage databases are downloaded and updated by the BUSCO tool itself. A p
 - `synteny_xref_assemblies`: Similar to `--input`, this parameter also provides a CSV sheet listing external reference assemblies which are included in the synteny analysis but are not analysed by other QC tools. See the [example xrefsheet](../assets/xrefsheet.csv) included with the pipeline. Its fields are:
   - `tag:` A unique tag which represents the reference assembly in the final report
   - `fasta:` FASTA file
-  - `synteny_labels:` A two column tsv file listing fasta sequence ids (first column) and labels for the synteny plots (second column)
+  - `synteny_labels:` A two column tsv file listing fasta sequence ids (first column) and their labels for the synteny plots (second column)
 
 ## Running the pipeline
 
@@ -116,9 +116,8 @@ nextflow run plant-food-research-open/assemblyqc -profile docker -params-file pa
 with `params.yaml` containing:
 
 ```yaml
-input: './samplesheet.csv'
-outdir: './results/'
-<...>
+input: "./assemblysheet.csv"
+outdir: "./results/"
 ```
 
 You can also generate such `YAML`/`JSON` files via [nf-core/launch](https://nf-co.re/launch).
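
Note on the `assemblathon_stats_n_limit` text in the `docs/usage.md` hunk above: the doc says this number is used to split scaffolds into contigs. A minimal Python sketch of that idea follows. It assumes that a run of at least `n_limit` consecutive `N` characters is treated as a gap; the function name `split_scaffold` and the toy sequence are hypothetical and are not taken from the pipeline or from the Assemblathon script.

```python
import re


def split_scaffold(sequence: str, n_limit: int = 100) -> list[str]:
    """Split a scaffold into contigs at runs of N at least n_limit long.

    Assumption: shorter runs of N are kept inside the contig.
    """
    # Build a pattern such as [Nn]{100,} for n_limit=100.
    gap = re.compile(rf"[Nn]{{{n_limit},}}")
    # Drop empty strings produced by gaps at the very start or end.
    return [contig for contig in gap.split(sequence) if contig]


# Toy scaffold: one gap of 100 N's and one short run of 10 N's.
scaffold = "ACGT" + "N" * 100 + "GGCC" + "N" * 10 + "TTAA"

contigs = split_scaffold(scaffold, n_limit=100)
strict_contigs = split_scaffold(scaffold, n_limit=10)
```

With `n_limit=100` only the 100-N run counts as a gap, so the 10-N run stays inside the second contig; lowering the limit to 10 splits at both runs. This is why the choice of limit changes contig-level statistics while leaving scaffold-level statistics untouched.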