Commit

Updated usage doc
GallVp committed Mar 3, 2024
1 parent a4c6490 commit 36f102f
Showing 4 changed files with 26 additions and 13 deletions.
3 changes: 1 addition & 2 deletions bin/report_modules/templates/kraken2/kraken2.html
@@ -1,8 +1,7 @@
<div id="KRAKEN2" class="tabcontent" style="display: none">
<div class="section-para-wrapper">
<p class="section-para">
Kraken2 assigns taxonomic labels to sequencing reads for metagenomics projects. It can also be used to
detect contamination in genome assemblies.
Kraken2 assigns taxonomic labels to sequencing reads for metagenomics projects.
</p>
<p class="section-para"><b>Reference:</b></p>
<p class="section-para">
Binary file added docs/images/kraken2.jpg
21 changes: 18 additions & 3 deletions docs/output.md
@@ -2,7 +2,7 @@

## Introduction

This document describes the output produced by the pipeline. Most of the plots are taken from the AssemblyQC report, which summarises results at the end of the pipeline.
This document describes the output produced by the pipeline. Most of the plots are taken from the AssemblyQC report which summarises results at the end of the pipeline.

The directories listed below will be created in the results directory after the pipeline has finished. All paths are relative to the top-level results directory.

@@ -79,8 +79,8 @@ GenomeTools `gt stat` tool calculates a basic set of statistics about features c
- `*.taxonomy.rpt`: [Taxonomy report](https://github.com/ncbi/fcs/wiki/FCS-GX-taxonomy-report#taxonomy-report-output-).
- `*.fcs_gx_report.txt`: A final report of [recommended actions](https://github.com/ncbi/fcs/wiki/FCS-GX#outputs).
- `*.inter.tax.rpt.tsv`: [Select columns](../modules/local/ncbi_fcs_gx_krona_plot.nf) from `*.taxonomy.rpt` used for generation of a Krona taxonomy plot.
- `*.fcs.gx.krona.cut`: Krona taxonomy file [created](../modules/local/ncbi_fcs_gx_krona_plot.nf) from `*.inter.tax.rpt.tsv`.
- `*.fcs.gx.krona.html`: Krona taxonomy plot.
- `*.fcs.gx.krona.cut`: Taxonomy file for Krona plot [created](../modules/local/ncbi_fcs_gx_krona_plot.nf) from `*.inter.tax.rpt.tsv`.
- `*.fcs.gx.krona.html`: Interactive Krona taxonomy plot.

</details>

@@ -139,6 +139,21 @@ LTR Assembly Index (LAI) is a reference-free genome metric that [evaluates assem

### Kraken2

<details markdown="1">
<summary>Output files</summary>

- `kraken2/`
- `*.kraken2.report`: [Kraken2 report](https://github.com/DerrickWood/kraken2/wiki/Manual#output-formats).
- `*.kraken2.cut`: [Kraken2 output](https://github.com/DerrickWood/kraken2/wiki/Manual#output-formats).
- `*.kraken2.krona.cut`: [Select columns](../modules/local/kraken2_krona_plot.nf) from `*.kraken2.cut` used for generation of a Krona taxonomy plot.
- `*.kraken2.krona.html`: Interactive Krona taxonomy plot.

</details>

Kraken2 [assigns taxonomic labels](https://ccb.jhu.edu/software/kraken2/) to sequencing reads for metagenomics projects.

<div align="center"><img src="images/kraken2.jpg" alt="AssemblyQC - Interactive Krona plot from Kraken2 taxonomy" width="50%"><hr><em>AssemblyQC - Interactive Krona plot from Kraken2 taxonomy</em></div>

### HiC contact map

<details markdown="1">
15 changes: 7 additions & 8 deletions docs/usage.md
@@ -8,7 +8,7 @@ You will need to create an assemblysheet with information about the assemblies y
- `fasta:` FASTA file
- `gff3 [Optional]:` GFF3 annotation file if available
- `monoploid_ids [Optional]:` A txt file listing the IDs used to calculate LAI in monoploid mode if necessary
- `synteny_labels [Optional]:` A two column tsv file listing fasta sequence ids (first column) and labels for the synteny plots (second column) when performing synteny analysis
- `synteny_labels [Optional]:` A two column tsv file listing fasta sequence ids (first column) and their labels for the synteny plots (second column) when performing synteny analysis
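
As an illustration of the `synteny_labels` layout described above, the following is a minimal Python sketch that reads a two-column, tab-separated file into a mapping from sequence id to label. The function name `read_synteny_labels`, the sequence ids, and the labels are hypothetical, and the duplicate-id check is an assumption rather than documented pipeline behaviour.

```python
import csv
import io


def read_synteny_labels(handle):
    """Return {sequence_id: label} from a two-column, tab-separated file."""
    labels = {}
    for line_number, row in enumerate(csv.reader(handle, delimiter="\t"), start=1):
        if not row or all(not cell.strip() for cell in row):
            continue  # skip blank lines
        if len(row) != 2:
            raise ValueError(
                f"line {line_number}: expected 2 columns, found {len(row)}"
            )
        sequence_id, label = (cell.strip() for cell in row)
        if sequence_id in labels:
            # Assumption: each FASTA sequence id should appear only once.
            raise ValueError(
                f"line {line_number}: duplicate sequence id {sequence_id!r}"
            )
        labels[sequence_id] = label
    return labels


# Hypothetical file content: FASTA sequence id, then the label shown in the plot.
example = io.StringIO("CP031385.1\tChr1\nCP031386.1\tChr2\n")
labels = read_synteny_labels(example)
```

Checking a labels file in this way before launching a run may catch stray columns or repeated ids early.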

## External databases

@@ -40,7 +40,7 @@ BUSCO lineage databases are downloaded and updated by the BUSCO tool itself. A p

### Assemblathon stats

`assemblathon_stats_n_limit` is the number of 'N's for the unknown gap size. This number is used to split the scaffolds into contigs to compute contig-related stats. NCBI's recommendation for unknown gap size is 100 <https://www.ncbi.nlm.nih.gov/genbank/>.
`assemblathon_stats_n_limit` is the number of 'N's for the unknown gap size. This number is used to split the scaffolds into contigs to compute contig-related stats. NCBI's recommendation for unknown gap size is 100 <https://www.ncbi.nlm.nih.gov/genbank/wgs_gapped/>.
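
The splitting step described above can be sketched in Python as follows. This is an illustration only, not the code the pipeline runs: the function name `split_scaffold` is hypothetical, and treating a run of *at least* `n_limit` 'N's (in either case) as a gap is an assumption about how the limit is applied.

```python
import re


def split_scaffold(sequence, n_limit=100):
    """Split a scaffold into contigs at runs of at least n_limit 'N' characters."""
    # Assumption: shorter runs of N stay inside the contig they occur in.
    gap = re.compile(rf"[Nn]{{{n_limit},}}")
    return [contig for contig in gap.split(sequence) if contig]


# Toy scaffold with a 3-N run (kept) and a 5-N run (treated as a gap at n_limit=5).
scaffold = "ACGT" + "N" * 3 + "GGCC" + "N" * 5 + "TTAA"
contigs = split_scaffold(scaffold, n_limit=5)
```

With `n_limit=5` the toy scaffold yields two contigs, `ACGTNNNGGCC` and `TTAA`, which shows why the chosen limit directly affects contig-level statistics such as contig count and contig N50.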

### NCBI FCS adaptor

@@ -64,8 +64,8 @@ BUSCO lineage databases are downloaded and updated by the BUSCO tool itself. A p
### HiC

- `hic`: Path to reads provided as a SRA ID or as a path to paired reads with pattern '\*{1,2}.(fastq|fq).gz'
- `hic_skip_fastp`: Skips fastp trimming
- `hic_skip_fastqc`: Skips QC by fastqc
- `hic_skip_fastp`: Skip fastp trimming
- `hic_skip_fastqc`: Skip QC by fastqc
- `hic_fastp_ext_args`: Additional arguments for fastp (default: '--qualified_quality_phred 20 --length_required 50')
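
To illustrate which file names the paired-reads pattern for `hic` is meant to cover, here is a small Python sketch. It assumes the pattern reads as "any prefix, then `1` or `2`, then `.fastq.gz` or `.fq.gz`"; the regular expression, the function name `pair_reads`, and the file names are hypothetical, and the SRA ID form of the parameter is not handled.

```python
import re

# Assumed reading of '*{1,2}.(fastq|fq).gz': prefix, mate number, extension.
PAIRED_READS = re.compile(r"^(?P<sample>.*)(?P<mate>[12])\.(?:fastq|fq)\.gz$")


def pair_reads(file_names):
    """Group file names into {prefix: (mate 1 file, mate 2 file)}."""
    mates_by_prefix = {}
    for name in sorted(file_names):
        match = PAIRED_READS.match(name)
        if match is None:
            continue  # not a read file under the assumed pattern
        mates = mates_by_prefix.setdefault(match.group("sample"), {})
        mates[match.group("mate")] = name
    # Keep only prefixes for which both mates were found.
    return {
        prefix: (mates["1"], mates["2"])
        for prefix, mates in mates_by_prefix.items()
        if set(mates) == {"1", "2"}
    }


pairs = pair_reads(["hic_R1.fastq.gz", "hic_R2.fastq.gz", "notes.txt", "solo_1.fq.gz"])
```

Here only the `hic_R` prefix has both mates, so `solo_1.fq.gz` and `notes.txt` are left out of the result.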

### Synteny analysis
@@ -79,7 +79,7 @@ BUSCO lineage databases are downloaded and updated by the BUSCO tool itself. A p
- `synteny_xref_assemblies`: Similar to `--input`, this parameter also provides a CSV sheet listing external reference assemblies which are included in the synteny analysis but are not analysed by other QC tools. See the [example xrefsheet](../assets/xrefsheet.csv) included with the pipeline. Its fields are:
- `tag:` A unique tag which represents the reference assembly in the final report
- `fasta:` FASTA file
- `synteny_labels:` A two column tsv file listing fasta sequence ids (first column) and labels for the synteny plots (second column)
- `synteny_labels:` A two column tsv file listing fasta sequence ids (first column) and their labels for the synteny plots (second column)

## Running the pipeline

@@ -116,9 +116,8 @@ nextflow run plant-food-research-open/assemblyqc -profile docker -params-file pa
with `params.yaml` containing:

```yaml
input: './samplesheet.csv'
outdir: './results/'
<...>
input: "./assemblysheet.csv"
outdir: "./results/"
```
You can also generate such `YAML`/`JSON` files via [nf-core/launch](https://nf-co.re/launch).
