diff --git a/docs/output.md b/docs/output.md
index 27c70b9..76fa62d 100644
--- a/docs/output.md
+++ b/docs/output.md
@@ -41,7 +41,7 @@ The pipeline is built using [Nextflow](https://www.nextflow.io/) and processes d
 - `/`
   - `fastq/`
     - `trimmed_nanofilt/`
-      - `*_filtered.fastq.gz`
+      - `*_filtered.fastq.gz`: The post-trimming FASTQ. By default, this is mostly quality-trimmed.
@@ -54,10 +54,10 @@ The pipeline is built using [Nextflow](https://www.nextflow.io/) and processes d
 - `/`
   - `blaze/`
-    - `blaze/*.bc_count.txt`
-    - `blaze/*.knee_plot.png`
-    - `blaze/*.putative_bc.csv`
-    - `blaze/*.whitelist.csv`
+    - `blaze/*.bc_count.txt`: A file containing each barcode and the number of reads that support it.
+    - `blaze/*.knee_plot.png`: The knee plot showing the ranking of each barcode.
+    - `blaze/*.putative_bc.csv`: A file containing the naively detected barcode for each read.
+    - `blaze/*.whitelist.csv`: The list of "true" barcodes detected for the dataset.
@@ -74,8 +74,8 @@ The pipeline is built using [Nextflow](https://www.nextflow.io/) and processes d
 - `/`
   - `bam/`
     - `original/`
-      - `*.sorted.bam`
-      - `*.sorted.bam.bai`
+      - `*.sorted.bam`: The mapped and sorted BAM.
+      - `*.sorted.bam.bai`: The BAM index for the mapped and sorted BAM.
 [Minimap2](https://github.com/lh3/minimap2) is a versatile sequence alignment program that aligns DNA or mRNA sequences against a large reference database. Minimap2 is optimized for large, noisy reads making it a staple for alignment of nanopore reads
@@ -88,26 +88,26 @@ The pipeline is built using [Nextflow](https://www.nextflow.io/) and processes d
 - `/`
   - `bam/`
     - `mapped_only/`
-      - `*.sorted.bam`
-      - `*.sorted.bam.bai`
+      - `*.sorted.bam`: The BAM containing only reads that could be mapped.
+      - `*.sorted.bam.bai`: The BAM index for the BAM containing only reads that could be mapped.
   - `qc/`
     - `samtools/`
       - `minimap/`
-        - `*.minimap.flagstat`
-        - `*.minimap.idxstats`
-        - `*.minimap.stats`
+        - `*.minimap.flagstat`: The flagstat file for the BAM obtained from Minimap2.
+        - `*.minimap.idxstats`: The idxstats file for the BAM obtained from Minimap2.
+        - `*.minimap.stats`: The stats file for the BAM obtained from Minimap2.
       - `mapped_only/`
-        - `*.mapped_only.flagstat`
-        - `*.mapped_only.idxstats`
-        - `*.mapped_only.stats`
+        - `*.mapped_only.flagstat`: The flagstat file for the BAM containing only mapped reads.
+        - `*.mapped_only.idxstats`: The idxstats file for the BAM containing only mapped reads.
+        - `*.mapped_only.stats`: The stats file for the BAM containing only mapped reads.
       - `corrected/`
-        - `*.corrected.flagstat`
-        - `*.corrected.idxstats`
-        - `*.corrected.stats`
+        - `*.corrected.flagstat`: The flagstat file for the BAM containing corrected barcodes.
+        - `*.corrected.idxstats`: The idxstats file for the BAM containing corrected barcodes.
+        - `*.corrected.stats`: The stats file for the BAM containing corrected barcodes.
       - `dedup/`
-        - `*.dedup.flagstat`
-        - `*.dedup.idxstats`
-        - `*.dedup.stats`
+        - `*.dedup.flagstat`: The flagstat file for the BAM containing deduplicated UMIs.
+        - `*.dedup.idxstats`: The idxstats file for the BAM containing deduplicated UMIs.
+        - `*.dedup.stats`: The stats file for the BAM containing deduplicated UMIs.
 ![MultiQC - samtools idxstats](images/samtools_idxstats.png)
@@ -122,8 +122,8 @@ The pipeline is built using [Nextflow](https://www.nextflow.io/) and processes d
 - `/`
   - `bam/`
     - `corrected/`
-      - `*.corrected.bam`
-      - `*.corected.bam.bai`
+      - `*.corrected.bam`: The BAM containing corrected barcodes.
+      - `*.corrected.bam.bai`: The BAM index for the BAM containing corrected barcodes.
@@ -136,8 +136,8 @@ Barcode correction is a custom script that uses the whitelist generated by BLAZE
 - `/`
   - `bam/`
     - `dedup/`
-      - `*.dedup.bam`
-      - `*.dedup.bam.bai`
+      - `*.dedup.bam`: The BAM containing corrected barcodes and deduplicated UMIs.
+      - `*.dedup.bam.bai`: The BAM index for the BAM containing corrected barcodes and deduplicated UMIs.
@@ -152,8 +152,8 @@ Barcode correction is a custom script that uses the whitelist generated by BLAZE
 - `/`
   - `isoquant/`
-    - `*.gene_counts.tsv`
-    - `*.transcript_counts.tsv`
+    - `*.gene_counts.tsv`: The feature-barcode matrix from gene quantification.
+    - `*.transcript_counts.tsv`: The feature-barcode matrix from transcript quantification.
@@ -166,11 +166,11 @@ Barcode correction is a custom script that uses the whitelist generated by BLAZE
 - `/`
   - `qc/`
     - `gene/`
-      - `*.csv`
-      - `*.png`
+      - `*.csv`: A file containing statistics about the cell-read distribution for genes.
+      - `*.png`: A series of QC images for assessing the quality of the gene quantification.
     - `transcript/`
-      - `*.csv`
-      - `*.png`
+      - `*.csv`: A file containing statistics about the cell-read distribution for transcripts.
+      - `*.png`: A series of QC images for assessing the quality of the transcript quantification.
 ![MultiQC - seurat](images/samtools_idxstats.png)
@@ -214,38 +214,11 @@ Barcode correction is a custom script that uses the whitelist generated by BLAZE
 - `batch_qcs/`
   - `nanocomp/`
-    - `fastq/`
-      - `NanoComp_*.log`
-      - `NanoComp_lengths_violin.html`
-      - `NanoComp_log_length_violin.html`
-      - `NanoComp_N50.html`
-      - `NanoComp_number_of_reads.html`
-      - `NanoComp_OverlayHistogram.html`
-      - `NanoComp_OverlayHistogram_Normalized.html`
-      - `NanoComp_OverlayLogHistogram.html`
-      - `NanoComp_OverlayLogHistogram_Normalized.html`
-      - `NanoComp_quals_violin.html`
-      - `NanoComp-report.html`
-      - `NanoComp_total_throughput.html`
-      - `NanoStats.txt`
-    - `bam/`
-      - `NanoComp_20240212_1942.log`
-      - `NanoComp_lengths_violin.html`
-      - `NanoComp_log_length_violin.html`
-      - `NanoComp_N50.html`
-      - `NanoComp_number_of_reads.html`
-      - `NanoComp_OverlayHistogram.html`
-      - `NanoComp_OverlayHistogram_Identity.html`
-      - `NanoComp_OverlayHistogram_Normalized.html`
-      - `NanoComp_OverlayHistogram_PhredScore.html`
-      - `NanoComp_OverlayLogHistogram.html`
-      - `NanoComp_OverlayLogHistogram_Normalized.html`
-      - `NanoComp_percentIdentity_violin.html`
-      - `NanoComp_quals_violin.html`
-      - `NanoComp-report.html`
-      - `NanoComp_total_throughput.html`
-      - `NanoStats.txt`
-
+    - `fastq/` and `bam/`
+      - `NanoComp_*.log`: The log file detailing the NanoComp run.
+      - `NanoComp-report.html`: A browser-viewable report that contains all the figures in a single location.
+      - `*.html`: NanoComp also outputs each figure in the report as an individual file that can be inspected separately.
+      - `NanoStats.txt`: This file contains quality control statistics about the dataset.
@@ -261,43 +234,12 @@ Barcode correction is a custom script that uses the whitelist generated by BLAZE
 - `/`
   - `qc/`
     - `nanoplot/`
-      - `pre_trim/`
-        - `LengthvsQualityScatterPlot_dot.html`
-        - `LengthvsQualityScatterPlot_kde.html`
-        - `NanoPlot_20240212_1033.log`
-        - `NanoPlot-report.html`
-        - `NanoStats_post_filtering.txt`
-        - `NanoStats.txt`
-        - `Non_weightedHistogramReadlength.html`
-        - `Non_weightedLogTransformed_HistogramReadlength.html`
-        - `WeightedHistogramReadlength.html`
-        - `WeightedLogTransformed_HistogramReadlength.html`
-        - `Yield_By_Length.html`
-      - `post_trim/`
-        - `LengthvsQualityScatterPlot_dot.html`
-        - `LengthvsQualityScatterPlot_kde.html`
-        - `NanoPlot_20240212_1033.log`
-        - `NanoPlot-report.html`
-        - `NanoStats_post_filtering.txt`
-        - `NanoStats.txt`
-        - `Non_weightedHistogramReadlength.html`
-        - `Non_weightedLogTransformed_HistogramReadlength.html`
-        - `WeightedHistogramReadlength.html`
-        - `WeightedLogTransformed_HistogramReadlength.html`
-        - `Yield_By_Length.html`
-      - `post_extract/`
-        - `LengthvsQualityScatterPlot_dot.html`
-        - `LengthvsQualityScatterPlot_kde.html`
-        - `NanoPlot_20240212_1033.log`
-        - `NanoPlot-report.html`
-        - `NanoStats_post_filtering.txt`
-        - `NanoStats.txt`
-        - `Non_weightedHistogramReadlength.html`
-        - `Non_weightedLogTransformed_HistogramReadlength.html`
-        - `WeightedHistogramReadlength.html`
-        - `WeightedLogTransformed_HistogramReadlength.html`
-        - `Yield_By_Length.html`
-
+      - `pre_trim/`, `post_trim/`, and `post_extract/`
+        - `NanoPlot_*.log`: The log file detailing the NanoPlot run.
+        - `NanoPlot-report.html`: A browser-viewable report that contains all the figures in a single location.
+        - `*.html`: NanoPlot also outputs each figure in the report as an individual file that can be inspected separately.
+        - `NanoStats.txt`: This file contains quality control statistics about the dataset.
+        - `NanoStats_post_filtering.txt`: If any filtering options are used for NanoPlot, this file will contain the differences.
This is produced by default and should contain no differences from `NanoStats.txt` if the process was unmodified.
@@ -314,7 +256,7 @@ Barcode correction is a custom script that uses the whitelist generated by BLAZE
 - `/`
   - `qc/`
     - `rseqc/`
-      - `*.read_distribution.txt`
+      - `*.read_distribution.txt`: This file contains statistics on the types of reads found in the dataset.