diff --git a/book/src/SUMMARY.md b/book/src/SUMMARY.md
index 7ffc48e..40f478b 100644
--- a/book/src/SUMMARY.md
+++ b/book/src/SUMMARY.md
@@ -8,13 +8,13 @@
  - [Extracting read information to a table](./intro_extract.md)
  - [Calling mods in a modBAM](./intro_call_mods.md)
  - [Removing modification calls at the ends of reads](./intro_edge_filter.md)
- - [Narrow output to specific positions](./intro_include_bed.md)
  - [Repair MM/ML tags on trimmed reads](./intro_repair.md)
  - [Make hemi-methylation bedMethyl tables](./intro_pileup_hemi.md)
  - [Perform differential methylation scoring](./intro_dmr.md)
  - [Validate ground truth results](./intro_validate.md)
  - [Find highly modified motif sequences](./intro_find_motifs.md)
  - [Calculating methylation entropy](./intro_entropy.md)
+ - [Narrow output to specific positions](./intro_include_bed.md)
  - [Extended subcommand help](./advanced_usage.md)
  - [Troubleshooting](./troubleshooting.md)
  - [Current limitations](./limitations.md)
diff --git a/book/src/intro_bedmethyl.md b/book/src/intro_bedmethyl.md
index 6662c47..39f2eee 100644
--- a/book/src/intro_bedmethyl.md
+++ b/book/src/intro_bedmethyl.md
@@ -163,20 +163,20 @@ CG->CH substitution such that no modification call was produced by the basecalle
 | 2 | start position | 0-based start position | int |
 | 3 | end position | 0-based exclusive end position | int |
 | 4 | modified base code and motif | single letter code for modified base and motif when more than one motif is used | str |
-| 5 | score | Equal to Nvalid_cov. | int |
+| 5 | score | equal to Nvalid_cov | int |
 | 6 | strand | '+' for positive strand '-' for negative strand, '.' when strands are combined | str |
 | 7 | start position | included for compatibility | int |
 | 8 | end position | included for compatibility | int |
 | 9 | color | included for compatibility, always 255,0,0 | str |
-| 10 | Nvalid_cov | See definitions above. | int |
+| 10 | Nvalid_cov | see definitions above. | int |
 | 11 | percent modified | (Nmod / Nvalid_cov) * 100 | float |
-| 12 | Nmod | See definitions above. | int |
-| 13 | Ncanonical | See definitions above. | int |
-| 14 | Nother_mod | See definitions above. | int |
-| 15 | Ndelete | See definitions above. | int |
-| 16 | Nfail | See definitions above. | int |
-| 17 | Ndiff | See definitions above. | int |
-| 18 | Nnocall | See definitions above. | int |
+| 12 | Nmod | see definitions above | int |
+| 13 | Ncanonical | see definitions above | int |
+| 14 | Nother_mod | see definitions above | int |
+| 15 | Ndelete | see definitions above | int |
+| 16 | Nfail | see definitions above | int |
+| 17 | Ndiff | see definitions above | int |
+| 18 | Nnocall | see definitions above | int |
 
 ## Performance considerations
 
diff --git a/book/src/intro_dmr.md b/book/src/intro_dmr.md
index c9cf14f..f11e3ef 100644
--- a/book/src/intro_dmr.md
+++ b/book/src/intro_dmr.md
@@ -195,11 +195,11 @@ When performing single-site analysis, the following additional columns are added
 |--------|----------------------------|---------------------------------------------------------------------------------------|-------|
 | 14 | MAP-based p-value | ratio of the posterior probability of observing the effect size over zero effect size | float |
 | 15 | effect size | percent modified in sample A (col 12) minus percent modified in sample B (col 13) | float |
-| 16 | balanced MAP-based p-value | mAP-based p-value when all replicates are balanced | float |
+| 16 | balanced MAP-based p-value | MAP-based p-value when all replicates are balanced | float |
 | 17 | balanced effect size | effect size when all replicates are balanced | float |
 | 18 | pct_a_samples | percent of 'a' samples used in statistical test | float |
 | 19 | pct_b_samples | percent of 'b' samples used in statistical test | float |
-| 20 | per-replicate p-values | mAP-based p-values for matched replicate pairs | float |
+| 20 | per-replicate p-values | MAP-based p-values for matched replicate pairs | float |
 | 21 | per-replicate effect sizes | effect sizes matched replicate pairs | float |
 
 
@@ -257,6 +257,7 @@ modkit dmr pair \
 
 The default settings for the HMM are to run in "coarse-grained" mode which will more eagerly join neighboring sites, potentially at the cost of including sites that are not differentially modified within "Different" blocks.
 To activate "fine-grained" mode, pass the `--fine-grained` flag.
+
 The output schema for the segments is:
 
 | column | name | description | type |
@@ -266,7 +267,7 @@ The output schema for the segments is:
 | 3 | end position | 0-based exclusive end position, from `--regions` argument | int |
 | 4 | state-name | "different" when sites are differentially modified, "same" otherwise | str |
 | 5 | score | difference score, more positive values have increased difference | float |
-| 6 | N_sites<\sub> | number of sites (bedmethyl records) in the segment | float |
+| 6 | N-sites | number of sites (bedmethyl records) in the segment | float |
 | 7 | samplea counts | counts of each base modification in the region, comma-separated, for sample A | str |
 | 8 | samplea total | total number of base modification calls in the region, including unmodified, for sample A | str |
 | 9 | sampleb counts | counts of each base modification in the region, comma-separated, for sample B | str |
diff --git a/book/src/intro_edge_filter.md b/book/src/intro_edge_filter.md
index a050d72..98fa149 100644
--- a/book/src/intro_edge_filter.md
+++ b/book/src/intro_edge_filter.md
@@ -23,7 +23,7 @@ All commands have the flag `--invert-edge-filter` that will _keep_ only base mod
 
 ## Example usages
 
-### call mods with the estimated threshold and ignore modification calls within 100 base pairs of the ends of the reads
+### Call mods with the estimated threshold and ignore modification calls within 100 base pairs of the ends of the reads
 ```
 modkit call-mods --edge-filter 100
 ```
diff --git a/book/src/intro_find_motifs.md b/book/src/intro_find_motifs.md
index 9b2c126..fb91bb7 100644
--- a/book/src/intro_find_motifs.md
+++ b/book/src/intro_find_motifs.md
@@ -28,7 +28,7 @@ The human-readable tables are always output to the log and terminal, the machine
 | column | name | description | type |
 |--------|------------|--------------------------------------------------------------------------------------------------------------------------|-------|
 | 1 | mod_code | code specifying the modification found in the motif | str |
-0 2 | motif | sequence of identified motif using [IUPAC](https://www.bioinformatics.org/sms/iupac.html) codes | str |
+| 2 | motif | sequence of identified motif using [IUPAC](https://www.bioinformatics.org/sms/iupac.html) codes | str |
 | 3 | offset | 0-based offset into the motif sequence of the modified base | int |
 | 4 | frac_mod | fraction of time this sequence is found in the _high modified_ set col-5 / (col-5 + col-6) | float |
 | 5 | high_count | number of occurances of this sequence in the _high-modified_ set | int |
diff --git a/book/src/intro_pileup_hemi.md b/book/src/intro_pileup_hemi.md
index 196ebcb..9c2471e 100644
--- a/book/src/intro_pileup_hemi.md
+++ b/book/src/intro_pileup_hemi.md
@@ -84,20 +84,20 @@ patterns (`-,-`). All patterns recognized at a location will be reported in the
 | 2 | start position | 0-based start position | int |
 | 3 | end position | 0-based exclusive end position | int |
 | 4 | methylation pattern | comma-separated pair of modification codes `-` means canonical, followed by the primary read base | str |
-| 5 | score | Equal to Nvalid_cov. | int |
+| 5 | score | equal to Nvalid_cov | int |
 | 6 | strand | always '.' because strand information is combined | str |
 | 7 | start position | included for compatibility | int |
 | 8 | end position | included for compatibility | int |
 | 9 | color | included for compatibility, always 255,0,0 | str |
-| 10 | Nvalid_cov | See definitions above. | int |
+| 10 | Nvalid_cov | see definitions above | int |
 | 11 | fraction modified | Npattern / Nvalid_cov | float |
-| 12 | Npattern | See definitions above. | int |
-| 13 | Ncanonical | See definitions above. | int |
-| 14 | Nother_pattern | See definitions above. | int |
-| 15 | Ndelete | See definitions above. | int |
-| 16 | Nfail | See definitions above. | int |
-| 17 | Ndiff | See definitions above. | int |
-| 18 | Nnocall | See definitions above. | int |
+| 12 | Npattern | see definitions above | int |
+| 13 | Ncanonical | see definitions above | int |
+| 14 | Nother_pattern | see definitions above | int |
+| 15 | Ndelete | see definitions above | int |
+| 16 | Nfail | see definitions above | int |
+| 17 | Ndiff | see definitions above | int |
+| 18 | Nnocall | see definitions above | int |
 
 ## Limitations
 
diff --git a/book/src/intro_summary.md b/book/src/intro_summary.md
index 9e2d9ff..d35ac5a 100644
--- a/book/src/intro_summary.md
+++ b/book/src/intro_summary.md
@@ -1,7 +1,7 @@
 # Summarizing a modBAM.
 
-The `modkit summary` sub-command is intended for collecting read-level statistics on
-either a sample of reads, a region, or an entire modBam.
+The `modkit summary` sub-command is intended for collecting read-level statistics on either a sample of reads, a region, or an entire modBam.
+It is important to note that the default behavior of `modkit summary` is to take a sample of the reads to get a quick estimate.
 
 ## Summarize the base modification calls in a modBAM.
 
diff --git a/docs/404.html b/docs/404.html
index 14a1cc0..1640711 100644
--- a/docs/404.html
+++ b/docs/404.html
@@ -92,7 +92,7 @@