Merge branch 'ar/prep-026-release' into 'master'
[release] Update docs and changelog for 0.2.6 release

See merge request machine-learning/modkit!158
ArtRand committed Mar 15, 2024
2 parents d31a61a + 7a229d8 commit 7f6dd3a
Showing 7 changed files with 52 additions and 6 deletions.
8 changes: 8 additions & 0 deletions CHANGELOG.md
@@ -4,6 +4,14 @@ All notable changes to this project will be documented in this file.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [v0.2.6]
### Fixes
- [dmr, single-site] Don't require that there are equal numbers of samples for single site DMR with multiple samples. Fixes #140.
- [dmr, pairwise, region] Protect when zero bedmethyl records are found for a region, fixes #146.
### Adds
- [validate] Adds on-the-fly filtering of reads by alignment identity and/or alignment length.
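
The help text for `--min-identity` says the threshold is given in Q-space (phred score). As a rough sketch, assuming the standard phred transform Q = -10 * log10(1 - identity) (this commit does not show modkit's exact calculation), an identity fraction can be converted like this. The helper name `identity_to_q` is hypothetical and not part of modkit.

```shell
# Hypothetical helper, not part of modkit: convert an alignment identity
# fraction (0 <= identity < 1) to a phred-scaled Q value.
# Assumes the standard phred transform: Q = -10 * log10(1 - identity).
identity_to_q() {
    # awk's log() is the natural log, so divide by log(10) to get log10.
    awk -v id="$1" 'BEGIN { printf "%.1f\n", -10 * log(1 - id) / log(10) }'
}

identity_to_q 0.99    # prints 20.0
identity_to_q 0.999   # prints 30.0
```

Under that assumption, `--min-identity 20` would keep reads with an alignment identity of roughly 99% or better.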


## [v0.2.5]
### Fixes
- [extract] Only emit mapped reads when `--region` is provided, but still emit unmapped bases in those reads unless `--mapped-only` is passed.
6 changes: 6 additions & 0 deletions book/src/advanced_usage.md
@@ -956,6 +956,12 @@ Options:
[possible values: A, C, G, T]
--min-identity <MIN_ALIGNMENT_IDENTITY>
Only use reads with alignment identity >= this number, in Q-space (phred score).
--min-length <MIN_ALIGNMENT_LENGTH>
Remove reads with fewer aligned reference bases than this threshold.
-q, --filter-quantile <FILTER_QUANTILE>
Filter out modified base calls where the probability of the predicted variant is below
this confidence percentile. For example, 0.1 will filter out the 10% lowest confidence
6 changes: 6 additions & 0 deletions docs/advanced_usage.html
@@ -1111,6 +1111,12 @@ <h2 id="validate"><a class="header" href="#validate">validate</a></h2>

[possible values: A, C, G, T]

--min-identity &lt;MIN_ALIGNMENT_IDENTITY&gt;
Only use reads with alignment identity &gt;= this number, in Q-space (phred score).

--min-length &lt;MIN_ALIGNMENT_LENGTH&gt;
Remove reads with fewer aligned reference bases than this threshold.

-q, --filter-quantile &lt;FILTER_QUANTILE&gt;
Filter out modified base calls where the probability of the predicted variant is below
this confidence percentile. For example, 0.1 will filter out the 10% lowest confidence
14 changes: 12 additions & 2 deletions docs/intro_dmr.html
@@ -346,7 +346,8 @@ <h2 id="differential-methylation-output-format"><a class="header" href="#differe
<tr><td>19</td><td>per-replicate effect sizes</td><td>effect sizes matched replicate pairs</td><td>float</td></tr>
</tbody></table>
</div>
-<p>Columns 16-19 are only produced when multiple replicates are provided. Columns 18 and 19 have the replicate pairwise MAP-based p-values and effect sizes which are calculated based on their order provided on the command line.
+<p>Columns 16-19 are only produced when an equal number of replicates are provided.
+Columns 18 and 19 have the replicate pairwise MAP-based p-values and effect sizes which are calculated based on their order provided on the command line.
For example in the abbreviated command below:</p>
<pre><code class="language-bash">modkit dmr pair \
-a ${norm_pileup_1}.gz \
@@ -356,7 +357,16 @@ <h2 id="differential-methylation-output-format"><a class="header" href="#differe
...
</code></pre>
<p>Column 18 will contain the MAP-based p-values comparing <code>norm_pileup_1</code> versus <code>tumor_pileup_1</code> and <code>norm_pileup_2</code> versus <code>tumor_pileup_2</code>.
-Column 19 will contain the effect sizes, values are comma-separated.</p>
+Column 19 will contain the effect sizes, values are comma-separated.
+If you have a different number of samples for each condition, such as:</p>
+<pre><code class="language-bash">modkit dmr pair \
+-a ${norm_pileup_1}.gz \
+-a ${norm_pileup_2}.gz \
+-a ${norm_pileup_3}.gz \
+-b ${tumor_pileup_1}.gz \
+-b ${tumor_pileup_2}.gz \
+</code></pre>
+<p>these columns will not be present.</p>
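
The pairing rule described in this hunk can be illustrated with a small shell sketch. It only models what the documentation states: replicates are matched by their order on the command line, and no per-replicate values are produced when the two conditions have different numbers of samples. The function `pair_replicates` is a hypothetical illustration, not modkit code, and it does not compute any p-values or effect sizes.

```shell
# Hypothetical helper (not part of modkit): list the replicate pairs,
# matched by command-line order, that columns 18 and 19 would describe.
# $1 = space-separated -a samples, $2 = space-separated -b samples.
pair_replicates() {
    a_list=$1
    b_list=$2
    # Count the samples in each condition via word splitting.
    set -- $a_list; n_a=$#
    set -- $b_list; n_b=$#
    # Assumption taken from the docs: unequal replicate counts mean the
    # per-replicate columns are absent, so print nothing.
    [ "$n_a" -eq "$n_b" ] || return 0
    i=1
    for a in $a_list; do
        # Pick the i-th sample of the second condition.
        b=$(printf '%s\n' $b_list | sed -n "${i}p")
        printf '%s vs %s\n' "$a" "$b"
        i=$((i + 1))
    done
}

# Two replicates per condition: prints one line per matched pair.
pair_replicates "norm_pileup_1 norm_pileup_2" "tumor_pileup_1 tumor_pileup_2"
# Three versus two replicates: prints nothing.
pair_replicates "norm_pileup_1 norm_pileup_2 norm_pileup_3" "tumor_pileup_1 tumor_pileup_2"
```

Swapping the order of the `-a` or `-b` arguments changes which samples are paired, so the order on the command line matters when interpreting the comma-separated values.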

</main>

20 changes: 18 additions & 2 deletions docs/print.html
@@ -938,7 +938,8 @@ <h2 id="differential-methylation-output-format"><a class="header" href="#differe
<tr><td>19</td><td>per-replicate effect sizes</td><td>effect sizes matched replicate pairs</td><td>float</td></tr>
</tbody></table>
</div>
-<p>Columns 16-19 are only produced when multiple replicates are provided. Columns 18 and 19 have the replicate pairwise MAP-based p-values and effect sizes which are calculated based on their order provided on the command line.
+<p>Columns 16-19 are only produced when an equal number of replicates are provided.
+Columns 18 and 19 have the replicate pairwise MAP-based p-values and effect sizes which are calculated based on their order provided on the command line.
For example in the abbreviated command below:</p>
<pre><code class="language-bash">modkit dmr pair \
-a ${norm_pileup_1}.gz \
@@ -948,7 +949,16 @@ <h2 id="differential-methylation-output-format"><a class="header" href="#differe
...
</code></pre>
<p>Column 18 will contain the MAP-based p-values comparing <code>norm_pileup_1</code> versus <code>tumor_pileup_1</code> and <code>norm_pileup_2</code> versus <code>tumor_pileup_2</code>.
-Column 19 will contain the effect sizes, values are comma-separated.</p>
+Column 19 will contain the effect sizes, values are comma-separated.
+If you have a different number of samples for each condition, such as:</p>
+<pre><code class="language-bash">modkit dmr pair \
+-a ${norm_pileup_1}.gz \
+-a ${norm_pileup_2}.gz \
+-a ${norm_pileup_3}.gz \
+-b ${tumor_pileup_1}.gz \
+-b ${tumor_pileup_2}.gz \
+</code></pre>
+<p>these columns will not be present.</p>
<div style="break-before: page; page-break-before: always;"></div><h1 id="validating-ground-truth-results"><a class="header" href="#validating-ground-truth-results">Validating ground truth results.</a></h1>
<p>The <code>modkit validate</code> sub-command is intended for validating results in a uniform manner from samples with known modified base content. Specifically the modified base status at any annotated reference location should be known.</p>
<h2 id="validating-from-modbam-reads-and-bed-reference-annotation"><a class="header" href="#validating-from-modbam-reads-and-bed-reference-annotation">Validating from modBAM reads and BED reference annotation.</a></h2>
@@ -1959,6 +1969,12 @@ <h2 id="validate"><a class="header" href="#validate">validate</a></h2>

[possible values: A, C, G, T]

--min-identity &lt;MIN_ALIGNMENT_IDENTITY&gt;
Only use reads with alignment identity &gt;= this number, in Q-space (phred score).

--min-length &lt;MIN_ALIGNMENT_LENGTH&gt;
Remove reads with fewer aligned reference bases than this threshold.

-q, --filter-quantile &lt;FILTER_QUANTILE&gt;
Filter out modified base calls where the probability of the predicted variant is below
this confidence percentile. For example, 0.1 will filter out the 10% lowest confidence
2 changes: 1 addition & 1 deletion docs/searchindex.js

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion docs/searchindex.json

Large diffs are not rendered by default.
