Merge pull request #38 from clinical-genomics-uppsala/develop
refactor: remove SampleSheet and misc for "new" stackstorm
elleira authored Apr 24, 2024
2 parents e4ca12c + ce19c63 commit d45b5f1
Showing 17 changed files with 390 additions and 403 deletions.
Empty file removed .tests/integration/SampleSheet.csv
Empty file.
9 changes: 7 additions & 2 deletions README.md
@@ -32,17 +32,22 @@ The workflow repository contains a small test dataset (:exclamation: Todo: as of

```bash
$ cd .tests/integration
$ snakemake -n -s ../../workflow/Snakefile --configfiles ../../config/config.yaml config.yaml --config sequenceid="990909_test"
$ snakemake -n -s ../../workflow/Snakefile --configfiles ../../config/config.yaml config.yaml --config sequenceid="990909_test" PATH_TO_REPO=/folder/containing/marple_rd_tc/
```
> **_NOTE:_** If using the variable `PATH_TO_REPO` in the config-file this need to be defined in the commandline

## :rocket: [Usage](https://marple-rd-tc.readthedocs.io/en/latest/running/)

To use this run this pipeline `sample.tsv`, `units.tsv`, `resources.yaml`, and `config.yaml` files need to be available in the current directory (or otherwise specified in `config.yaml`). You always need to specify the `config`-file and `sequenceid` variable in the command. To run the pipeline:

```bash
$ snakemake --profile snakemakeprofile --configfile config.yaml --config sequenceid="990909_test" -s /path/to/marple_rd_tc/workflow/Snakefile
$ snakemake --profile snakemakeprofile --configfile config.yaml --config sequenceid="990909_test" -s /path/to/marple_rd_tc/workflow/Snakefile --config PATH_TO_REPO=/folder/containing/marple_rd_tc/
```

> **_NOTE:_** If using the variable `PATH_TO_REPO` in the config this need to be defined in the commandline

## :books: [Output files](https://marple-rd-tc.readthedocs.io/en/latest/result_files/)

The following output files are located in `Results/`-folder:
7 changes: 2 additions & 5 deletions config/config.yaml
@@ -2,7 +2,7 @@
resources: "resources.yaml"
samples: "samples.tsv"
units: "units.tsv"
output: "/projects/wp3/nobackup/TwistCancer/Bin/marple_rd_tc/config/output_files.yaml"
output: "{{PATH_TO_REPO}}/marple_rd_tc/config/output_files.yaml"

default_container: "docker://hydragenetics/common:1.8.1"

@@ -67,7 +67,7 @@ multiqc:
reports:
DNA:
included_unit_types: ["T", "N"]
config: "/projects/wp3/nobackup/TwistCancer/Bin/marple_rd_tc/config/multiqc_config.yaml"
config: "{{PATH_TO_REPO}}/marple_rd_tc/config/multiqc_config.yaml"
qc_files:
- "prealignment/fastp_pe/{sample}_{type}_{flowcell}_{lane}_{barcode}_fastp.json"
- "qc/fastqc/{sample}_{type}_{flowcell}_{lane}_{barcode}_{read}_fastqc.zip"
@@ -119,9 +119,6 @@ picard_collect_multiple_metrics:
picard_mark_duplicates:
container: "docker://hydragenetics/picard:2.25.4"

sample_order_multiqc:
sample_sheet: "SampleSheet.csv"

vep:
container: "docker://ensemblorg/ensembl-vep:release_109.3" # "docker://hydragenetics/vep:109"
vep_cache: "/data/ref_genomes/VEP"
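
The `{{PATH_TO_REPO}}` placeholders above only work if something substitutes them once the value is supplied on the command line. This commit does not show how the pipeline does that, so the following is a minimal sketch of one way such a substitution could work, not the pipeline's actual implementation. The function name `resolve_placeholders` and the folder `/opt/pipelines` are hypothetical.

```python
import re

# Matches {{NAME}} but not Snakemake-style single-brace wildcards such as {sample}.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def resolve_placeholders(node, values):
    """Recursively replace {{NAME}} in every string of a nested config (hypothetical helper)."""
    if isinstance(node, dict):
        return {key: resolve_placeholders(val, values) for key, val in node.items()}
    if isinstance(node, list):
        return [resolve_placeholders(item, values) for item in node]
    if isinstance(node, str):
        # Unknown names are left untouched so a missing value stays visible.
        return PLACEHOLDER.sub(lambda m: str(values.get(m.group(1), m.group(0))), node)
    return node


config = {
    "output": "{{PATH_TO_REPO}}/marple_rd_tc/config/output_files.yaml",
    "multiqc": {
        "reports": {
            "DNA": {
                "config": "{{PATH_TO_REPO}}/marple_rd_tc/config/multiqc_config.yaml",
                "qc_files": [
                    "prealignment/fastp_pe/{sample}_{type}_{flowcell}_{lane}_{barcode}_fastp.json"
                ],
            }
        }
    },
}

# Hypothetical parent folder of the marple_rd_tc checkout.
resolved = resolve_placeholders(config, {"PATH_TO_REPO": "/opt/pipelines"})
print(resolved["output"])
```

Because the pattern requires double braces, the single-brace wildcards in `qc_files` pass through unchanged, and a placeholder without a supplied value is kept verbatim rather than silently replaced by an empty string.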
2 changes: 0 additions & 2 deletions docs/includes/images/qc.dot
@@ -18,11 +18,9 @@ digraph snakemake_dag {
p_align[label = "qc_picard_collect_alignment_summary_metrics", color = "0.29 0.6 0.85", style="rounded"];
p_dup[label = "qc_picard_collect_duplication_metrics", color = "0.34 0.6 0.85", style="rounded"];
sampleorder[label = "sample_order_multiqc", color = "0.00 0.6 0.85", style="rounded"];
samplesheet[label = "SampleSheet.csv", color = "0.0 0.0 0.0", style="dotted"];

multiqc -> multiqc_html
sampleorder -> multiqc
samplesheet -> sampleorder
fastp -> multiqc
fastp -> bam [style="dotted", label = "alignment", fontcolor = "grey50", fontsize=9, fontname=sans ]
p_gc -> multiqc
Binary file modified docs/includes/images/qc.png
2 changes: 1 addition & 1 deletion docs/result_files.md
@@ -33,7 +33,7 @@ The report is configured based on a MultiQC config file.
///

### General Statistics
The general statistics table are ordered based on the sample order in `SampleSheet.csv`, this is done by renaming the samples in two steps using the script `sample_order_multiqc.py`. To toggle between "Sample Order" and "Sample Name" use the buttons just above General Stats header.
The general statistics table are ordered based on the fastq-file "S"-index, e.g. `sampleT_S1_R1_001.fastq.gz` will be before `sampleA_S2_R1_001.fastq.gz`. This is done by renaming the samples in two steps using the script `sample_order_multiqc.py`. To toggle between "Sample Order" and "Sample Name" use the buttons just above General Stats header.
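
The ordering described above can be illustrated with a short sketch. This is not the code in `sample_order_multiqc.py`, which this commit does not show; it only assumes that fastq names end in the `_S<number>_R<read>_001.fastq.gz` pattern used in the example, with an optional lane field. The names `S_INDEX` and `s_index` are hypothetical.

```python
import re

# Assumed file-name ending: _S<index>_[L<lane>_]R<read>_001.fastq.gz
S_INDEX = re.compile(r"_S(\d+)_(?:L\d+_)?R[12]_001\.fastq\.gz$")


def s_index(fastq_name: str) -> int:
    """Return the numeric "S"-index from a fastq file name (hypothetical helper)."""
    match = S_INDEX.search(fastq_name)
    if match is None:
        raise ValueError(f"No S-index found in {fastq_name!r}")
    return int(match.group(1))


fastqs = [
    "sampleA_S2_R1_001.fastq.gz",
    "sampleB_S10_R1_001.fastq.gz",
    "sampleT_S1_R1_001.fastq.gz",
]

# Sort numerically so that S10 comes after S2, not before it.
ordered = sorted(fastqs, key=s_index)
print(ordered)
```

Converting the index to an integer matters: a plain string sort would place `S10` before `S2`.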

<br />

5 changes: 2 additions & 3 deletions docs/running.md
@@ -37,7 +37,7 @@ git clone --branch ${VERSION} https://github.com/clinical-genomics-uppsala/marpl
To run the Marple pipeline a python virtual environment is needed. Create a virtual environment and then install pipeline requirements specified in `requirements.txt`.
```bash
# Create a new virtual environment
python3 -m venv ${WORKING_DIRECTORY}/virtual/environment
python3.9 -m venv ${WORKING_DIRECTORY}/virtual/environment

# Enter working directory
cd ${WORKING_DIRECTORY}
@@ -88,7 +88,6 @@ An `resources.yaml` file can also be found in the `config/`-folder. This is adap
source virtual/environment/bin/activate

# Run snakemake command with the extra config parameter called sequenceid
snakemake --profile snakemakeprofile --configfile config.yaml --config sequenceid="230202-test" -s /path/to/marple/workflow/Snakefile

snakemake --profile snakemakeprofile --configfile config.yaml --config sequenceid="230202-test" -s /path/to/marple/workflow/Snakefile --config PATH_TO_REPO=/path/to/repo/
```

5 changes: 4 additions & 1 deletion docs/running_ref.md
@@ -6,9 +6,12 @@ To generate `.bam` **and** `.bai`-files for all samples you need to run Marple u

```bash
# Run snakemake command with the extra config parameter called sequenceid
snakemake --profile snakemakeprofile --configfile config.yaml --config sequenceid="normal_samples" -s /path/to/marple/workflow/Snakefile --no-temp --until qc_mosdepth_bed
snakemake --profile snakemakeprofile --configfile config.yaml --config sequenceid="normal_samples" -s /path/to/marple/workflow/Snakefile --no-temp --until qc_mosdepth_bed --config PATH_TO_REPO=/folder/containing/marple_rd_tc/

```

> **_NOTE:_** If using the variable `PATH_TO_REPO` (folder containing `marple_rd_tc`) in the config-file this need to be defined in the commandline
### :books: Input files
Four different files need to be available in your runfolder and to be adapted to your compute-environment and sequence run; `samples.tsv`, `units_references.tsv`, `config_references.yaml` and `resources.yaml`.
#### Samples and Units
2 changes: 1 addition & 1 deletion docs/softwares.md
@@ -107,7 +107,7 @@ Rules that creates a `.xlsx` file per sample with aggregated coverage informatio
---

## sample_order_multiqc.smk
A python script to create sample_replacement and sample_order files to be used in MultiQC to order samples based on order in SampleSheet.csv
A python script to create sample_replacement and sample_order files to be used in MultiQC to order samples based on order of the "S"-index in the samplenames.

### :snake: Rule
