nf-core · christopher-hakkaart · Aug 5, 2024
diff --git a/...docs/usage/Getting_started/terminology.md → ...rc/content/docs/guidelines/terminology.md b/...docs/usage/Getting_started/terminology.md → ...rc/content/docs/guidelines/terminology.md
@@ -1,12 +1,11 @@
 ---
-title: nf-core Terminology
-subtitle: Specification of the terms used in the nf-core community
-shortTitle: nf-core terminology
+title: Terminology
+subtitle: nf-core terminology
 ---
 
-The features offered by Nextflow DSL2 can be used in various ways depending on the granularity with which you would like to write pipelines. Please see the listing below for the hierarchy and associated terminology we have decided to use when referring to DSL2 components.
+## Introduction
 
-## Terminology
+The features offered by Nextflow [DSL2](#domain-specific-language-dsl) can be used in various ways depending on the granularity with which you would like to write pipelines. Please see the listing below for the hierarchy and associated terminology nf-core uses when referring to DSL2 components.
 
 ### Domain-Specific Language (DSL)
 

diff --git a/sites/docs/src/content/docs/guides/configuration/introduction.md b/sites/docs/src/content/docs/guides/configuration/introduction.md
@@ -0,0 +1,174 @@
+---
+title: Configuration
+subtitle: How to configure nf-core pipelines
+shortTitle: Configuration options
+weight: 1
+parentWeight: 20
+---
+
+## Configure nf-core pipelines
+
+Each nf-core pipeline comes with a set of "sensible defaults" for a "typical" analysis of a full-size dataset.
+While the defaults are a great place to start, you will almost certainly want to modify them to fit your own data and system requirements, for example, by changing a tool flag or the compute resources allocated to a process.
+
+When a pipeline is launched, Nextflow will look for config files in several locations.
+As each source can contain conflicting settings, the sources are ranked to decide which settings to apply.
+
+nf-core pipelines may utilize any of these configuration files.
+
+Configuration sources are listed below in order of priority, from highest to lowest:
+
+1. Parameters specified on the command line (`--parameter`)
+2. Parameters that are provided using the `-params-file` option
+3. Config files that are provided using the `-c` option
+4. The config file named `nextflow.config` in the current directory
+5. The config file named `nextflow.config` in the pipeline project directory
+6. The config file `$HOME/.nextflow/config`
+7. Values defined within the pipeline script itself (e.g., `main.nf`)
+
+While some of these files are already included in the nf-core pipeline repository (e.g., the `nextflow.config` file in the nf-core pipeline repository), some are automatically identified on your local system (e.g., the `nextflow.config` in the launch directory), and others are only included if they are specified using run options (e.g., `-params-file`, and `-c`).
+
+:::warning
+You should not clone and manually edit an nf-core pipeline. Manually edited nf-core pipelines cannot be updated to more recent versions of the pipeline without overwriting your changes. You also risk moving away from the canonical pipeline and losing reproducibility.
+:::
+
+### Parameters
+
+Parameters are pipeline-specific settings that can be used to customize the execution of a pipeline.
+
+At the highest level, parameters can be customized using the command line. Any parameter can be configured on the command line by prefixing the parameter name with a double dash (`--`):
+
+```bash
+--<parameter>
+```
+
+Depending on the parameter type, you may be required to add additional information after your parameter flag.
+For example, you would add a string value after the parameter flag for the `nf-core/rnaseq` `--input` and `--outdir` parameters:
+
+```bash
+nextflow run nf-core/rnaseq --input <path/to/input> --outdir <path/to/results>
+```
+
+Every nf-core pipeline has a full list of parameters on the nf-core website. You will be shown a description and the type of the parameter when viewing these parameters. Some parameters will also have additional text to help you understand how a parameter should be used. See the [parameters page of the nf-core rnaseq pipeline](https://nf-co.re/rnaseq/3.14.0/parameters/).
+
+### Default configuration files
+
+All parameters have a default configuration that is defined using the `nextflow.config` file in the root of the pipeline directory. Many parameters are set to `null` or `false` by default and are only activated by a profile or config file.
+
+nf-core pipelines also include additional config files from the `conf/` folder of a pipeline repository. Each additional `.config` file contains categorized configuration information for your pipeline execution, some of which can be optionally included as profiles:
+
+- `base.config`
+  - Included by the pipeline by default
+  - Generous resource allocations using labels
+  - Does not specify any method for software dependencies and expects software to be available (or specified elsewhere)
+- `igenomes.config`
+  - Included by the pipeline by default
+  - Default configuration to access reference files stored on AWS iGenomes
+- `modules.config`
+  - Included by the pipeline by default
+  - Module-specific configuration options (both mandatory and optional)
+- `test.config`
+  - Only included if specified as a profile
+  - A configuration profile to test the pipeline with a small test dataset
+- `test_full.config`
+  - Only included if specified as a profile
+  - A configuration profile to test the pipeline with a full-size test dataset
+
+:::note
+Some configuration files contain the definition of profiles that can be flexibly applied. For example, the `docker`, `singularity`, and `conda` profiles are defined in the `nextflow.config` file in the pipeline project directory. You should not need to manually edit any of these configuration files.
+:::
+
+Profiles are sets of configuration options that can be flexibly applied to a pipeline.
+They are also commonly defined in the `nextflow.config` file in the root of the pipeline directory.
+
+Profiles that come with nf-core pipelines can be broadly categorized into two groups:
+
+- Software management profiles
+  - Profiles for the management of software dependencies using container or environment management tools, for example, `docker`, `singularity`, and `conda`.
+- Test profiles
+  - Profiles to execute the pipeline with a standardized set of test data and parameters, for example, `test` and `test_full`.
+
+nf-core pipelines are required to define software containers and environments that can be activated using profiles. Although it is possible to run the pipelines with software installed by other methods (e.g., environment modules or manual installation), using container technology is more sharable, convenient, and reproducible.
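+
+As a rough illustration of what a profile definition looks like, the sketch below shows a simplified `profiles` block in Nextflow config syntax. It is not copied from any nf-core pipeline; real pipelines set additional options in each profile.
+
+```groovy
+// Simplified sketch only: real nf-core profiles contain more settings.
+profiles {
+    docker {
+        // Run each process inside its Docker container
+        docker.enabled = true
+    }
+    singularity {
+        // Run each process inside its Singularity image
+        singularity.enabled    = true
+        singularity.autoMounts = true
+    }
+}
+```
+
+A profile defined this way is selected at run time with the `-profile` option, for example `-profile docker`.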
+
+### Shared configuration files
+
+nf-core pipelines can also load custom institutional profiles that have been submitted to the [nf-core config repository](https://github.com/nf-core/configs). At run time, nf-core pipelines will fetch these configuration profiles from the [nf-core config repository](https://github.com/nf-core/configs) and make them available.
+
+For shared resources such as an HPC cluster, you may consider developing a shared institutional profile.
+
+Follow [this tutorial](https://nf-co.re/docs/usage/tutorials/step_by_step_institutional_profile) to set up your own institutional profile.
+
+### Custom parameter and configuration files
+
+Nextflow will look for files that are external to the pipeline project directory. These files include:
+
+- The config file `$HOME/.nextflow/config`
+- A config file named `nextflow.config` in your current directory
+- Custom configuration files specified using the command line
+  - A parameter file that is provided using the `-params-file` option
+  - A config file that is provided using the `-c` option
+
+**Parameter file format**
+
+Parameter files are `.json` files that can contain an unlimited number of parameters:
+
+```json title="nf-params.json"
+{
+  "<parameter1_name>": 1,
+  "<parameter2_name>": "<string>",
+  "<parameter3_name>": true
+}
+```
+
+You can override default parameters by creating a `.json` file and passing it as a command-line argument using the `-params-file` option.
+
+```bash
+nextflow run nf-core/rnaseq -profile docker --input <path/to/input> --outdir <results> -params-file <path/to/nf-params.json>
+```
+
+**Configuration file format**
+
+Configuration files are `.config` files that can contain various pipeline properties and can be passed to Nextflow using the `-c` option in your execution command:
+
+```bash
+nextflow run nf-core/rnaseq -profile docker --input <path/to/input> --outdir <results> -c <path/to/custom.config>
+```
+
+Custom configuration files use the same format as the configuration file included in the pipeline directory.
+
+Configuration properties are organized into [scopes](https://www.nextflow.io/docs/latest/config.html#config-scopes) by dot prefixing the property names with a scope identifier or grouping the properties in the same scope using the curly brackets notation. For example:
+
+```groovy
+alpha.x  = 1
+alpha.y  = 'string value'
+```
+
+Is equivalent to:
+
+```groovy
+alpha {
+    x = 1
+    y = 'string value'
+}
+```
+
+[Scopes](https://www.nextflow.io/docs/latest/config.html#config-scopes) allow you to quickly configure settings required to deploy a pipeline on different infrastructure using different software management.
+
+A common scenario is for users to write a custom configuration file specific to running a pipeline on their infrastructure.
+
+:::warning
+Do not use `-c <file>` to specify parameters as this will result in errors. Custom config files specified with `-c` must only be used for tuning process resource specifications, other infrastructural tweaks (such as output directories), or module arguments (`args`).
+:::
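+
+A minimal sketch of a custom config file that only tunes process resources might look like the following. The process name `FASTQC` and the resource values are assumptions chosen for illustration; substitute a process name from the pipeline you are running.
+
+```groovy
+// custom.config: resource tuning only, no pipeline parameters.
+process {
+    // 'FASTQC' is a hypothetical process name used for illustration
+    withName: 'FASTQC' {
+        cpus   = 4
+        memory = 16.GB
+        time   = 4.h
+    }
+}
+```
+
+A file like this would be passed with `-c <path/to/custom.config>`, while pipeline parameters stay on the command line or in a `-params-file`.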
+
+Multiple scopes can be included in the same `.config` file using a mix of dot prefixes and curly brackets.
+
+```groovy
+executor.name = "sge"
+
+singularity {
+    enabled    = true
+    autoMounts = true
+}
+```
+
+See the [Nextflow documentation](https://www.nextflow.io/docs/latest/config.html#config-scopes) for a full list of scopes.
diff --git a/sites/docs/src/content/docs/guides/configuration/modify_tools.md b/sites/docs/src/content/docs/guides/configuration/modify_tools.md
@@ -0,0 +1,125 @@
+---
+title: Modifying pipelines
+subtitle: Configure tool containers and arguments
+shortTitle: Modifying pipelines
+weight: 3
+---
+
+## Modifying tools
+
+Each tool in an nf-core pipeline comes preconfigured with a set of arguments for an average user.
+These arguments are a great place to start and have been tested as a part of the development process.
+You can normally change the default settings using pipeline parameters with the double-dash notation, e.g., `--input`.
+However, you may want to modify the tool arguments themselves to fit your own purposes.
+
+It is **very unlikely** that you will need to edit the pipeline code to configure a tool.
+
+### Tool arguments
+
+You may wish to understand which tool arguments a pipeline uses, or to update or add arguments not currently supported by a pipeline.
+
+You can sometimes find out which parameters are used by a tool by checking the longer 'help' description of different pipeline parameters, e.g., by pressing the 'help' button next to [this parameter](https://nf-co.re/funcscan/1.0.1/parameters#annotation_bakta_mincontig) in [nf-core/funcscan](https://nf-co.re/funcscan).
+
+There are two main places that a tool can have a tool argument specified:
+
+- The process `script` block
+- The `conf/modules.config` file
+
+Most arguments (both mandatory and optional) are defined in the `conf/modules.config` file under the `ext.args` entry. Arguments that are defined in the `conf/modules.config` file can be flexibly modified using custom configuration files.
+
+Arguments specified in `ext.args` are then inserted into the module itself via the `$args` variable in the module's bash code.
+
+For example, the `-n` parameter could be added to the `BOWTIE_BUILD` process:
+
+```groovy
+process {
+    withName: BOWTIE_BUILD {
+        ext.args = "-n 0.1"
+    }
+}
+```
+
+Updated tools may come with major changes and may break a pipeline and/or create missing values in MultiQC version tables.
+
+:::warning
+Such changes come with no warranty or support by the pipeline developers!
+:::
+
+### Changing tool versions
+
+You can tell the pipeline to use a different container image within a config file and the `process` scope.
+
+You then need to identify the `process` name and override the Nextflow `container` or `conda` definition using the `withName` process selector.
+
+For example, the [nf-core/viralrecon](https://nf-co.re/viralrecon) pipeline uses a tool called Pangolin that updates an internal database of COVID-19 lineages quite frequently.
+
+To update the container specification, you can do the following steps:
+
+1. Check the default version used by the pipeline in the module file for the tool under the `modules/nf-core/` directory of the pipeline. For example, see the module file for [Pangolin](https://github.com/nf-core/viralrecon/blob/a85d5969f9025409e3618d6c280ef15ce417df65/modules/nf-core/software/pangolin/main.nf#L14-L19).
+2. Find the latest version of the Biocontainer available on [quay.io](https://quay.io/repository/biocontainers/pangolin?tag=latest&tab=tags) for Docker or [Galaxy Project](https://depot.galaxyproject.org/singularity/) for Singularity.
+   - Note that the container version tag is identical for both container systems, but must include the 'build' ID (e.g., `--pyhdfd78af_1`).
+3. Create the custom config accordingly:
+
+   - For Docker:
+
+     ```groovy
+     process {
+         withName: PANGOLIN {
+             container = 'quay.io/biocontainers/pangolin:3.1.17--pyhdfd78af_1'
+         }
+     }
+     ```
+
+   - For Singularity:
+
+     ```groovy
+     process {
+         withName: PANGOLIN {
+             container = 'https://depot.galaxyproject.org/singularity/pangolin:3.1.17--pyhdfd78af_1'
+         }
+     }
+     ```
+
+   - For Conda:
+
+     ```groovy
+     process {
+         withName: PANGOLIN {
+             conda = 'bioconda::pangolin=3.1.17'
+         }
+     }
+     ```
+
+:::warning
+Updated tools may come with major changes and may break a pipeline and/or create missing values in MultiQC version tables. Such changes come with no warranty or support by the pipeline developers.
+:::
+
+### Docker registries
+
+nf-core pipelines use `quay.io` as the default Docker registry for Docker and Podman images.
+When a Docker container is specified, the image is pulled from `quay.io` unless a full URI is given. For example, if the process container is:
+
+```bash
+biocontainers/fastqc:0.11.7--4
+```
+
+The image will be pulled from quay.io by default, resulting in a full URI of:
+
+```bash
+quay.io/biocontainers/fastqc:0.11.7--4
+```
+
+If `docker.registry` is specified, it will be used first. For example, if the config value `docker.registry = 'public.ecr.aws'` is specified, the image will be pulled from:
+
+```bash
+public.ecr.aws/biocontainers/fastqc:0.11.7--4
+```
+
+However, the `docker.registry` setting will be ignored if you specify a full URI:
+
+```bash
+docker.io/biocontainers/fastqc:v0.11.9_cv8
+```
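+
+As a sketch, the registry override described above could be placed in a custom config file passed with `-c`. The registry value is the one used in the example above; whether a given image is actually available there is an assumption you would need to check.
+
+```groovy
+// Assumes the required images exist in the alternative registry.
+docker {
+    enabled  = true
+    registry = 'public.ecr.aws'
+}
+```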
+
+:::warning
+Updated registries may come with unexpected changes and come with no warranty or support by the pipeline developers.
+:::
diff --git a/sites/docs/src/content/docs/guides/configuration/running_offline.md b/sites/docs/src/content/docs/guides/configuration/running_offline.md
@@ -0,0 +1,72 @@
+---
+title: Running offline
+subtitle: Run nf-core pipelines offline
+shortTitle: Running offline
+weight: 4
+---
+
+## Running offline
+
+When Nextflow is connected to the internet it will fetch nearly everything it needs to run a pipeline. Nextflow can also run analysis on an offline system that has no internet connection. However, there are a few extra steps that are required to get everything you will need locally.
+
+To run a pipeline offline you will need three things:
+
+- [Nextflow](#nextflow)
+- [Pipeline assets](#pipeline-assets)
+- [Reference genomes](#reference-genomes) _(if required)_
+
+These will first need to be fetched on a system that _does_ have an internet connection and transferred to your offline system.
+
+### Nextflow
+
+You need to have Nextflow installed on your local system.
+You can do this by installing Nextflow on a machine that _does_ have an internet connection and transferring it to the offline system:
+
+1. [Install Nextflow locally](/docs/usage/quick_start/installation.md)
+   :::warning
+   Do _not_ use the `-all` package, as this does not allow the use of custom plugins.
+   :::
+2. Run a Nextflow pipeline locally so that Nextflow fetches the required plugins.
+   - It does not need to run to completion.
+3. Copy the Nextflow executable and your `$HOME/.nextflow` folder to your offline environment
+4. Specify the name and version of each plugin that you downloaded in a local Nextflow configuration file
+   - This will prevent Nextflow from trying to download newer versions of plugins.
+5. Set `export NXF_OFFLINE='true'` in your terminal
+   - To set this permanently, add this command to your shell configuration file (e.g., `~/.bashrc` or `~/.zshrc`)
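+
+For step 4, plugin versions can be pinned in a local Nextflow configuration file using the `plugins` scope. The plugin name and version below are placeholders for illustration; use the names and versions that were actually downloaded into your `$HOME/.nextflow/plugins` folder.
+
+```groovy
+// Placeholder plugin and version: replace with the ones you downloaded.
+plugins {
+    id 'nf-validation@1.1.3'
+}
+```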
+
+### Pipeline assets
+
+To run a pipeline offline, you next need the pipeline code, the software dependencies, and the shared nf-core/configs profiles.
+We have created a helper tool as part of the _nf-core_ package to automate this for you.
+
+On a computer with an internet connection, run `nf-core download <pipeline>` to download the pipeline and config profiles.
+Add the argument `--container singularity` to also fetch the singularity container(s). Note that only singularity is supported.
+
+The pipeline and requirements will be downloaded, configured with their relative paths, and packaged into a `.tar.gz` file by default.
+This can then be transferred to your offline system and unpacked.
+
+Inside, you will see directories called `workflow` (the pipeline files), `config` (a copy of [nf-core/configs](https://github.com/nf-core/configs)), and (if you used `--container singularity`) a directory called `singularity`.
+The pipeline code is adjusted by the download tool to expect these relative paths, so as long as you keep them together it should work as is.
+
+### Shared storage
+
+If you are downloading _directly_ to the offline storage (e.g., a head node with internet access whilst compute nodes are offline), you can use the `--singularity-cache-only` option for `nf-core download` and set the `$NXF_SINGULARITY_CACHEDIR` environment variable.
+This downloads the singularity images to the `$NXF_SINGULARITY_CACHEDIR` folder and does not copy them into the target downloaded pipeline folder.
+This reduces total disk space usage and is faster.
+
+See the [documentation for `nf-core download`](/docs/nf-core-tools/pipelines/download) for more information.
+
+### Reference genomes
+
+Some pipelines require reference genomes and have built-in integration with AWS-iGenomes.
+If you wish to use these references, you must also download and transfer them to your offline system.
+
+Follow the [reference genomes documentation](/docs/usage/reference_genomes/reference_genomes.md) to configure the base path for the references.
+
+### Bytesize talk
+
+Here is a recent bytesize talk explaining the necessary steps to run pipelines offline.
+
+<!-- markdownlint-disable -->
+<iframe width="560" height="315" src="https://www.youtube.com/embed/N1rRr4J0Lps" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>
+<!-- markdownlint-restore -->