APAlyzer utilizes the PAS (polyadenylation sites) collection in the PolyA_DB database to examine APA (alternative polyadenylation) events in all genomic regions, including 3′UTRs and introns.


APAlyzer method workflow run instructions

Input file

Required input files are specified in the sample sheet config/samples.csv. Each row in the sample sheet has three columns:

  • condition: name of the condition (e.g. control)
  • sample: name of the sample (e.g. control_replicate1)
  • bam: path to the BAM input file for the sample, given either relative to the APAlyzer working directory or as an absolute path

Samples of the same condition must carry exactly the same name in the condition column, because APAlyzer groups samples by condition for processing.
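As a quick sanity check before running the workflow, the sample sheet can be inspected with a short script. The sketch below assumes the sheet is a comma-separated file with a header row naming the three columns described above. The function name `load_samples` and the example rows are hypothetical and not part of APAlyzer or APAeval.

```python
import csv
import io
from collections import defaultdict

REQUIRED_COLUMNS = ("condition", "sample", "bam")


def load_samples(handle):
    """Group (sample, bam) pairs by their exact condition name."""
    reader = csv.DictReader(handle)
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"sample sheet is missing columns: {missing}")
    groups = defaultdict(list)
    for row in reader:
        # Conditions are compared exactly, so "control" and "Control" differ.
        groups[row["condition"]].append((row["sample"], row["bam"]))
    return dict(groups)


# Hypothetical sheet contents, for illustration only.
EXAMPLE_SHEET = """condition,sample,bam
control,control_replicate1,bams/control_replicate1.bam
control,control_replicate2,bams/control_replicate2.bam
treatment,treatment_replicate1,bams/treatment_replicate1.bam
"""

if __name__ == "__main__":
    for condition, samples in load_samples(io.StringIO(EXAMPLE_SHEET)).items():
        print(condition, [name for name, _ in samples])
```

A condition that appears with only one sample, or under two slightly different spellings, is usually a sign of a typo in the condition column.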

Setting parameters in the config file

Parameters used to run APAlyzer are specified in config/config.APAlyzer.yaml. In the config file, users are able to specify the output directory and output file name: out_dir, differential_output_file.

In addition, the relative path from the working directory to the sample sheet from the previous step is specified with the parameter sample_file.

Other parameters that should be set for each run are the path to the GTF annotation file and the organism, genome version, and Ensembl version of that annotation: gtf, gtf_organism, gtf_genome_version, gtf_ensemble_version.
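The parameters named above can be checked for completeness before a run. This is a minimal sketch that assumes the config file has already been loaded into a Python dictionary (for example with PyYAML's `yaml.safe_load`, omitted here to keep the example dependency-free). The function name `missing_parameters` and all example values are hypothetical.

```python
# Parameter names taken from the text above; whether each one is strictly
# required by the workflow is an assumption of this sketch.
EXPECTED_KEYS = (
    "out_dir",
    "differential_output_file",
    "sample_file",
    "gtf",
    "gtf_organism",
    "gtf_genome_version",
    "gtf_ensemble_version",
)


def missing_parameters(config):
    """Return expected keys that are absent or empty in the config dict."""
    return [key for key in EXPECTED_KEYS if config.get(key) in (None, "")]


# Hypothetical values, for illustration only.
example_config = {
    "out_dir": "results",
    "differential_output_file": "differential.tsv",
    "sample_file": "config/samples.csv",
    "gtf": "annotation.gtf",
    "gtf_organism": "example_organism",
    "gtf_genome_version": "example_genome",
    "gtf_ensemble_version": "example_version",
}

if __name__ == "__main__":
    print(missing_parameters(example_config))
```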

Setting up the environment

To run the method workflow, first activate the apaeval conda environment by following the instructions in the APAeval README.

Running the workflow

Before running, you can perform a 'dry run' to check which steps will be run and where output files will be generated given the provided parameters and input sample file:


To run the workflow locally, you can use the provided wrapper script, which executes the workflow with Singularity.


Note: The script is currently set up to run with the APAeval test data. If you have specified absolute paths in your sample sheet (e.g. config/samples.csv) or the config file (config/config.APAlyzer.yaml), or have input data that is not in the current directory, you will need to modify the Singularity bind arguments so that the input files are available to the container.

For example, suppose the path to the input GTF file is /share/annotation/annotation.gtf and the current working directory is /home/sam/DaPars2_snakemake/. Modify the --singularity-args line in the wrapper script as shown below to ensure the file is available to the container:

--singularity-args="--bind /share/" \
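When several inputs live outside the working directory, the directories to bind can be derived from the paths themselves. The sketch below is one possible way to do this, assuming that the working directory is already visible to the container (as the note above implies) and that only absolute paths need attention. It binds each file's parent directory, which is narrower than the /share/ example above. The helper names `bind_dirs` and `singularity_args` are hypothetical.

```python
from pathlib import PurePosixPath


def bind_dirs(paths, workdir):
    """Return parent directories of absolute paths that lie outside workdir."""
    workdir = PurePosixPath(workdir)
    dirs = []
    for path in map(PurePosixPath, paths):
        if not path.is_absolute():
            continue  # assumed to resolve inside the working directory
        if workdir in path.parents:
            continue  # already under the working directory
        parent = str(path.parent)
        if parent not in dirs:
            dirs.append(parent)
    return dirs


def singularity_args(paths, workdir):
    """Format the --singularity-args line; --bind accepts a comma-separated list."""
    dirs = bind_dirs(paths, workdir)
    if not dirs:
        return ""
    return '--singularity-args="--bind {}"'.format(",".join(dirs))


if __name__ == "__main__":
    inputs = ["/share/annotation/annotation.gtf", "bams/control_replicate1.bam"]
    print(singularity_args(inputs, "/home/sam/DaPars2_snakemake/"))
```

The printed line still has to be pasted into the wrapper script by hand; the sketch does not modify any file.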

Once you are satisfied with the bind arguments, you can run the workflow locally by executing the wrapper script with bash.

Output & post-processing

The output of APAlyzer qualifies for the differential challenge. The output is postprocessed into a TSV file with one column of gene IDs and one column of p-values. The file is written to the out_dir specified in the config file config/config.APAlyzer.yaml.
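The postprocessed file can be read back with a few lines of Python. This is a minimal sketch, assuming the gene ID is in the first tab-separated column and the p-value in the second. Rows whose second field is not numeric (a possible header line, or entries such as NA) are skipped. The function name `significant_genes`, the 0.05 threshold, and the example rows are hypothetical.

```python
import csv
import io


def significant_genes(handle, alpha=0.05):
    """Return gene ids whose p-value is below alpha, in file order."""
    hits = []
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) < 2:
            continue  # blank or malformed line
        try:
            pvalue = float(row[1])
        except ValueError:
            continue  # header line or non-numeric entry such as "NA"
        if pvalue < alpha:
            hits.append(row[0])
    return hits


# Hypothetical file contents, for illustration only.
EXAMPLE_TSV = "geneA\t0.001\ngeneB\t0.2\ngeneC\t0.04\ngeneD\tNA\n"

if __name__ == "__main__":
    print(significant_genes(io.StringIO(EXAMPLE_TSV)))
```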


The rulegraph gives an overview of the steps of the workflow. To obtain it, adapt and run the script. The current rulegraph is:


Author contact

If you have any questions or comments about APAlyzer, please contact Dr. Ruijia Wang ([email protected]).