kallisto task for RNA-seq

This task provides a convenience wrapper around kallisto. It can compute gene quantifications from reads in FASTQ format or build a kallisto index from transcript sequences in FASTA format.

Running

We encourage running this task as a Docker image, which is publicly available through GitHub packages. To pull the image, first install Docker, then run

docker pull docker.pkg.github.com/krews-community/rnaseq-kallisto-task/rnaseq-kallisto:latest

The task requires a pre-built kallisto index to be provided. A kallisto index can be built from one or more FASTA files with gene or transcript sequences by running this task:

docker run \
    --volume /path/to/inputs:/input \
    --volume /path/to/outputs:/output \
    docker.pkg.github.com/krews-community/rnaseq-kallisto-task/rnaseq-kallisto:latest \
    java -jar /app/kallisto.jar index \
        --sequence /input/seq1.fa --sequence /input/seq2.fa \
        --output /output/index.idx

Then, to perform quantifications, simply run:

docker run \
    --volume /path/to/inputs:/input \
    --volume /path/to/outputs:/output \
    docker.pkg.github.com/krews-community/rnaseq-kallisto-task/rnaseq-kallisto:latest \
    java -jar /app/kallisto.jar quant \
        --r1 /input/r1.fastq.gz --r2 /input/r2.fastq.gz \
        --index /input/index.idx --output-directory /output

This will produce an output directory containing quantifications (output.abundance.tsv).

Parameters

Indexing:

name	description	default
sequence	path to input FASTA sequences; many can be provided	(required)
output	path to output file; if it exists, it will be overwritten	(required)
kmer-size	the kmer size to use	31
make-unique	if set, sequences with duplicate names are made unique	false

Quantification:

name	description	default
r1	path to single-end or pair 1 reads in FASTQ format	(required)
r2	path to pair 2 reads in FASTQ format	(none)
output-directory	directory in which to write output	(required)
strandedness	set if reads are from one strand; forward, reverse, or unstranded	unstranded
output-prefix	prefix to use when naming output files	output
fragment-length	fragment length, required for single-end reads	(none)
sd-fragment-length	SD of fragment length, required for single-end reads	(none)
cores	number of cores available to the task	1
ram-gb	gigabytes of RAM available to the task	16

For developers

The task provides integrated unit testing, which quantifies a small number of reads against a human mitochondrial index and checks that the outputs match expected values. To run the tests, first install Docker and Java, then clone this repo, then run

scripts/test.sh

Contributions to the code are welcome via pull request.

Name		Name	Last commit message	Last commit date
Latest commit History 7 Commits
.github/workflows		.github/workflows
gradle/wrapper		gradle/wrapper
scripts		scripts
src		src
.dockerignore		.dockerignore
.gitignore		.gitignore
Dockerfile		Dockerfile
README.md		README.md
build.gradle.kts		build.gradle.kts
gradlew		gradlew
gradlew.bat		gradlew.bat

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

kallisto task for RNA-seq

Running

Parameters

For developers

About

Releases 2

Packages

Contributors 2

Languages

weng-lab/rnaseq-kallisto-task

Folders and files

Latest commit

History

Repository files navigation

kallisto task for RNA-seq

Running

Parameters

For developers

About

Topics

Resources

Stars

Watchers

Forks

Releases 2

Packages 0

Contributors 2

Languages

Packages