q2-kmerizer

A QIIME 2 plugin for generating and working with kmers from biological sequence information.

Note: this plugin is in pre-release and under active development. The code should not be considered stable or suitable for publication-ready analyses.

Installation instructions

The easiest way to install q2-kmerizer is to add it to an existing installation of QIIME 2 (amplicon distribution, version 2024.10 or later). If you have the QIIME 2 amplicon distribution installed, activate that environment and run the following command to install q2-kmerizer into it:

pip install q2_kmerizer@git+https://github.com/bokulich-lab/q2-kmerizer.git@main

Then refresh your cache:

qiime dev refresh-cache

If the installation worked correctly, the following command should display a description of the plugin in your terminal:

qiime kmerizer --help

Installation of stable release

If you do not already have QIIME 2 installed, you can follow these instructions to install the QIIME 2 amplicon distribution as well as the latest stable version of q2-kmerizer.

Miniconda provides the conda environment and package manager, and is currently the only supported way to install QIIME 2. Follow the instructions for downloading and installing Miniconda.

After installing Miniconda and opening a new terminal, make sure you're running the latest version of conda:

conda update conda

Now use conda to install q2-kmerizer and QIIME 2:

conda env create -n kmerizer-stable --file https://raw.githubusercontent.com/bokulich-lab/q2-kmerizer/main/environments/q2-kmerizer-qiime2-amplicon-2024.10.yml

After this completes, activate the new environment you created by running:

conda activate kmerizer-stable

Then refresh your cache and test as shown above.

Install development version of `q2-kmerizer`

If you wish to use the development version of q2-kmerizer, e.g., to develop new features in your fork or to contribute to the main branch, follow these instructions.

First, you must have conda installed, as described above.

Next, clone the repository and move into the top-level q2-kmerizer directory. NOTE: make sure your current working directory is a location where you want to install this plugin!

git clone https://github.com/bokulich-lab/q2-kmerizer.git
cd q2-kmerizer

Then, run:

conda env create -n q2-kmerizer-dev --file ./environments/q2-kmerizer-qiime2-amplicon-2024.10.yml

After this completes, activate the new environment you created by running:

conda activate q2-kmerizer-dev

Finally, run:

make install

Then refresh your cache and test as shown above.

Examples

As an example, we will use data from Sampson et al., 2016, a study testing whether the fecal microbiome contributes to the development of Parkinson’s Disease (PD).

First we will download the test data:

wget https://data.qiime2.org/2024.10/tutorials/pd-mice/sample_metadata.tsv
wget https://docs.qiime2.org/2024.10/data/tutorials/pd-mice/dada2_table.qza
wget https://docs.qiime2.org/2024.10/data/tutorials/pd-mice/dada2_rep_set.qza

We can count kmer frequencies per sample with this command:

qiime kmerizer seqs-to-kmers \
    --i-sequences dada2_rep_set.qza \
    --i-table dada2_table.qza \
    --o-kmer-table kmer_table.qza \
    --p-max-features 5000
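
To make "kmer frequencies per sample" concrete, here is a minimal Python sketch of the idea. It is not the plugin's implementation: the function names, the toy data, and the choice to weight each kmer count by the frequency of its source sequence in the sample are all assumptions made for illustration, and the sketch ignores the `--p-max-features` limit.

```python
from collections import Counter


def count_kmers(sequence, k):
    """Count every overlapping substring of length k in one sequence."""
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


def kmers_per_sample(sequences, table, k):
    """Combine per-sequence kmer counts into per-sample kmer counts.

    sequences: {feature_id: sequence}
    table: {sample_id: {feature_id: frequency}}
    Each kmer count is weighted by the frequency of its source sequence
    in the sample (an assumption of this sketch, not a documented behavior).
    """
    per_sample = {}
    for sample_id, frequencies in table.items():
        totals = Counter()
        for feature_id, frequency in frequencies.items():
            for kmer, count in count_kmers(sequences[feature_id], k).items():
                totals[kmer] += count * frequency
        per_sample[sample_id] = totals
    return per_sample


# Hypothetical toy data, not taken from the tutorial dataset.
sequences = {"asv1": "ACGTAC", "asv2": "GTACGG"}
table = {"sampleA": {"asv1": 2, "asv2": 1}, "sampleB": {"asv2": 3}}

kmer_table = kmers_per_sample(sequences, table, k=3)
for sample_id, counts in kmer_table.items():
    print(sample_id, dict(counts))
```

The result has one row of kmer counts per sample, which is the same general shape as the frequency table written to kmer_table.qza.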

Or run this pipeline to count kmer frequencies, calculate diversity metrics, and create an interactive scatterplot with the results:

qiime kmerizer core-metrics \
    --i-sequences dada2_rep_set.qza \
    --i-table dada2_table.qza \
    --p-sampling-depth 1000 \
    --m-metadata-file sample_metadata.tsv \
    --p-color-by-group donor \
    --p-max-features 5000 \
    --output-dir core-metrics/
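
As a rough illustration of what a diversity metric on kmer counts looks like, the sketch below computes the Bray-Curtis dissimilarity between two kmer count profiles. Bray-Curtis is used here only as a familiar example; this README does not list which metrics the pipeline computes. The function name and toy profiles are hypothetical, and the two profiles were chosen to have equal totals so that no subsampling step is needed.

```python
def bray_curtis(counts_a, counts_b):
    """Bray-Curtis dissimilarity between two {kmer: count} profiles.

    Returns 0.0 for identical profiles and 1.0 for profiles that
    share no kmers.
    """
    kmers = set(counts_a) | set(counts_b)
    # Sum of the smaller count for each kmer across both samples.
    shared = sum(min(counts_a.get(kmer, 0), counts_b.get(kmer, 0)) for kmer in kmers)
    total = sum(counts_a.values()) + sum(counts_b.values())
    if total == 0:
        return 0.0  # two empty profiles are treated as identical
    return 1 - (2 * shared) / total


# Hypothetical kmer profiles for two samples (each sums to 8).
sample_a = {"ACG": 3, "CGT": 2, "GTA": 3}
sample_b = {"ACG": 1, "GTA": 3, "TTT": 4}

distance = bray_curtis(sample_a, sample_b)
print(f"Bray-Curtis dissimilarity: {distance:.2f}")
```

Because the metric is computed on kmer counts rather than on whole sequences, two samples with different but closely related sequences can still share many kmers and so appear more similar.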

Both of these actions output a frequency table that contains kmer counts per sample. This can be used like any other frequency table and passed to any action in QIIME 2 that accepts a frequency table (except for those that also require additional inputs that must match the features in the table, e.g., that require a taxonomy). For example, we can run a pipeline to train a Random Forest classifier and test on a hold-out subset of the dataset (note: this analysis is done purely for demonstrative purposes; the sample size in this test dataset is much smaller than would be required for a robust supervised learning analysis, and proper replicate handling should be done to avoid data leakage).

qiime sample-classifier classify-samples \
    --i-table kmer_table.qza \
    --m-metadata-file sample_metadata.tsv \
    --m-metadata-column donor \
    --output-dir sample-classifier/

About

The q2-kmerizer Python package was created from a template. To learn more about q2-kmerizer, refer to the project website. To learn how to use QIIME 2, refer to the QIIME 2 User Documentation. To learn QIIME 2 plugin development, refer to Developing with QIIME 2.

q2-kmerizer is a QIIME 2 plugin. For questions, comments, or feature requests about this plugin, please post in the Community Plugins category on the QIIME 2 Forum. The issue tracker on the GitHub repository is intended for use by the plugin developers and maintainers, not as a help forum.

Citation

If you use q2-kmerizer in your work, please cite the following article:

Bokulich, N.A. 2024. Integrating sequence composition information into microbial diversity analyses with k-mer frequency counting. bioRxiv 2024.08.13.607770; doi: https://doi.org/10.1101/2024.08.13.607770
