A lightweight library for working with PAF (Pairwise mApping Format) files.
Documentation: https://mbh.sh/pafpy
pip install pafpy
conda install -c bioconda pafpy
If you would like to install locally, the recommended way is using poetry.
git clone https://github.com/mbhall88/pafpy.git
cd pafpy
make install
# to check the library is installed run
poetry run python -c "from pafpy import PafRecord;print(str(PafRecord()))"
# you should see an (unmapped) PAF record printed to the terminal
# you can also run the tests if you like
make test-code
For full usage, please refer to the documentation. If there is any functionality you feel is missing or would make pafpy more user-friendly, please raise an issue with a feature request.
In the basic usage pattern below, we collect the BLAST identity of all primary alignments in our PAF file into a list.
from typing import List
from pafpy import PafFile
path = "path/to/sample.paf"
identities: List[float] = []
with PafFile(path) as paf:
    for record in paf:
        if record.is_primary():
            identity = record.blast_identity()
            identities.append(identity)
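To make clear what is being collected, here is a minimal sketch that computes the same quantity by hand from raw PAF lines, without pafpy. It assumes the usual definition of BLAST identity (number of matching residues divided by alignment block length, columns 10 and 11) and that primary alignments carry minimap2's `tp:A:P` tag. The function name `primary_identity` and the sample lines are made up for illustration, and pafpy's own handling of edge cases may differ.

```python
from typing import List, Optional


def primary_identity(line: str) -> Optional[float]:
    """Return the BLAST identity of a primary alignment line, else None.

    Hypothetical helper for illustration; not part of pafpy.
    """
    fields = line.rstrip("\n").split("\t")
    # Columns 10 and 11 (1-based): matching residues and alignment block length.
    matches = int(fields[9])
    block_len = int(fields[10])
    # Optional tags follow the 12 mandatory columns.
    # Assumption: primary alignments are marked with minimap2's tp:A:P tag.
    tags = fields[12:]
    if "tp:A:P" not in tags or block_len == 0:
        return None
    return matches / block_len


# Made-up sample records: one primary, one secondary alignment.
lines = [
    "read1\t1000\t100\t900\t+\tchr1\t50000\t2000\t2800\t760\t800\t60\ttp:A:P",
    "read1\t1000\t100\t400\t-\tchr2\t40000\t500\t800\t200\t300\t0\ttp:A:S",
]

identities: List[float] = []
for line in lines:
    identity = primary_identity(line)
    if identity is not None:
        identities.append(identity)
```

Only the first sample record is primary, so `identities` ends up with a single value, 760 / 800.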
Another use case is collecting the identifiers of all records aligned to a specific contig, keeping only the alignments where more than 50% of the query (read) is aligned.
from typing import List
from pafpy import PafFile
path = "path/to/sample.paf"
contig = "chr1"
min_covg = 0.5
identifiers: List[str] = []
with PafFile(path) as paf:
    for record in paf:
        if record.tname == contig and record.query_coverage > min_covg:
            identifiers.append(record.qname)
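For reference, the coverage filter can be sketched directly on raw PAF columns. This assumes query coverage is defined as the aligned span of the query divided by its total length, `(qend - qstart) / qlen`; the helper name `query_coverage_from_line` and the sample records are hypothetical and not part of pafpy.

```python
from typing import List


def query_coverage_from_line(line: str) -> float:
    """Fraction of the query covered by the alignment (hypothetical helper)."""
    fields = line.rstrip("\n").split("\t")
    # Columns 2-4 (1-based): query length, query start, query end.
    qlen, qstart, qend = int(fields[1]), int(fields[2]), int(fields[3])
    return (qend - qstart) / qlen


# Made-up sample records.
lines = [
    "read1\t1000\t100\t900\t+\tchr1\t50000\t2000\t2800\t760\t800\t60",
    "read2\t1000\t0\t300\t+\tchr1\t50000\t9000\t9300\t280\t300\t60",
    "read3\t1000\t0\t950\t-\tchr2\t40000\t100\t1050\t900\t950\t60",
]

contig = "chr1"
min_covg = 0.5
identifiers: List[str] = []
for line in lines:
    fields = line.split("\t")
    qname, tname = fields[0], fields[5]  # columns 1 and 6: query and target names
    if tname == contig and query_coverage_from_line(line) > min_covg:
        identifiers.append(qname)
```

Here `read2` is dropped because only 30% of it is aligned, and `read3` is dropped because it maps to a different contig.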
If you would like to contribute to pafpy, check out CONTRIBUTING.md.