clkhash is a Python implementation of cryptographic linkage key (CLK) hashing, as described by Rainer Schnell, Tobias Bachteler, and Jörg Reiher in "A Novel Error-Tolerant Anonymous Linking Code".
Install clkhash with all dependencies using pip:
pip install clkhash
Documentation is available at https://clkhash.readthedocs.io
To hash a CSV file of entities using the default schema:
from clkhash import clk, randomnames
fake_pii_schema = randomnames.NameList.SCHEMA
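# Open the CSV inside a context manager so the file handle is closed after hashing.
with open('fake-pii-out.csv', 'r') as csv_file:
    clks = clk.generate_clk_from_csv(csv_file, 'secret', fake_pii_schema)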
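To give a feel for what a CLK is, the following is a minimal, standard-library-only sketch of the general idea: each field is split into bigrams, every bigram is hashed with a secret key, and the resulting positions are set in a fixed-size Bloom filter. It is not clkhash's implementation. The names ``bigrams``, ``toy_clk`` and ``dice``, the parameters ``num_bits`` and ``num_hashes``, and the choice of HMAC-SHA1/HMAC-MD5 double hashing are all assumptions made for illustration; clkhash's real encoding is driven by the schema and differs in detail.

```python
import hashlib
import hmac


def bigrams(value):
    """Split a string into overlapping 2-grams, padded with one space each side."""
    padded = f" {value.lower()} "
    return [padded[i:i + 2] for i in range(len(padded) - 1)]


def toy_clk(fields, secret, num_bits=1024, num_hashes=20):
    """Encode a record as the set of bit positions set in a toy Bloom filter.

    Hypothetical helper for illustration only, not part of clkhash.
    """
    key = secret.encode("utf-8")
    bits = set()
    for field in fields:
        for gram in bigrams(field):
            data = gram.encode("utf-8")
            h1 = int.from_bytes(hmac.new(key, data, hashlib.sha1).digest(), "big")
            h2 = int.from_bytes(hmac.new(key, data, hashlib.md5).digest(), "big")
            # Double hashing: derive num_hashes positions from two keyed digests.
            for i in range(num_hashes):
                bits.add((h1 + i * h2) % num_bits)
    return bits


def dice(a, b):
    """Sørensen-Dice coefficient of two sets of bit positions."""
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


# A one-letter typo changes only a few bigrams, so most set bits are shared.
original = toy_clk(["Jonathan Smith", "1980/02/14"], "secret")
typo = toy_clk(["Jonathon Smith", "1980/02/14"], "secret")
other = toy_clk(["Maria Garcia", "1975/11/30"], "secret")

print("typo vs original:", dice(original, typo))
print("other vs original:", dice(original, other))
```

Because similar records share most of their bigrams, their encodings overlap heavily, which is what makes the scheme error-tolerant; two parties can only compare encodings if they used the same secret.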
See Anonlink Client for a command-line interface to clkhash.
Clkhash and the wider Anonlink project are designed, developed, and supported by CSIRO's Data61. If you use any part of this library in your research, please cite it using the following BibTeX entry::
    @misc{Anonlink,
      author = {CSIRO's Data61},
      title = {Anonlink Private Record Linkage System},
      year = {2017},
      publisher = {GitHub},
      journal = {GitHub Repository},
      howpublished = {\url{https://github.com/data61/clkhash}},
    }