Dynamic Approximate Nearest Neighbour Benchmarks

Install the framework

Unpack the code and set up a Python environment

> tar xvzf dyann.tar.gz
> cd dyann
> /apps/python/3.9.X/bin/python3 -m venv env
> source env/bin/activate
> pip install -r requirements.txt --no-cache-dir

Running the code

Quick test

> python download.py data=[datacol_quick]
> python run.py data=[datacol_quick] algo=[linear,hnsw]
> python plot-pareto.py data=[datacol_quick] algo=[linear,hnsw]
> python plot-algo.py data=[datacol_quick] algo=[hnsw]

Preload all datasets and pregenerate all ground truth (this can take hours; ensure at least 30 GB of free disk space)

> python download.py data=[datacol,datacol_lerp,datacol_efreq,datacol_esfreq]
> python download.py data=[featlearn,featlearn_lerp,featlearn_efreq,featlearn_esfreq]

Generate all benchmarking results (can easily take days or weeks, best run in parallel with a job scheduler)

> python run.py data=[datacol,datacol_lerp,datacol_efreq,datacol_esfreq] algo=[linear,annoy,hnsw,ivfpq,scann,kdtree]
> python run.py data=[featlearn,featlearn_lerp,featlearn_efreq,featlearn_esfreq] algo=[linear,annoy,hnsw,ivfpq,scann,kdtree]
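
Since the full sweep is best run in parallel, one way to prepare it is to split it into one command per dataset configuration and algorithm, then hand each line to whatever job scheduler is available. The sketch below only prints the commands (a dry run). It assumes that `run.py` can be invoked independently for each pair and that the results of separate runs can later be plotted together; the source does not confirm either point. The helper name `list_jobs` is hypothetical.

```shell
# Dry run: print one run.py command per (dataset configuration, algorithm) pair.
# Assumption: each pair can be benchmarked by a separate run.py invocation.
datasets="datacol datacol_lerp datacol_efreq datacol_esfreq featlearn featlearn_lerp featlearn_efreq featlearn_esfreq"
algos="linear annoy hnsw ivfpq scann kdtree"

list_jobs() {
    for d in $datasets; do
        for a in $algos; do
            # Single quotes in the output keep the shell from treating [..] as a glob pattern.
            echo "python run.py 'data=[$d]' 'algo=[$a]'"
        done
    done
}

list_jobs
```

Each printed line can be submitted as its own job. Quoting the bracketed arguments is a precaution: some shells try to expand an unquoted `[...]` as a filename pattern.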

Adding new datasets

A template file for new datasets is provided at ./dyann/data/template.py

Usage Instructions:

Copy template.py and change the filename and class name for your new dataset
Update ./dyann/data/proxy.py to include the names you have chosen
Fill in each of the TODO items (refer to existing datasets for hints if needed)
Create any number of configuration sets in ./conf/data/ with the name property set to this filename and the scale property providing an optional parameter sweep
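
The first and third steps above can be sketched as a small shell helper that copies the template and lists the TODO items still to be filled in. The helper name `new_from_template` and the example module name `mydata` are hypothetical; the helper assumes it is run from the repository root and that the TODO items appear as the literal text `TODO` in the template. Updating proxy.py and creating the configuration sets remain manual steps.

```shell
# Hypothetical helper: copy a template and list its TODO markers.
# kind is the subdirectory of ./dyann ("data" here); name is the new module name.
new_from_template() {
    kind="$1"
    name="$2"
    src="./dyann/$kind/template.py"
    dst="./dyann/$kind/$name.py"

    # Refuse to run outside the repository root or to overwrite existing work.
    if [ ! -f "$src" ]; then
        echo "missing $src (run from the repository root)" >&2
        return 1
    fi
    if [ -e "$dst" ]; then
        echo "$dst already exists" >&2
        return 1
    fi

    cp "$src" "$dst"
    # Print each remaining TODO with its line number.
    grep -n "TODO" "$dst" || echo "no TODO markers found in $dst"
}
```

For example, `new_from_template data mydata` would create ./dyann/data/mydata.py. The same pattern should apply to ./dyann/algo/template.py by passing `algo` as the first argument.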

Adding new ANN algorithms

A template file for new ANN algorithms is provided at ./dyann/algo/template.py

Usage Instructions:

Copy template.py and change the filename and class name for your new ANN algorithm
Update ./dyann/algo/proxy.py to include the names you have chosen
Fill in each of the TODO items (refer to existing algorithms for hints if needed)
Create both the build and search configuration files in ./conf/algo/ with the name property set to this filename; the lists of parameters for the build and query properties will be swept

data61/DyANN

Folders and files

Latest commit

History

Repository files navigation

Dynamic Approximate Nearest Neighbour Benchmarks

Install the framework

Running the code

Adding new datasets

Adding new ANN algorithms

Benchmarks for Static ANN

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages