@@ -15,9 +15,14 @@ numbers reported in the two papers
```
and the arXiv paper:
```
@article{JDJ17,
  author = {Jeff Johnson and Matthijs Douze and Herv{\'e} J{\'e}gou},
  journal = {arXiv preprint arXiv:XXXXXX},
  title = {Billion-scale similarity search with GPUs},
  year = {2017},
}
```
Note that the numbers (especially timings) may differ slightly from those reported in the papers, due to improvements in the implementation, different machines, etc.
...
...
@@ -218,7 +223,7 @@ The run produces two warnings:
- the clustering complains that it does not have enough training data; there is not much we can do about this (a short illustration follows the list).
- the add() function complains that there is an inefficient memory allocation, but this is a concern only when it happens often, and we are not benchmarking the add time anyway.
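
To see when the first warning fires, here is a minimal illustration (not taken from the benchmark scripts): random vectors stand in for real data, and a small `IndexIVFFlat` is used for brevity. The Faiss k-means asks for roughly 39 training vectors per centroid by default (`min_points_per_centroid`), so an undersized training set triggers the message while a larger one does not.

```
import numpy as np
import faiss

d = 64
nlist = 1024                                  # number of IVF lists (= k-means centroids)

# with nlist = 1024, the clustering wants on the order of 39 * nlist ~= 40k training
# points; fewer still works, but prints the "not enough training data" warning
xt_small = np.random.rand(10_000, d).astype('float32')
xt_ok = np.random.rand(40_000, d).astype('float32')

quantizer1 = faiss.IndexFlatL2(d)
index1 = faiss.IndexIVFFlat(quantizer1, d, nlist)
index1.train(xt_small)                        # warns

quantizer2 = faiss.IndexFlatL2(d)
index2 = faiss.IndexIVFFlat(quantizer2, d, nlist)
index2.train(xt_ok)                           # trains silently
```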
### Clustering on MNIST8m
...
...
@@ -238,6 +243,32 @@ total runtime: 140.615 s
### Search on SIFT1B
The script [`bench_gpu_1bn.py`] runs multi-GPU searches on the two 1-billion vector datasets we considered. It is more complex than the previous scripts because it supports many search options and decomposes the dataset build process in Python to exploit the best possible CPU/GPU parallelism and GPU distribution.
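For orientation, here is a minimal multi-GPU search sketch in the same spirit; it is not the logic of `bench_gpu_1bn.py` itself. Random vectors stand in for SIFT1B, the index is far smaller, and the build uses the stock `index_cpu_to_all_gpus` helper rather than the script's own build pipeline; `nprobe` is set on the CPU index before cloning on the assumption that the cloner carries it over.

```
import numpy as np
import faiss

d = 128                                             # SIFT vectors are 128-dimensional
xb = np.random.rand(100_000, d).astype('float32')   # stand-in for the database vectors
xq = np.random.rand(1_000, d).astype('float32')     # stand-in for the query vectors

# build and train a (much smaller) IVF index on the CPU
index = faiss.index_factory(d, "IVF1024,Flat")
index.train(xb)
index.nprobe = 16                                   # inverted lists visited per query

# spread the index over all visible GPUs, sharding the database rather than replicating it
co = faiss.GpuMultipleClonerOptions()
co.shard = True
gpu_index = faiss.index_cpu_to_all_gpus(index, co)

gpu_index.add(xb)
D, I = gpu_index.search(xq, 10)                     # top-10 neighbors per query
```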
The search results on SIFT1B in the "GPU paper" can be obtained with