#include <IndexIVFPQ.h>
Public Member Functions | |
| IndexIVFPQ (Index *quantizer, size_t d, size_t nlist, size_t M, size_t nbits_per_idx) | |
| void | add_with_ids (idx_t n, const float *x, const long *xids=nullptr) override |
| void | encode_vectors (idx_t n, const float *x, const idx_t *list_nos, uint8_t *codes) const override |
| void | add_core_o (idx_t n, const float *x, const long *xids, float *residuals_2, const long *precomputed_idx=nullptr) |
| void | train_residual (idx_t n, const float *x) override |
| trains the product quantizer | |
| void | train_residual_o (idx_t n, const float *x, float *residuals_2) |
| same as train_residual, also output 2nd level residuals | |
| void | reconstruct_from_offset (long list_no, long offset, float *recons) const override |
| size_t | find_duplicates (idx_t *ids, size_t *lims) const |
| void | encode (long key, const float *x, uint8_t *code) const |
| void | encode_multiple (size_t n, long *keys, const float *x, uint8_t *codes, bool compute_keys=false) const |
| void | decode_multiple (size_t n, const long *keys, const uint8_t *xcodes, float *x) const |
| inverse of encode_multiple | |
| InvertedListScanner * | get_InvertedListScanner (bool store_pairs) const override |
| get a scanner for this index (store_pairs means ignore labels) | |
| void | precompute_table () |
| build precomputed table | |
Public Member Functions inherited from faiss::IndexIVF | |
| IndexIVF (Index *quantizer, size_t d, size_t nlist, size_t code_size, MetricType metric=METRIC_L2) | |
| void | reset () override |
| removes all elements from the database. | |
| void | train (idx_t n, const float *x) override |
| Trains the quantizer and calls train_residual to train sub-quantizers. | |
| void | add (idx_t n, const float *x) override |
| Calls add_with_ids with NULL ids. | |
| virtual void | search_preassigned (idx_t n, const float *x, idx_t k, const idx_t *assign, const float *centroid_dis, float *distances, idx_t *labels, bool store_pairs, const IVFSearchParameters *params=nullptr) const |
| virtual void | search (idx_t n, const float *x, idx_t k, float *distances, idx_t *labels) const override |
| void | reconstruct (idx_t key, float *recons) const override |
| void | reconstruct_n (idx_t i0, idx_t ni, float *recons) const override |
| void | search_and_reconstruct (idx_t n, const float *x, idx_t k, float *distances, idx_t *labels, float *recons) const override |
| long | remove_ids (const IDSelector &sel) override |
| Dataset manipulation functions. | |
| void | check_compatible_for_merge (const IndexIVF &other) const |
| virtual void | merge_from (IndexIVF &other, idx_t add_id) |
| virtual void | copy_subset_to (IndexIVF &other, int subset_type, long a1, long a2) const |
| size_t | get_list_size (size_t list_no) const |
| void | make_direct_map (bool new_maintain_direct_map=true) |
| double | imbalance_factor () const |
| 1 = perfectly balanced, >1 = imbalanced | |
| void | print_stats () const |
| display some stats about the inverted lists | |
| void | replace_invlists (InvertedLists *il, bool own=false) |
| replace the inverted lists, old one is deallocated if own_invlists | |
Public Member Functions inherited from faiss::Index | |
| Index (idx_t d=0, MetricType metric=METRIC_L2) | |
| virtual void | range_search (idx_t n, const float *x, float radius, RangeSearchResult *result) const |
| void | assign (idx_t n, const float *x, idx_t *labels, idx_t k=1) |
| void | compute_residual (const float *x, float *residual, idx_t key) const |
| void | display () const |
Public Member Functions inherited from faiss::Level1Quantizer | |
| void | train_q1 (size_t n, const float *x, bool verbose, MetricType metric_type) |
| Trains the quantizer and calls train_residual to train sub-quantizers. | |
| Level1Quantizer (Index *quantizer, size_t nlist) | |
Public Attributes | |
| bool | by_residual |
| Encode residual or plain vector? | |
| ProductQuantizer | pq |
| produces the codes | |
| bool | do_polysemous_training |
| reorder PQ centroids after training? | |
| PolysemousTraining * | polysemous_training |
| if NULL, use default | |
| size_t | scan_table_threshold |
| use table computation or on-the-fly? | |
| int | polysemous_ht |
| Hamming thresh for polysemous filtering. | |
| int | use_precomputed_table |
| if by_residual, build precomputed tables | |
| std::vector< float > | precomputed_table |
Public Attributes inherited from faiss::IndexIVF | |
| InvertedLists * | invlists |
| Access to the actual data. | |
| bool | own_invlists |
| size_t | code_size |
| code size per vector in bytes | |
| size_t | nprobe |
| number of probes at query time | |
| size_t | max_codes |
| max nb of codes to visit to do a query | |
| bool | maintain_direct_map |
| map for direct access to the elements. Enables reconstruct(). | |
| std::vector< long > | direct_map |
Public Attributes inherited from faiss::Index | |
| int | d |
| vector dimension | |
| idx_t | ntotal |
| total nb of indexed vectors | |
| bool | verbose |
| verbosity level | |
| bool | is_trained |
| set if the Index does not require training, or if training is done already | |
| MetricType | metric_type |
| type of metric this index uses for search | |
Public Attributes inherited from faiss::Level1Quantizer | |
| Index * | quantizer |
| quantizer that maps vectors to inverted lists | |
| size_t | nlist |
| number of possible key values | |
| char | quantizer_trains_alone |
| bool | own_fields |
| whether object owns the quantizer | |
| ClusteringParameters | cp |
| to override default clustering params | |
| Index * | clustering_index |
| to override index used during clustering | |
Static Public Attributes | |
| static size_t | precomputed_table_max_bytes = ((size_t)1) << 31 |
| 2G by default, accommodates tables up to PQ32 w/ 65536 centroids | |
Additional Inherited Members | |
Public Types inherited from faiss::Index | |
| typedef long | idx_t |
| all indices are this type | |
Inverted file with Product Quantizer encoding. Each residual vector is encoded as a product quantizer code.
Definition at line 35 of file IndexIVFPQ.h.
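To make the encoding scheme concrete, here is a minimal, self-contained sketch of the idea: assign the vector to its nearest coarse centroid, take the residual, then quantize each sub-vector of the residual independently. `ToyIVFPQ` and `nearest_row` are hypothetical names introduced for illustration, codes are assumed to be one byte per sub-quantizer, and none of this is the faiss implementation.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Index of the row of `table` (nrows rows of length n, row-major)
// that is closest to v in squared L2 distance.
static std::size_t nearest_row(const float* v, const float* table,
                               std::size_t nrows, std::size_t n) {
    std::size_t best = 0;
    float best_dist = std::numeric_limits<float>::max();
    for (std::size_t r = 0; r < nrows; ++r) {
        float dist = 0.f;
        for (std::size_t i = 0; i < n; ++i) {
            float diff = v[i] - table[r * n + i];
            dist += diff * diff;
        }
        if (dist < best_dist) { best_dist = dist; best = r; }
    }
    return best;
}

// Hypothetical toy structure for illustration only (not faiss::IndexIVFPQ).
struct ToyIVFPQ {
    std::size_t d, nlist, M, ksub;
    std::vector<float> coarse;        // nlist x d coarse centroids
    std::vector<float> pq_centroids;  // M x ksub x (d / M) sub-centroids

    // Writes M one-byte codes for x and returns the inverted list number.
    std::size_t encode(const float* x, std::uint8_t* code) const {
        std::size_t key = nearest_row(x, coarse.data(), nlist, d);
        // Residual with respect to the selected coarse centroid.
        std::vector<float> residual(d);
        for (std::size_t i = 0; i < d; ++i)
            residual[i] = x[i] - coarse[key * d + i];
        // Quantize each sub-vector of the residual with its own codebook.
        std::size_t dsub = d / M;
        for (std::size_t m = 0; m < M; ++m) {
            const float* codebook = pq_centroids.data() + m * ksub * dsub;
            code[m] = static_cast<std::uint8_t>(
                nearest_row(residual.data() + m * dsub, codebook, ksub, dsub));
        }
        return key;
    }
};
```

The sketch assumes d is a multiple of M and ksub <= 256 so that each code fits in a byte.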
| void faiss::IndexIVFPQ::add_core_o | ( | idx_t | n, |
| const float * | x, | ||
| const long * | xids, | ||
| float * | residuals_2, | ||
| const long * | precomputed_idx = nullptr |
||
| ) |
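Same as add_core, but also outputs the 2nd level residuals if residuals_2 is non-null, and uses precomputed list numbers if precomputed_idx is non-null.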
Definition at line 221 of file IndexIVFPQ.cpp.
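| void faiss::IndexIVFPQ::add_with_ids | ( | idx_t | n, |
| const float * | x, | ||
| const long * | xids = nullptr |
||
| ) | | override virtual |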
Same as add, but stores xids instead of sequential ids.
The default implementation fails with an assertion, as it is not supported by all indexes.
| xids | if non-null, ids to store for the vectors (size n) |
Reimplemented from faiss::Index.
Reimplemented in faiss::IndexIVFPQR.
Definition at line 183 of file IndexIVFPQ.cpp.
| void faiss::IndexIVFPQ::encode_multiple | ( | size_t | n, |
| long * | keys, | ||
| const float * | x, | ||
| uint8_t * | codes, | ||
| bool | compute_keys = false |
||
| ) | const |
Encode multiple vectors
| n | nb vectors to encode |
| keys | posting list ids for those vectors (size n) |
| x | vectors (size n * d) |
| codes | output codes (size n * code_size) |
| compute_keys | if false, assume keys are precomputed, otherwise compute them |
Definition at line 150 of file IndexIVFPQ.cpp.
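| void faiss::IndexIVFPQ::encode_vectors | ( | idx_t | n, |
| const float * | x, | ||
| const idx_t * | list_nos, | ||
| uint8_t * | codes |
||
| ) | const | override virtual |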
Encodes a set of vectors as they would appear in the inverted lists
| list_nos | inverted list ids as returned by the quantizer (size n). -1s are ignored. |
| codes | output codes, size n * code_size |
Implements faiss::IndexIVF.
Definition at line 207 of file IndexIVFPQ.cpp.
| size_t faiss::IndexIVFPQ::find_duplicates | ( | idx_t * | ids, |
| size_t * | lims | ||
| ) | const |
Find exact duplicates in the dataset.
The duplicates are returned in pre-allocated arrays (see the max sizes).
| lims | limits between groups of duplicates (max size ntotal / 2 + 1) |
| ids | ids[lims[i]] : ids[lims[i+1]-1] is a group of duplicates (max size ntotal) |
Definition at line 1089 of file IndexIVFPQ.cpp.
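The flat (ids, lims) layout can be unpacked as in the following sketch. `unpack_duplicate_groups` is a hypothetical helper, not part of faiss, and it assumes the caller already knows the number of groups (presumably the size_t returned by find_duplicates, which this page does not document). It uses `long` for ids, matching the idx_t typedef listed on this page.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of faiss): turn the flat output arrays into
// one vector per group. Group i occupies ids[lims[i]] .. ids[lims[i+1]-1],
// so lims must hold ngroups + 1 entries.
std::vector<std::vector<long>> unpack_duplicate_groups(
        const long* ids, const std::size_t* lims, std::size_t ngroups) {
    std::vector<std::vector<long>> groups(ngroups);
    for (std::size_t i = 0; i < ngroups; ++i) {
        // Half-open range [lims[i], lims[i+1]) covers exactly one group.
        groups[i].assign(ids + lims[i], ids + lims[i + 1]);
    }
    return groups;
}
```

Before the call, the caller would allocate ids with ntotal entries and lims with ntotal / 2 + 1 entries, following the max sizes stated above.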
| void faiss::IndexIVFPQ::precompute_table | ( | ) |
build precomputed table
Precomputed tables for residuals
During IVFPQ search with by_residual, we compute
d = || x - y_C - y_R ||^2
where x is the query vector, y_C the coarse centroid, y_R the refined PQ centroid. The expression can be decomposed as:
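d = || x - y_C ||^2 + || y_R ||^2 + 2 * (y_C|y_R) - 2 * (x|y_R)
where term 1 = || x - y_C ||^2, term 2 = || y_R ||^2 + 2 * (y_C|y_R), and term 3 = - 2 * (x|y_R).
When using multiprobe, we use the following decomposition:
term 1 is the distance to the coarse centroid, which is computed during the 1st stage search.
term 2 can be precomputed, as it does not involve x. However, because of the PQ, it needs nlist * pq.M * pq.ksub values.
term 3 is the classical non-residual distance table.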
Since y_R is defined by a product quantizer, it is split across subvectors and stored separately for each subvector. If the coarse quantizer is a MultiIndexQuantizer then the table can be stored more compactly.
At search time, the tables for term 2 and term 3 are added up. This is faster when the length of the lists is > ksub * M.
Definition at line 364 of file IndexIVFPQ.cpp.
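| void faiss::IndexIVFPQ::reconstruct_from_offset | ( | long | list_no, |
| long | offset, | ||
| float * | recons |
||
| ) | const | override virtual |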
Reconstruct a vector given the location in terms of (inv list index + inv list offset) instead of the id.
Useful for reconstructing when the direct_map is not maintained and the inv list offset is computed by search_preassigned() with store_pairs set.
Reimplemented from faiss::IndexIVF.
Reimplemented in faiss::IndexIVFPQR.
Definition at line 311 of file IndexIVFPQ.cpp.
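Conceptually, reconstructing a residual-encoded vector means adding the decoded PQ sub-centroids back onto the coarse centroid of its list. The following sketch illustrates this on flat arrays. `toy_reconstruct` is a hypothetical function, not the faiss implementation. It assumes by_residual is set and one-byte codes, and it takes the code directly instead of looking it up by offset in an inverted list.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical toy decoder for illustration only.
// coarse:       nlist x d coarse centroids (row-major)
// pq_centroids: M x ksub x (d / M) sub-centroids
// code:         M one-byte PQ codes of the stored vector
std::vector<float> toy_reconstruct(std::size_t list_no, const std::uint8_t* code,
                                   const std::vector<float>& coarse,
                                   const std::vector<float>& pq_centroids,
                                   std::size_t d, std::size_t M, std::size_t ksub) {
    std::size_t dsub = d / M;
    std::vector<float> recons(d);
    for (std::size_t m = 0; m < M; ++m) {
        // Sub-centroid selected by code[m] in the m-th codebook.
        const float* c = &pq_centroids[(m * ksub + code[m]) * dsub];
        for (std::size_t j = 0; j < dsub; ++j) {
            // Decoded residual plus the coarse centroid of the list.
            recons[m * dsub + j] = coarse[list_no * d + m * dsub + j] + c[j];
        }
    }
    return recons;
}
```

If the vectors were encoded without residuals, the coarse centroid would not be added back.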
| std::vector<float> faiss::IndexIVFPQ::precomputed_table |
if use_precomputed_table is set: size nlist * pq.M * pq.ksub
Definition at line 60 of file IndexIVFPQ.h.
| int faiss::IndexIVFPQ::use_precomputed_table |
if by_residual, build precomputed tables
Precomputed tables that speed up query preprocessing at some memory cost:
=-1: force disable
=0: decide heuristically (default: use tables only if they are < precomputed_table_max_bytes)
=1: tables that work for all quantizers (size 256 * nlist * M)
=2: specific version for MultiIndexQuantizer (much more compact)
Definition at line 55 of file IndexIVFPQ.h.
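The memory cost of the generic table (mode 1) is easy to estimate up front. The sketch below computes the byte size of a float table with nlist * M * ksub entries and compares it against a budget that defaults to the documented 2G limit. Both function names are hypothetical. The inclusive comparison is an assumption, chosen so that the PQ32 with 65536 centroids case, which is exactly 2^31 bytes, fits as the description of precomputed_table_max_bytes says. The actual heuristic in the library may differ.

```cpp
#include <cstddef>
#include <cstdint>

// Bytes needed for a float table with nlist * M * ksub entries.
std::uint64_t precomputed_table_bytes(std::uint64_t nlist, std::uint64_t M,
                                      std::uint64_t ksub) {
    return nlist * M * ksub * sizeof(float);
}

// Hypothetical check mirroring the "decide heuristically" mode.
// Assumption: a table exactly as large as the budget is still accepted.
bool table_fits(std::uint64_t nlist, std::uint64_t M, std::uint64_t ksub,
                std::uint64_t max_bytes = std::uint64_t(1) << 31) {
    return precomputed_table_bytes(nlist, M, ksub) <= max_bytes;
}
```

For example, nlist = 65536, M = 32 and ksub = 256 gives 65536 * 32 * 256 * 4 bytes = 2^31 bytes, while doubling M to 64 exceeds the default budget.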