Is your feature request related to a problem?
A problem can arise when embedding vectors (e.g., from msmarco-distilbert-base-tas-b; say its similarity function is cosine similarity) are indexed into a knn_vector field that is mapped with a different space_type (e.g., L2).
The similarity the embedding model was trained to use and the vector distance used by the HNSW graph can then differ, leading to inaccurate search scores.
In other words, because OpenSearch stores a per-segment HNSW graph built by Faiss/NMSLIB/Lucene, the results returned from the graph can vary depending on the space_type.
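To make the mismatch concrete, here is a minimal sketch in plain Python (no OpenSearch involved) that ranks two toy documents against a query by cosine similarity and by L2 distance. The vectors and the names `query`, `docs`, `by_cosine`, and `by_l2` are made up for illustration; they are not real model embeddings.

```python
import math

def cosine_similarity(u, v):
    # Higher is more similar; ignores vector magnitude.
    dot = sum(x * y for x, y in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))

def l2_distance(u, v):
    # Lower is more similar; sensitive to vector magnitude.
    return math.dist(u, v)

# Toy 2-D vectors (assumed values, chosen so the two metrics disagree).
query = [1.0, 0.0]
docs = {
    "a": [10.0, 1.0],  # almost the same direction as the query, but far away
    "b": [1.0, 1.0],   # 45 degrees off the query, but close by
}

by_cosine = sorted(docs, key=lambda k: cosine_similarity(query, docs[k]), reverse=True)
by_l2 = sorted(docs, key=lambda k: l2_distance(query, docs[k]))

print(by_cosine)  # ['a', 'b']
print(by_l2)      # ['b', 'a']
```

The same two documents come back in opposite order depending on the metric. Note that if every vector is normalized to unit length, L2 distance and cosine similarity produce the same ranking, so the mismatch matters mainly for unnormalized embeddings.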
What solution would you like?
Is there any benefit to using a space_type that differs from the embedding model's similarity function?
I suggest displaying warning messages in the above scenario to alert users to potential inaccuracies.
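As a rough illustration of what such a warning could look like, the sketch below compares a model's declared similarity with a field's space_type and warns on a mismatch. Everything here is hypothetical: `COMPATIBLE`, `check_space_type`, and the similarity labels are invented for this example and are not part of OpenSearch or the k-NN plugin; only the space_type strings (`l2`, `cosinesimil`, `innerproduct`) follow the plugin's naming.

```python
import warnings

# Hypothetical table: which space_type values match which model similarity.
COMPATIBLE = {
    "cosine": {"cosinesimil"},
    "dot_product": {"innerproduct"},
    "euclidean": {"l2"},
}

def check_space_type(model_similarity, space_type):
    """Return True if space_type matches the model's similarity; warn otherwise."""
    ok = space_type in COMPATIBLE.get(model_similarity, set())
    if not ok:
        warnings.warn(
            f"space_type '{space_type}' does not match the model's "
            f"'{model_similarity}' similarity; search scores may be inaccurate."
        )
    return ok

# Example: a model assumed to use cosine similarity, indexed with L2.
check_space_type("cosine", "l2")
```

A real implementation would also need a way to learn which similarity a model expects, which this sketch simply takes as an argument.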