Discussion about the redis vector DB index algorithm, changed from HNSW to FLAT #840
Comments
Hi @gavinlichn, the PR removes the hard-coded length limitation and simplifies the vector DB initialization. I do not think explicitly initializing Redis with a schema dimension of 768 or 1024 is concise. With that PR, if users use BGE base, Redis automatically accepts embeddings of length 768; if they use BGE large, it automatically accepts embeddings of length 1024. Users do not need to know or change the schema length when they switch to another embedding model. However, as you said, the schema also specifies a non-default index algorithm. Could you please share some data or reasoning on how HNSW is faster or better than the default? Alternatively, could we pass a parameter to Redis to change the default indexing algorithm, which I think would be simpler?
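To illustrate the point about dimensions, here is a minimal sketch that derives the vector dimension from a sample embedding instead of hard-coding 768 or 1024. The function name, the field name `content_vector`, and the dict layout are assumptions made for illustration; they are not taken from the PR.

```python
from typing import Dict, List, Union


def default_vector_field(embedding: List[float]) -> Dict[str, Union[str, int]]:
    """Describe a vector field whose dimension comes from a sample embedding.

    The keys and the field name are hypothetical; adapt them to whatever
    schema format the vector store client expects.
    """
    if not embedding:
        raise ValueError("embedding must not be empty")
    return {
        "name": "content_vector",      # hypothetical field name
        "algorithm": "FLAT",           # default discussed in this thread
        "datatype": "FLOAT32",
        "dims": len(embedding),        # inferred, not hard-coded
        "distance_metric": "COSINE",
    }


# Stand-ins for embeddings from a 768-dim and a 1024-dim model.
bge_base = default_vector_field([0.0] * 768)
bge_large = default_vector_field([0.0] * 1024)
print(bge_base["dims"], bge_large["dims"])
```

With this approach, switching embedding models changes only the input vectors, not the schema definition.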
Since the intent of the PR is to remove the length limitation, I would prefer to keep the change clean and not touch more logic. For the algorithm comparison, we can refer to the Redis documentation:
https://redis.io/docs/latest/develop/interact/search-and-query/advanced-concepts/vectors/#hnsw-index
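For context, FLAT performs an exact brute-force scan, while HNSW builds a graph for approximate search and accepts extra tuning attributes. The sketch below builds the vector-field portion of an `FT.CREATE` schema for either algorithm. The helper name and the field name are hypothetical, and the `M` and `EF_CONSTRUCTION` values are arbitrary examples, not recommendations.

```python
from typing import List, Optional


def vector_field_args(
    name: str,
    algorithm: str,
    dims: int,
    metric: str = "COSINE",
    m: Optional[int] = None,
    ef_construction: Optional[int] = None,
) -> List[str]:
    """Build the `<name> VECTOR <algo> <count> <attrs...>` part of FT.CREATE."""
    attrs = ["TYPE", "FLOAT32", "DIM", str(dims), "DISTANCE_METRIC", metric]
    if algorithm == "HNSW":
        # Optional HNSW-only tuning attributes.
        if m is not None:
            attrs += ["M", str(m)]
        if ef_construction is not None:
            attrs += ["EF_CONSTRUCTION", str(ef_construction)]
    elif algorithm != "FLAT":
        raise ValueError(f"unsupported algorithm: {algorithm}")
    # The count is the number of attribute tokens that follow it.
    return [name, "VECTOR", algorithm, str(len(attrs)), *attrs]


flat_args = vector_field_args("content_vector", "FLAT", 768)
hnsw_args = vector_field_args("content_vector", "HNSW", 768, m=16, ef_construction=200)
print(" ".join(flat_args))
print(" ".join(hnsw_args))
```

The only difference between the two definitions is the algorithm token and the optional HNSW attributes, which is why exposing the algorithm as a parameter is a small change.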
I agree that letting users pass a customized schema is reasonable. I think we should keep the default behavior (FLAT, accuracy first) for now and allow advanced users to set a schema (maybe with None as the default, i.e.
@gavinlichn, for Spycsh's solution above.
Including the schema through an environment variable is reasonable; the original design was similar.
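One way this could look, as a rough sketch: an optional environment variable points to a schema file, and when it is unset the loader returns `None` so the caller keeps the default (FLAT) behavior. The variable name `INDEX_SCHEMA` and the JSON file format are assumptions for illustration only; the project may use a different name or format.

```python
import json
import os
from typing import Mapping, Optional

SCHEMA_ENV_VAR = "INDEX_SCHEMA"  # hypothetical variable name


def load_index_schema(env: Optional[Mapping[str, str]] = None) -> Optional[dict]:
    """Return the user-supplied schema, or None to keep the default index."""
    env = os.environ if env is None else env
    path = env.get(SCHEMA_ENV_VAR)
    if not path:
        # No override: the caller falls back to the default algorithm.
        return None
    with open(path, encoding="utf-8") as f:
        return json.load(f)


schema = load_index_schema({})  # empty environment, so no override
print("custom schema" if schema is not None else "default schema")
```

Advanced users who want HNSW would set the variable to a file that names the algorithm; everyone else gets the default without any configuration.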
I noticed that the dataprep/redis/langchain vector DB index algorithm is now FLAT, but I recall that we used HNSW before.
Looking at the code, this was caused by the removal of index_schema in PR #347.
With index_schema removed, Redis falls back to the default index algorithm (FLAT).
I am concerned that this may impact performance.
Could you help clarify the background of this change, please? @Spycsh