You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
The problem I'm facing is rows returned from function similarity_search always less than expected.
For example, I have 100 rows match filter's condition in table langchain_pg_embedding, by executing
I expected length of docs should be 100. However, per different query_context, length of docs are various but less than 100 all the time, like 43 or 67.
So I enabled log for sqlchemy.engine, and got SQL it generated here:
2024-07-2520:21:47,269-sqlalchemy.engine.Engine - INFO -SELECTlangchain_pg_embedding.idAS langchain_pg_embedding_id, langchain_pg_embedding.collection_idAS langchain_pg_embedding_collection_id, langchain_pg_embedding.embeddingAS langchain_pg_embedding_embedding, langchain_pg_embedding.documentAS langchain_pg_embedding_document, langchain_pg_embedding.cmetadataAS langchain_pg_embedding_cmetadata, langchain_pg_embedding.embedding<-> %(embedding_1)s AS distance
FROM langchain_pg_embedding JOIN langchain_pg_collection ONlangchain_pg_embedding.collection_id=langchain_pg_collection.uuidWHERElangchain_pg_embedding.collection_id= %(collection_id_1)s::UUID AND jsonb_path_match(langchain_pg_embedding.cmetadata, CAST(%(param_1)s AS JSONPATH), CAST(%(param_2)s::JSONB AS JSONB)) ORDER BY distance ASCLIMIT %(param_3)s::INTEGER2024-07-2520:21:47,269-sqlalchemy.engine.Engine - INFO - [generated in0.00126s] {'embedding_1': '[0.002203210722655058,-0.0034537874162197113,-0.04540061205625534,0.016271447762846947,0.03849130868911743,0.035241544246673584,0.013832494616508484, ... (21402 characters truncated) ... .03421637415885925,-0.03275654464960098,-0.0023437547497451305,0.0021014492958784103,-0.01060000341385603,0.008099209517240524,0.0024598073214292526]', 'collection_id_1': UUID('24b25f49-4e90-4073-9b4b-f5fd91b83c61'), 'param_1': '$.user_name == $value', 'param_2': Jsonb({'value': 'user_1'}), 'param_3': 100}
After replace parameters into sql and execute it in pgAdmin, 100 rows are returned.
Then I dig into code, tried to comment filter in sqlalchemy query vectorstores.py
I'm not familiar with sqlalchemy so no clues at all. Could you please take a look on what's happening here? Am I using the search function in wrong way?
Thanks!
The text was updated successfully, but these errors were encountered:
Hi, thanks in advance for this amazing project!
The problem I'm facing is rows returned from function similarity_search always less than expected.
For example, I have 100 rows match filter's condition in table langchain_pg_embedding, by executing
I expected length of docs should be 100. However, per different query_context, length of docs are various but less than 100 all the time, like 43 or 67.
So I enabled log for sqlchemy.engine, and got SQL it generated here:
After replace parameters into sql and execute it in pgAdmin, 100 rows are returned.
Then I dig into code, tried to comment filter in sqlalchemy query
vectorstores.py
All 100 rows are returned.
I'm not familiar with sqlalchemy so no clues at all. Could you please take a look on what's happening here? Am I using the search function in wrong way?
Thanks!
The text was updated successfully, but these errors were encountered: