This repository has been archived by the owner on Nov 13, 2024. It is now read-only.

[Bug]: After milvus2x migration, the number of entities is inconsistent #94

Open
charleskangzq opened this issue Aug 9, 2024 · 2 comments

Comments

@charleskangzq

Current Behavior

I have two Milvus instances and want to migrate data from one to the other. The instances are as follows:
source Milvus: 2.3.12 standalone
target Milvus: 2.3.12 cluster

I tried to migrate two collections. One has 41531 entities and the other has 1840419 entities. After migrating with a bufferSize of 500, the target collections have 39663 and 1789625 entities respectively.

I also tried migrating with different dumper.worker.reader.bufferSize values, and the number of entities in the target Milvus was different each time.

Expected Behavior

The target Milvus has the same number of entities as the source Milvus.

Steps To Reproduce

No response

Environment

No response

Anything else?

No response

@wenhuiZilliz
Collaborator

Hi @charleskangzq, Milvus-migration now uses the Iterator API to read the source data. The iterator removes rows with duplicate primary keys, so the entity counts will be inconsistent if your data contains rows that share the same PK.
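To illustrate why deduplication by primary key lowers the count, here is a toy model in plain Python. It is not the Milvus iterator implementation; the helper name `iterate_unique_by_pk`, the field name `pk`, and the choice to keep the first row seen for each key are all assumptions made for the sketch.

```python
def iterate_unique_by_pk(rows, pk_field="pk"):
    """Yield rows in order, skipping any row whose primary key was already seen.

    Hypothetical helper: it only mimics the effect described above
    (duplicate PKs collapse to one row), not how Milvus does it internally.
    """
    seen = set()
    for row in rows:
        key = row[pk_field]
        if key in seen:
            continue  # duplicate PK: dropped from the output
        seen.add(key)
        yield row


# Five source rows, but only three distinct primary keys.
source_rows = [
    {"pk": 1, "v": "a"},
    {"pk": 2, "v": "b"},
    {"pk": 2, "v": "c"},
    {"pk": 3, "v": "d"},
    {"pk": 1, "v": "e"},
]

migrated_rows = list(iterate_unique_by_pk(source_rows))
print(len(source_rows), len(migrated_rows))
```

In this model the source reports five entities while the target ends up with three, which is the same kind of gap as the one reported in this issue.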

@wenhuiZilliz
Collaborator

@charleskangzq Also, earlier versions of the tool reported the total entity count inaccurately because they did not subtract the number of deleted entities. The latest version of the migration tool reports the total entity count accurately, so please use that.

As for rows with the same PK: if you have a list of PK values that you suspect are duplicated, you can run a query for count(*) with an expr like pk == pk_value to check whether each value is duplicated or not.
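The check above could be scripted roughly as follows. This is a minimal sketch: `count_expr` and `find_duplicated_pks` are hypothetical helpers, the field name `pk` is a placeholder for your collection's primary key field, and the `query` argument is assumed to behave like pymilvus `Collection.query`, returning a list such as `[{"count(*)": 2}]`.

```python
def count_expr(pk_field, pk_value):
    """Build a filter expression such as 'pk == 42' or 'pk == "abc"'.

    String values are wrapped in double quotes; values that themselves
    contain quotes are not escaped in this sketch.
    """
    if isinstance(pk_value, str):
        return f'{pk_field} == "{pk_value}"'
    return f"{pk_field} == {pk_value}"


def find_duplicated_pks(query, pk_field, suspects):
    """Return {pk_value: count} for every suspect PK that matches more than one row.

    `query` is any callable accepting `expr` and `output_fields` keyword
    arguments and returning [{"count(*)": n}].
    """
    duplicated = {}
    for value in suspects:
        result = query(expr=count_expr(pk_field, value), output_fields=["count(*)"])
        count = result[0]["count(*)"]
        if count > 1:
            duplicated[value] = count
    return duplicated


# Example wiring against the source Milvus (host, port, and collection name
# are placeholders, not values from this issue):
#
#   from pymilvus import connections, Collection
#   connections.connect(host="localhost", port="19530")
#   collection = Collection("my_collection")
#   print(find_duplicated_pks(collection.query, "pk", [101, 102, 103]))
```

Running it against the source instance for a handful of suspect values should show whether duplicate PKs account for the missing entities.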
