You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Running the solution with the same tokens jsonl file, the output json of clone groups varies upon each execution.
Is there a more deterministic approach to near duplicate detection?
Cheers
Sky
The text was updated successfully, but these errors were encountered:
Apologies for just seeing this issue... If the parallelism is disabled, then you should be able to get a deterministic output. Alternatively, consider using this code which should also yield deterministic results.
Running the solution with the same tokens jsonl file, the output json of clone groups varies upon each execution.
Is there a more deterministic approach to near duplicate detection?
Cheers
Sky
The text was updated successfully, but these errors were encountered: