Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deezerart: Use astrcmp and make matching threshold configurable #360

Merged
merged 3 commits into from
Sep 15, 2023

Conversation

phw
Copy link
Member

@phw phw commented Sep 15, 2023

The deezerart plugin used Python's difflib for string similarity comparison. This has some problems:

  1. difflib ist not part of the Picard binary packages for macOS and Windows, as the bundled Python only includes those modules actually used

  2. The SequenceMatcher.quick_ratio only compares length and characters, not sequences (which means order of characters does not matter). See for example:

    >>> from difflib import SequenceMatcher
    >>> SequenceMatcher(None, "michael jackson", "joachim slacken").quick_ratio()
    1.0
    >>> from picard.util.astrcmp import astrcmp
    >>> astrcmp("michael jackson", "joachim slacken")
    0.3333333134651184
    

Use Picard's astrcmp function instead. A value of 0.6 seemed sensible in my tests. But a configuration option was added so users can tweak the threshold if they get bad results.

This provides both better comparison and also makes the plugin compatible
with Picard packaged for Windows / macOS (where difflib is not included).
@phw phw changed the title Deezerart: Use astrcmp and male matching threshold configurable Deezerart: Use astrcmp and make matching threshold configurable Sep 15, 2023
@phw phw requested a review from zas September 15, 2023 06:45
Copy link
Collaborator

@zas zas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@phw phw merged commit 5a18c78 into metabrainz:2.0 Sep 15, 2023
6 checks passed
@phw phw deleted the deezerart-astrcmp branch September 15, 2023 10:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants