PICARD-2032: Set albumsort/titlesort tags #369

Merged: 7 commits, Jan 27, 2024

twodoorcoupe
Contributor

This plugin attempts to fix [PICARD-2032]. For each track or album it first checks if there are any sort names already available in the aliases, otherwise it swaps the title's prefix.
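A minimal sketch of the alias-first lookup described above, not the plugin's actual code. The helper name `sort_name_from_aliases`, the dictionary keys `name` and `sort-name`, the matching rule, and the sample alias are all assumptions made for illustration.

```python
def sort_name_from_aliases(title, aliases):
    """Return the sort name of the first useful alias, or None.

    `aliases` is assumed to be a list of dicts with "name" and "sort-name"
    keys (a hypothetical structure chosen for this example).
    """
    for alias in aliases:
        sort_name = alias.get("sort-name")
        # An alias is only useful here if it names this title and its
        # sort name actually differs from the plain title.
        if alias.get("name") == title and sort_name and sort_name != title:
            return sort_name
    return None


# Made-up alias data, purely for demonstration.
aliases = [{"name": "The Example", "sort-name": "Example, The"}]

found = sort_name_from_aliases("The Example", aliases)
missing = sort_name_from_aliases("Another Title", aliases)
```

When the lookup returns `None`, the caller would fall back to swapping the title's prefix.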

Then it also partly fixes [PICARD-2779] by providing script functions:

  • $swapprefix_lang
  • $delprefix_lang
  • $title_lang

So far, the languages available are English, Spanish, Italian, French, German, and Portuguese. For each language there's a list of articles that are used as prefixes and a list of short prepositions and conjunctions that are used in $title_lang. These I wrote myself by looking online, so they are prone to errors.

$swapprefix_lang and $delprefix_lang do the same thing as their original counterparts but take languages as inputs instead of prefixes. $title_lang returns the text in title case by keeping the small words of each language in lower case.
During tagging, when a title's prefix is swapped, only the languages of the track or album are considered. If none are found, all of the available languages are considered.
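To illustrate the idea of passing languages instead of prefixes, here is a rough standalone sketch. It is not the plugin's implementation: the `ARTICLES` lists are shortened and hypothetical, the language codes are assumed, and the functions are plain Python rather than registered Picard script functions.

```python
import re

# Hypothetical article lists keyed by language code; the plugin's real
# lists are longer and may differ.
ARTICLES = {
    "eng": ["the", "a", "an"],
    "spa": ["el", "la", "los", "las"],
    "fra": ["le", "la", "les"],
}


def _split_prefix(text, languages):
    """Return (prefix, rest) if text starts with an article, else (None, text)."""
    prefixes = [p for lang in languages for p in ARTICLES.get(lang, [])]
    if not prefixes:
        return None, text
    pattern = r"^(%s)\s+(.+)$" % "|".join(re.escape(p) for p in prefixes)
    match = re.match(pattern, text, re.IGNORECASE)
    if not match:
        return None, text
    return match.group(1), match.group(2)


def swapprefix_lang(text, *languages):
    """Move a leading article to the end; all languages if none are given."""
    prefix, rest = _split_prefix(text, languages or ARTICLES.keys())
    return f"{rest}, {prefix}" if prefix else text


def delprefix_lang(text, *languages):
    """Drop a leading article; all languages if none are given."""
    _, rest = _split_prefix(text, languages or ARTICLES.keys())
    return rest
```

With these assumed lists, `swapprefix_lang("La Bamba", "eng")` leaves the title alone because "La" is not an English article, while `swapprefix_lang("La Bamba", "spa")` moves it to the end.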

Checking the aliases for each track and album can be turned off in the options, as can ignoring all-caps titles in $title_lang.
It's my first time contributing so I apologize if I missed anything important.
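For readers who want a picture of the $title_lang behaviour described above, the following is a simplified sketch under stated assumptions. The `MINOR_WORDS` sets are invented and incomplete, the `keep_allcaps` parameter stands in for the plugin option, and always capitalizing the first word is an assumption rather than confirmed plugin behaviour.

```python
# Hypothetical minor-word sets; the plugin's real lists may differ.
MINOR_WORDS = {
    "eng": {"a", "an", "the", "and", "of", "in", "on", "to"},
    "ita": {"il", "la", "di", "e", "da", "in"},
}


def title_lang(text, language, keep_allcaps=True):
    """Title-case `text`, keeping the language's minor words in lower case."""
    if not text:
        return ""
    # Optionally leave titles that are entirely upper case untouched.
    if keep_allcaps and text.upper() == text:
        return text
    minor = MINOR_WORDS.get(language, set())
    result = []
    for index, word in enumerate(text.lower().split()):
        # Assumption: the first word is capitalized even if it is minor.
        if index > 0 and word in minor:
            result.append(word)
        else:
            result.append(word.capitalize())
    return " ".join(result)
```

Note that this sketch collapses repeated whitespace and ignores punctuation, which a real implementation would need to handle.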

Contributor

@Sophist-UK Sophist-UK left a comment

In addition to the other comments, I would like to see some unit tests for the new script functions.

plugins/enhanced_titles/__init__.py (outdated)
self._swapprefix(metadata, "album")


def _create_prefixes_list(languages = None, is_title = False):
Contributor

IMO, this is coded incorrectly. This function is called each and every time a script function is called and simply repeats the exact same code each time.

IMO the set of words for all languages should be calculated once, at load time rather than each time a script function is called. Personally I would store it in the same dict as the separate languages with an empty string as the key.

@Sophist-UK
Contributor

P.S. @twodoorcoupe Giorgio - Many thanks for coding this when I don't have the time. 👍

@twodoorcoupe
Contributor Author

Thank you @Sophist-UK for taking the time to review this. I made the changes you requested and added some unit tests. Now, when a new combination of languages is used, the resulting list of prefixes or minor words is stored so as not to recalculate it every time.

Contributor

@Sophist-UK Sophist-UK left a comment

Despite the length and number of comments here, these are relatively minor tweaks to what is already pretty good code. So thanks again for writing this.

plugins/enhanced_titles/__init__.py (outdated)
        else:
            self._swapprefix(metadata, field)

    def _swapprefix(self, metadata, field):
Contributor

@Sophist-UK Sophist-UK Jan 22, 2024

This would IMO be better off as a function taking the field value and returning a sorted value, leaving the caller to get the unsorted value and store the sorted value, i.e.

    def _swapprefix(self, field):
        ....
        return func_swapprefix(None, field, *prefixes)

    metadata["titlesort"] = self._swapprefix(metadata["title"])

Also, using return enables early returns, which improves readability by avoiding multiple nested if/else blocks, i.e.

if something:
    ...
    return first_something

if something_else:
    return second_something

return third_something

plugins/enhanced_titles/__init__.py (outdated)
    def _format_languages(self, languages):
        languages = [lang.lower()[:3] for lang in languages]
        languages = [lang for lang in languages if lang in _articles.keys()]
        return tuple(languages)
Contributor

I am unclear what benefit there is to converting languages into a tuple rather than returning it as a list.

Contributor Author

This is so the tuple can be used as a key for the cache dictionary, since it's immutable.
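A short demonstration of the point, using a throwaway `cache` dictionary and made-up values: Python dictionary keys must be hashable, and a list is not, whereas a tuple of strings is.

```python
cache = {}

languages = ["eng", "fra"]

try:
    cache[languages] = ["the", "le"]  # a list is mutable, hence unhashable
    list_key_worked = True
except TypeError:
    list_key_worked = False

# The same languages as a tuple are hashable and work as a key.
cache[tuple(languages)] = ["the", "le"]
```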

plugins/enhanced_titles/__init__.py (outdated)
'the'
"""
prefixes = self.find_prefixes(languages)
return func_delprefix(parser, text, *prefixes) if text else ""
Contributor

Ditto

if text.upper() == text and config.setting[KEEP_ALLCAPS]:
    return text
minor_words = self.find_minor_words(languages)
return self._title_case(text, minor_words) if text else ""
Contributor

Ditto

}
_articles[""] = [value for values in _articles.values() for value in values]

# Prepositions and conjunctions with 3 letters or less.
Contributor

Might also be helpful to say that these are words which aren't title-cased.

(str): The sort name of the first useful alias it finds. None if
none are found.

For example, "The Beatles" has alias "Le Double Blanc" with sort name
Contributor

@Sophist-UK Sophist-UK Jan 22, 2024

I got confused here as to why the 1960s' Hip Beat Combo "The Beatles" would be called "The Double White" in French, particularly considering that the group consisted of 4 white blokes not two. Then it clicked that you mean the nicknamed "White Album" which was officially entitled "The Beatles". To avoid anyone as stupid as me thinking the same, you might want to say 'For example, the album "The Beatles"...'.

@Sophist-UK
Contributor

@zas, can you enable workflow for @twodoorcoupe so that we can see his unit tests running?

@twodoorcoupe
Contributor Author

All the tweaks you suggested should be implemented now. Many thanks for all the tips.

@twodoorcoupe twodoorcoupe requested a review from zas January 23, 2024 10:19
Collaborator

@zas zas left a comment

Things look good to me, just a few minor comments.

@zas zas requested a review from phw January 23, 2024 13:07
Member

@phw phw left a comment

Code and functionality look good to me, thanks a lot for this plugin.

I wonder, though, whether the script functions should also try to use the language from the metadata, instead of using essentially all languages when none is explicitly given. The swapprefix in SortTagger already does this.

Or is there some specific reason not to do this?

@twodoorcoupe
Contributor Author

Ah, that makes sense, and now I remember it was actually proposed by @Sophist-UK in [PICARD-2779] as an enhancement of $swapprefix. I moved the logic for finding the languages from _swapprefix in SortTagger to a new function, find_languages, which is now also used by the script functions. Thank you @phw for the help.
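The fallback order discussed here (explicit languages, then the metadata's language, then everything) might be sketched as follows. This is an illustration only: the `AVAILABLE` codes, the plain-dict `metadata`, the `"language"` key, and the `;` separator are assumptions, and the function signature is not taken from the plugin.

```python
# Hypothetical codes for the six supported languages.
AVAILABLE = ("eng", "spa", "ita", "fra", "deu", "por")


def find_languages(explicit, metadata):
    """Pick languages: explicit arguments, then metadata, then all available."""
    def normalize(values):
        # Reduce each value to a lower-case 3-letter code and keep known ones.
        codes = [value.strip().lower()[:3] for value in values]
        return tuple(code for code in codes if code in AVAILABLE)

    chosen = normalize(explicit)
    if chosen:
        return chosen
    # Assumed tag name and separator for the track's or album's language.
    from_metadata = normalize(metadata.get("language", "").split(";"))
    if from_metadata:
        return from_metadata
    return AVAILABLE
```

Returning a tuple keeps the result usable as a cache key for the prefix and minor-word lookups.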

@twodoorcoupe twodoorcoupe requested a review from phw January 25, 2024 09:02
@Sophist-UK
Contributor

I think this looks good to go too.

Collaborator

@zas zas left a comment

LGTM

@Sophist-UK
Contributor

@twodoorcoupe Thank you again for all your efforts on this - including your willingness to rework it several times to satisfy the petty demands of grumpy old coders like me.

Member

@phw phw left a comment

Thanks a lot. Looks good to me as well. I'll merge and the plugin will be available from the website sometime in the next 12 hours.

@phw phw merged commit b4c4cc3 into metabrainz:2.0 Jan 27, 2024
6 checks passed
@twodoorcoupe twodoorcoupe deleted the enhanced_titles branch February 6, 2024 10:37
@phw phw added the new plugin label Mar 19, 2024