Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for local lyric uri #53

Open
wants to merge 6 commits into
base: master
Choose a base branch
from

Conversation

sanderr
Copy link

@sanderr sanderr commented Jul 29, 2024

The current local lyric provider is a bit naive in its searching. I believe this is a known issue, for which the proper solution might be quite complex. This PR proposes an easy win to cover what I assume is a common case.

I believe that when the local lyric source is used, it will often be used in conjunction with a local music library (this is my personal set up, and it currently doesn't work very well with sptlrx). Therefore, it seems worthwhile to attempt to fetch the lyrics from a file with the same name as the local music file (if there is any), with the extension replaced by .lrc. If no such file exists, we would fall back to the current mechanism.

This PR implements this behavior for the MPD and MPRIS sources. To this end it contains a slight refactor to encapsulate the lyrics providers' input in a TrackMetadata struct. An additional Uri field is added there, which will be set if a local file is known for the currently playing track. The local lyric provider will then use this Uri to attempt to fetch an associated .lrc file.

I should disclaim that I have no prior experience with Go, so I'm unfamiliar with its conventions. I tried to infer from the rest of the code base, but please feel free to point out any code smells.

Additional motivation in #52 and #40.

@sanderr sanderr changed the title Local lyric uri Add support for local lyric uri Jul 30, 2024
@YanceyChiew
Copy link
Contributor

Apart from the minor issues mentioned above, the logic of automatically selecting the lyrics path based on the uri works just fine.

However, if we want to make this function that can obtain lyrics without setting a lyrics directory take effect, we need to fill in a valid lyrics directory first, preferably an empty directory to reduce unnecessary overhead. Maybe we need a more intuitive configuration method.

In addition, the current code directly obtains the path from the URL as the uri, and will obtain a null value for non-local paths. If the music files are hosted on servers such as dlna or smb, these servers may also provide lyrics. Currently, only local.go provides URL-based lyrics, but retaining the original URL in the globally used structure may be more conducive to expanding other lyrics providers in the future.

The author of this project said that he is considering a large-scale rewrite of the project. Maybe you can ask the author about his plans first so that the submissions can be better migrated to the new version.

@sanderr
Copy link
Author

sanderr commented Aug 4, 2024

Thank you for your feedback.

However, if we want to make this function that can obtain lyrics without setting a lyrics directory take effect, we need to fill in a valid lyrics directory first, preferably an empty directory to reduce unnecessary overhead. Maybe we need a more intuitive configuration method.

I'm not sure I understand what you're saying. Do you mean that the current implementation still requires (and fals back to) a local lyric directory to be set, even if the local lyric source is only used for this new "feature", and you're suggesting to change that?

retaining the original URL in the globally used structure may be more conducive to expanding other lyrics providers in the future.

Very good point, I can make that change.

The author of this project said that he is considering a large-scale rewrite of the project. Maybe you can ask the author about his plans first so that the submissions can be better migrated to the new version.

Another good point. Though aa I understood this the big plans are for the long run and I believe this refactor could be considered sufficiently small that it could just fill the gap in the current implementation without tailoring to the big rework. But perhaps it could be taken into account. @raitonoberu do you have any comments on this aspect?

@YanceyChiew
Copy link
Contributor

Do you mean that the current implementation still requires (and fals back to) a local lyric directory to be set, even if the local lyric source is only used for this new "feature", and you're suggesting to change that?

Yeah, that's exactly what I mean. local.go:findFile will only execute if the lyrics directory is set to a valid path.

It would be great if the lyrics could be found without requiring the user to configure the lyrics directory, or simply with a description like "auto". However, if the user has manually configured the lyrics directory, but the finally successfully returned lyrics path has nothing to do with it, which is counterintuitive.

@raitonoberu
Copy link
Owner

Sorry for the late responses. What you are doing here is pretty much what I would do myself, and I love it. It's a good short term solution so we can do a release without waiting too long :)

As part of the rewrite, I plan to collect all the metadata possible from the player, and then let a bunch of lyrics providers deal with it. So, if local provider is enabled and we got a URI, try searching for the .lrc file, try extracting the embedded lyrics, etc. Nothing? Fallback to lrclib or other providers. And they should all be fairly customizable. Guess I'll do some sort of design review so you guys can get involved and stop me from making bad decisions.

@sanderr
Copy link
Author

sanderr commented Aug 5, 2024

Glad to hear you like it. I'd still like to implement the suggestions @YanceyChiew made, but I might not have much time in the next two weeks or so so this will likely be stale during that time.

Looking forward to the design of the rewrite. If you aim to give multiple sources a go with the same track, perhaps a confidence score or something could be useful? It may be difficult to define for some sources, but e.g. a local lrc file with matching name or an embedded lyric would be close to 100%, an lrclib match would be pretty good because it takes the track length into account and could perhaps scale based on the amount of available metadata, ... Just a brainstorm-grade idea, it may or may not prove useful in practice.

@sanderr
Copy link
Author

sanderr commented Aug 25, 2024

@YanceyChiew I processed your comments. I opted for a simple enabled flag on the local source and I expanded the readme on it a bit. I started out with an auto field as you suggested, but while implementing it I found it presented more complexity to the end user (the two modes sort of mesh in the sense that setting a folder also allows for exact-matching relative paths, making it more than a simple binary option), with little gain (the exact-match search seems like an always-safe first try to me).

@raitonoberu I think the PR is ready for final review now.

@YanceyChiew
Copy link
Contributor

the two modes sort of mesh in the sense that setting a folder also allows for exact-matching relative paths, making it more than a simple binary option), with little gain (the exact-match search seems like an always-safe first try to me

I still think that the priority(services/local/local.go:findFile) of the lyrics directory manually specified by the user should not be overridden by the path automatically inferred based on the URL.

However, I can also understand if the manually specified directory is regarded as a fallback method. At present, in terms of configuration logic, there will no longer be a situation where a user specifically specifies a directory but enables the function that ignores this directory first. This should be enough.

————

In addition, I mentioned before that there is a bug in services/local/local.go:33, which will cause the lyrics directory specified by the absolute path to be unable to create an Index, and cause sptlrx to exit with an error, but it seems that the new commit did not fix it.

@sanderr
Copy link
Author

sanderr commented Aug 27, 2024

I can also understand if the manually specified directory is regarded as a fallback method.

Indeed, that's how I see it. I can't think of a case where the automatically inferred path would not be preferable over the hit-or-miss search. But more than a fallback, it also serves as a root for any players that report a relative path, so it serves a purpose in both modes. But I have to confess I'm not 100% satisfied with the way it's set up now. It is however the best of the options I could come up with. If you have a concrete alternative in mind I'll definitely consider it.

I mentioned before that there is a bug in services/local/local.go:33

I will have a look at it, thanks for pointing it out. I don't see it anywhere earlier in the thread though, is it possible you accidentally deleted part of your initial comment? It starts with "Apart from the minor issues mentioned above". I assumed this referred to the PR description, but perhaps there's just a part missing? Are there any other pieces of feedback I missed?

@sanderr
Copy link
Author

sanderr commented Aug 27, 2024

@YanceyChiew oh wow, that (bug) was a stupid one. Fixed it in 867a183.

@YanceyChiew
Copy link
Contributor

I will have a look at it, thanks for pointing it out. I don't see it anywhere earlier in the thread though, is it possible you accidentally deleted part of your initial comment?

Oh, sorry, I just realized that I didn't send it successfully🤣.

————

if track.Uri != "" {
var absUri string
if filepath.IsAbs(track.Uri) {
// Uri is already absolute
absUri = track.Uri
} else if c.folder != "" {
// Uri is relative to local music directory
absUri = filepath.Join(c.folder, track.Uri)
} else {
// Can not handle relative uri without folder configured
absUri = ""
}
if absUri != "" {
absLyricsUri := strings.TrimSuffix(absUri, filepath.Ext(absUri)) + ".lrc"
if _, err := os.Stat(absLyricsUri); err == nil {
return absLyricsUri
}
}
}

I haven't compiled this code yet, but it looks like there's a glitch.

This } else if c.folder != "" { blocks the first exact match of a music file with an absolute path but whose lyric placed in the user-specified lyrics directory.

It's not a big problem, and the next mechanism based on splitting words and scoring will solve it, but if exact matching is done within the lyrics folder based on it's basename, it will provide a more unified experience.

In addition, c.index already contains the paths of all lrc files in the lyrics directory, which can be directly compared without os.Stat(). This is one of the reasons why I think its priority should be higher than the path inferred based on the song Url.

@sanderr
Copy link
Author

sanderr commented Aug 27, 2024

This } else if c.folder != "" { blocks the first exact match of a music file with an absolute path but whose lyric placed in the user-specified lyrics directory.

Could you elaborate on that? I'm not sure I understand what you're saying. As I see it (and I may be missing something), there are the following possible scenarios:

  1. We know the exact path of the music file (e.g. mpris source) -> trivial, we just check for associated lrc file. Doesn't matter if the path is below the configured folder or not since its an absolute (full) path.
  2. We have a relative path (e.g. mpd). We assume (and hope) that it's relative to the configured directory, so we check if an lrc file exists there.
  3. We have no path information -> fall back to the search-based approach

The scenario where we have a full path that happens to be in the user-specified directory, then falls under scenario 1, as modeled in the if-else statement. Am I missing something here?

Side note: the snippet you linked is not the latest commit, FYI. It is functionally the same though, so your comment still applies.

In addition, c.index already contains the paths of all lrc files in the lyrics directory, which can be directly compared without os.Stat(). This is one of the reasons why I think its priority should be higher than the path inferred based on the song Url.

Again, I'm not certain I understand what you're suggesting. Do you mean you'd prefer to compare the uri derived from track.Uri with the index to check if the file exists, rather than to call os.Stat()?

If so, while true that the index already contains all lrc file information, it is simply a list (/ array / whatever it's called in Go) of struct pointers. Checking if it contains an lrc file is O(n), and less readable imo than a simple os.Stat(). I could refactor it, but I don't think it should be tailored specifically to this one use case when there's such a simple and robust alternative.

If this is not what you were suggesting, could you elaborate on this as well please?

@YanceyChiew
Copy link
Contributor

there are the following possible scenarios:

  1. We know the exact path of the music file (e.g. mpris source) -> trivial, we just check for associated lrc file. Doesn't matter if the path is below the configured folder or not since its an absolute (full) path.
  2. We have a relative path (e.g. mpd). We assume (and hope) that it's relative to the configured directory, so we check if an lrc file exists there.
  3. We have no path information -> fall back to the search-based approach

We can also assume another situation, that is, we obtain an absolute path, but there is no lrc file associated with it in the directory of that path. Instead, there is a separate directory for these lrc files.

At this time, it will skip the processing of part 2, and the check of os.Stat will fail because the assumed absUri is invalid, and finally fall back to part 3 to perform comparison based on title keywords.

My suggestion here is that if the lrc path estimated based on the absolute path does not exist, the path should be downgraded to basename so that it has the opportunity to participate in the processing of part 2.

The scenario where we have a full path that happens to be in the user-specified directory, then falls under scenario 1, as modeled in the if-else statement. Am I missing something here?

You seem to assume that the user sets the lyrics directory to be the same as the song directory, which is sometimes the case, but it can also happen that there are multiple different song directories, and a separate lyrics directory.

Now that you've written the function to exactly match lyrics files based on the path, which is great, why not also cover this exact matching to the independent lyrics directory?


Do you mean you'd prefer to compare the uri derived from track.Uri with the index to check if the file exists, rather than to call os.Stat()?

It's just a reminder that there is reusable data to reduce interaction with the file system. If you don't like it, please don't mind. 😁

@sanderr
Copy link
Author

sanderr commented Aug 28, 2024

Thank you for the clarification. I think I understand what you mean now. I believe I was not making my intentions (or assumptions sufficiently clear). My approach was to play it safe in case of uncertainty. I was not making any assumptions about the music and lyric directories, only accepting the reality that we can't know.

As I saw it, when we have an absolute path, we have no choice but to and hope that the lyrics are at that same path, because we have no basis to reason on where to cut it off. e.g. for a music file at ~/music/my-favorite-artist/an-album/track.mp3 and a configured lyric directory of ~/lyrics, should we try ~/lyrics/track.mp3, ~/lyrics/an-album/track.mp3, ...?

However, if I understand correctly, you're saying let's try ~/lyrics/track.mp3 because if we're lucky the user has unambiguous file names and a flat lyrics directory. Is that right?

I still see the current implementation as the defensive approach. To stick to it is exactly what makes it an always-safe override of the hit-or-miss search. This is because the assumptions it makes err on the side of caution. That would no longer be the case if we would try for the basename in the configured lyrics directory.

That said, your proposal has some merit, and my argument against it really only applies to doing it by default. How would you feel about a (well-documented) config setting local.matchFileName: bool to control whether or not we fall back to basename-matching in the root of the lyrics directory? Or perhaps even local.partialMirror: bool (tentative name), where we would exhaustively try ~/lyrics/home/user/music/my-favorite-artist/an-album/track.mp3, ~/lyrics/user/music/my-favorite-artist/an-album/track.mp3, ~/lyrics/music/my-favorite-artist/an-album/track.mp3, ~/lyrics/my-favorite-artist/an-album/track.mp3, ... until (if) we find a file that exists? Or option 3: local.musicRootFolder (perhaps even a list), which would be ~/music/ in the example above. Then we would know to cut off there and try the remaining relative path versus the lyrics directory.

I personally have a preference for option 2 here. It's more powerful than option 1. It's more dynamic than option 3 but it's probably less understandable to the end user, and it may lead to marginally more false positives. But on the other hand, it doesn't require the user to mimic the directory structure of their music directory in their lyric directory, as option 3 does. I do structure my lyric directory like that, but I have no clue if other users do as well, so perhaps best not to make that assumption. Option 2 is more flexible in that regard.

@YanceyChiew
Copy link
Contributor

However, if I understand correctly, you're saying let's try ~/lyrics/track.mp3 because if we're lucky the user has unambiguous file names and a flat lyrics directory. Is that right?

yes that's what i mean.

In my understanding, users usually organize lyrics directories in two ways: directly putting lyrics and songs together, and then specifying the common parent directory of the songs as the lyrics directory, or setting up a separate flat lyrics directory. This is more in line with the usage habits of some well-known players that support lyrics display.

How would you feel about a (well-documented) config setting local.matchFileName: bool to control whether or not we fall back to basename-matching in the root of the lyrics directory?

I'm leaning toward this option. We may have multiple song directories, but only one lyrics directory, which makes it difficult to organize the lyrics directory hierarchy accurately.

Option 2 seems to cover more complex situations, but the method is more radical. Usually if the user organizes his lyrics according to the directory hierarchy, option 3 may be enough.

Consider such a use case: the user's local player includes songs from his own directory ~/music, from the directory /media/<user>/<uuid>/music mounted by udisks, and from the directory that requires root permissions to mount, paths such as /usr/local/share/music, and song directories on the NAS via smb, dlna and webdav shares...[*1]

Such a situation is slightly more complex but entirely possible, and both options 2 and 3 (with a directory list) can solve the problem, but users may not want to organize their own lyrics directory hierarchy for this situation at all.

If it is really necessary, it may be more cost-effective to add another Basename field to c.index for comparison.


[*1] for song directories on the NAS via smb, dlna and webdav shares...

if uri.Scheme != "file" && uri.Scheme != "" {
return ""
}

I suggest here that for URLs whose scheme is not file://, first urldecode it, and then obtain its basename as a local variable, so that it also has the opportunity to participate in matching based on the file name in the lyrics directory.

@sanderr
Copy link
Author

sanderr commented Aug 28, 2024

Thanks, that's valuable input.

Concrete proposal based on the tradeoffs mentioned above:

  • local.matchMusicFilename: bool controls whether we fall back to flat identical filename matching in the lyrics directory (option 1 above + your suggestion in the final part of your latest comment)
  • local.matchPartialMusicPath: bool controls whether we also try path matching (option 2 above + again your suggestion). In practice, the other option is ignored when this one is enabled because it is strictly wider. TLDR: scratch this one, let's keep it simple until we're sure we need it to be more powerful. e.g. for my own library, the lyrics are hierarchical (next to the music files), but I also see no immediate scenario where I'd use a player with an absolute path that's outside of the lyrics directory, so options 2 or 3 would be overkill for me. I do see scenarios with a relative path, e.g. mpd on a remote host, but that's already covered by the PR.
  • if we're adding options to toggle behavioral modes, we might as well add local.searchBestMatch: bool to toggle the final fallback behavior

All options default to true. I'd make sure to add examples to the documentation.

This way we give the user some flexibility to tailor the behavior to their setup, without exposing the full cross product of scenarios to them.

Wdyt? @raitonoberu what's your opinion on all this? Do you think it's a good idea, or should we stick to the current implementatio for this PR and defer the rest?

FYI, I'll have another period (1-2 weeks) of lower availability soon, so it may (or may not) hang for a bit.

@raitonoberu
Copy link
Owner

Hello guys. I have a couple of questions:

  1. Do we really need to support relative paths? And most importantly, relative to what? As I understand it, the user specifies a root directory for MPD to index all the music inside, and MPD sends us paths relative to that directory. Why not ask the user to manually specify that directory in the MPD configuration? (or maybe even ask MPD? is it possible to “query” the root directory?)
  2. Isn't the “flat directory” scenario already covered by the current, fuzzy implementation? I honestly don't see why we have to implement pretty much the same thing (?)
  3. (My only concern with the code) In what cases do we expect *player.TrackMetadata in the lyrics provider to be nil? Maybe we should pass it by value instead (non-nullable)?

Plus a couple of comments:

  • I think it's fine to use os.Stat(), since we request lyrics quite rarely.
  • It would be great if you could add the same functionality for mopidy :)

@sanderr
Copy link
Author

sanderr commented Aug 29, 2024

  1. I think that could be acceptable, though mpd might be running on a remote host, in which case that feels a bit out of place so overall I prefer to configure it in the local source. For my personal use case (which I realize doesn't have to be supported if it's not commonplace), I have an exact mirror of my music collection both on my media center and on my laptop. When mpd is playing in the living room, I'd like to be able to use my local lyrics, so mpd's relative path would be applied relative to my configured lyrics directory. I'd love for this part to remain as is in the PR, but I understand if you see it differently.
  2. The fuzzy search is quite limited at the moment. Iirc it simply counts words. Words like "the" get the same weight as actually relevant ones, and short titles are difficult to match. It also makes heavy assumptions about naming, e.g. if you name files '{album} - {title}' it doesn't work very well. For my library, it more often than not picks an incorrect file. The exact filenam / partial path matching are more flexible, as long as the user has the same naming convention for lyric as for music files.
  3. Good point. That's more my experience with Go than it is a concious decision. Will fix.
  4. I'll add mopidy support

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants