Morphosyntax: avoid requesting unavailable MWT models #10

Open
wants to merge 1 commit into
base: master


antecedent

Hi!

I tried the morphotag pipeline on Lithuanian (lit or lt) and ran into Stanza's UnsupportedProcessorError with the following message:

Processor mwt is not known for language lt. If you have created your own model, please specify the mwt_model_path parameter when creating the pipeline.

Lithuanian is certainly not the only language that will trigger this error without good reason, so could I propose checking Stanza's resources.json instead of manually maintaining a list of every language without an MWT model?

Note: for the purposes of this PR, the check runs "alongside" the manually curated list rather than "instead of" it, because I could not safely assume that all of the MWT models deemed unnecessary here are actually absent from Stanza's standard repository. Perhaps there were unrelated reasons to exclude them?
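
A minimal sketch of such a check, for illustration only and not the code in this PR. It assumes that the parsed resources.json is a dict keyed by language code, that each language entry has one key per available processor (for example "tokenize" or "mwt"), and that alias entries look like {"alias": "<other code>"}. The names `TOY_RESOURCES`, `resolve_alias`, `has_processor`, and `pick_processors` are hypothetical, and the toy data is made up rather than copied from Stanza.

```python
from typing import Any, Dict, FrozenSet, List

# Toy stand-in for a parsed resources.json (assumed layout, invented contents).
TOY_RESOURCES: Dict[str, Dict[str, Any]] = {
    "lt": {"lang_name": "Lithuanian", "tokenize": {}, "pos": {}, "lemma": {}},
    "fr": {"lang_name": "French", "tokenize": {}, "mwt": {}, "pos": {}, "lemma": {}},
    "fra": {"alias": "fr"},
}


def resolve_alias(resources: Dict[str, Dict[str, Any]], lang: str) -> str:
    """Follow alias entries until a real language entry (or a cycle) is reached."""
    seen = set()
    while lang in resources and "alias" in resources[lang] and lang not in seen:
        seen.add(lang)
        lang = resources[lang]["alias"]
    return lang


def has_processor(resources: Dict[str, Dict[str, Any]], lang: str, processor: str) -> bool:
    """Return True if the resources dict lists `processor` for `lang`."""
    entry = resources.get(resolve_alias(resources, lang), {})
    return processor in entry


def pick_processors(
    resources: Dict[str, Dict[str, Any]],
    lang: str,
    wanted: List[str],
    skip_mwt: FrozenSet[str] = frozenset(),
) -> List[str]:
    """Keep only processors that exist for `lang`.

    `skip_mwt` plays the role of the manually curated list: MWT is dropped
    for those languages even if a model is listed.
    """
    return [
        p
        for p in wanted
        if has_processor(resources, lang, p)
        and not (p == "mwt" and lang in skip_mwt)
    ]
```

With the toy data, `pick_processors(TOY_RESOURCES, "lt", ["tokenize", "mwt", "pos"])` drops "mwt" for Lithuanian, while the same call for "fr" keeps it unless "fr" is in `skip_mwt`.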

@Jemoka
Member

Jemoka commented Nov 2, 2024

Thanks! This is super helpful. Will merge in a few days after testing.

2 participants