Question: Is "n-best" tagging possible with CRFSuite? #84

Open
wrznr opened this issue Jun 30, 2017 · 3 comments

Comments

@wrznr

wrznr commented Jun 30, 2017

The Wapiti CRF toolkit has a neat feature called N-best Viterbi output which returns the n-best label sequences for an input sequence. Is there a similar functionality in crfsuite?

Thanks for your hints!

@usptact

usptact commented Jun 30, 2017

CRFSuite does not support n-best output. Its decoder uses the Viterbi algorithm, which does not appear too difficult to extend to n-best decoding (especially for short sequences).

Did you manage to get meaningful n-best outputs with Wapiti on your data? I looked at it a while ago and found that on my data (NER) the n-best outputs do not always make sense.
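
To illustrate the remark that Viterbi can be extended to n-best decoding, here is a minimal sketch in plain Python. It is not CRFSuite code and does not use any CRFSuite API: the function name `nbest_viterbi` and the inputs `emit` (per-position label log-scores) and `trans` (label-to-label log-scores) are assumptions made for the example, and the sequence is assumed to be non-empty. Instead of one best partial path per label and position, it keeps the best n.

```python
import heapq


def nbest_viterbi(emit, trans, n):
    """Return up to n highest-scoring label sequences as (score, path) pairs.

    emit[t][j]  : log-score of label j at position t (hypothetical layout)
    trans[i][j] : log-score of moving from label i to label j
    """
    T, L = len(emit), len(trans)
    # beams[t][j] holds up to n entries (score, prev_label, prev_rank),
    # sorted best first, for partial paths that end in label j at position t.
    beams = [[[(emit[0][j], None, None)] for j in range(L)]]
    for t in range(1, T):
        column = []
        for j in range(L):
            # Extend every kept hypothesis of every previous label to label j.
            candidates = (
                (score + trans[i][j] + emit[t][j], i, rank)
                for i in range(L)
                for rank, (score, _, _) in enumerate(beams[t - 1][i])
            )
            column.append(heapq.nlargest(n, candidates, key=lambda c: c[0]))
        beams.append(column)

    # Select the n best complete hypotheses across all final labels.
    finals = heapq.nlargest(
        n,
        (
            (entry[0], j, rank)
            for j in range(L)
            for rank, entry in enumerate(beams[-1][j])
        ),
        key=lambda c: c[0],
    )

    # Follow the (label, rank) back-pointers to recover each path.
    results = []
    for score, j, rank in finals:
        path = []
        for t in range(T - 1, -1, -1):
            path.append(j)
            _, j, rank = beams[t][j][rank]
        results.append((score, path[::-1]))
    return results
```

With n = 1 this reduces to ordinary Viterbi decoding. The cost grows roughly linearly in n on top of the usual T·L² work, which fits the observation that it is most practical for short sequences.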

@ZmeiGorynych

How about looking at the marginal probabilities for all possible labels at a given position and picking the best n values? That functionality exists in the Python wrapper as pycrfsuite.Tagger.marginal(), so I presume it also exists in CRFSuite itself.
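
A minimal sketch of this suggestion, assuming a python-crfsuite `Tagger` that already has a model opened. The helper name `top_n_labels_per_position` is hypothetical; it relies only on the tagger's `set()`, `labels()` and `marginal(y, t)` methods, so any object with the same three methods works.

```python
def top_n_labels_per_position(tagger, xseq, n):
    """For each position, list the n labels with the highest marginal probability.

    `tagger` is assumed to behave like pycrfsuite.Tagger with a model opened.
    Returns one list per position, each containing (label, probability) pairs.
    """
    tagger.set(xseq)  # load the item sequence so marginal() refers to it
    labels = tagger.labels()
    ranked = []
    for t in range(len(xseq)):
        scored = [(tagger.marginal(y, t), y) for y in labels]
        scored.sort(reverse=True)  # highest marginal first
        ranked.append([(y, p) for p, y in scored[:n]])
    return ranked
```

Note that this ranks labels independently at each position; it does not return ranked label sequences.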

@usptact

usptact commented Jun 12, 2018

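@ZmeiGorynych Unfortunately, marginals are not enough to compute the n-best sequence taggings: they describe each position on its own and do not say how labels at different positions combine into a sequence.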
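
A small made-up example shows why. The joint distribution below is purely hypothetical (it does not come from any model or dataset), and the helper names are assumptions for the sketch. It compares the true ranking of sequences with the ranking obtained by multiplying per-position marginals.

```python
# Hypothetical joint distribution over label sequences of length 2.
joint = {
    ("A", "A"): 0.40,
    ("B", "B"): 0.35,
    ("A", "B"): 0.25,
    ("B", "A"): 0.00,
}


def position_marginals(joint):
    """Sum the joint probabilities into one label distribution per position."""
    length = len(next(iter(joint)))
    margs = [dict() for _ in range(length)]
    for seq, p in joint.items():
        for t, y in enumerate(seq):
            margs[t][y] = margs[t].get(y, 0.0) + p
    return margs


def rank_by_joint(joint):
    """True n-best order: sequences sorted by their joint probability."""
    return sorted(joint, key=joint.get, reverse=True)


def rank_by_marginal_product(joint):
    """Order obtained when only per-position marginals are available."""
    margs = position_marginals(joint)

    def score(seq):
        s = 1.0
        for t, y in enumerate(seq):
            s *= margs[t][y]
        return s

    return sorted(joint, key=score, reverse=True)


print(rank_by_joint(joint))
print(rank_by_marginal_product(joint))
```

Here the true order is AA, BB, AB, BA, while the marginal-based order starts with AB, which is only the third most probable sequence. The marginals (A: 0.65, B: 0.35 at the first position; A: 0.40, B: 0.60 at the second) are the same for many different joint distributions, so the sequence ranking cannot be recovered from them.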
