Tag
Wannaphong Phatthiyaphaibun edited this page Feb 4, 2022
·
3 revisions
LaoNLP supports:
- pos_tag: part-of-speech tagging
pos_tag(
    words: List[str],
    engine: str = "perceptron",
    corpus: str = "SeqLabeling"
)
The following Lao corpora are supported:
- SeqLabeling: corpus from https://github.com/FoVNull/SeqLabeling
- yunshan_cup_2020: corpus from https://github.com/GKLMIP/Yunshan-Cup-2020
The training notebook is available at https://github.com/wannaphong/LaoNLP-Notebook.
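The corpus is chosen through the `corpus` argument. A minimal sketch of a wrapper that checks the name against the two corpora listed on this page before calling the tagger. `tag_with_corpus` and `SUPPORTED_CORPORA` are hypothetical names, not part of LaoNLP, and the tagger is passed in as an argument, so the sketch does not assume how LaoNLP itself handles an unknown corpus name.

```python
from typing import Callable, List, Tuple

# Corpus names listed on this page (assumption: the list is complete).
SUPPORTED_CORPORA = ("SeqLabeling", "yunshan_cup_2020")

# Any callable with the same shape as pos_tag(words, engine=..., corpus=...).
Tagger = Callable[..., List[Tuple[str, str]]]


def tag_with_corpus(
    words: List[str], corpus: str, tagger: Tagger
) -> List[Tuple[str, str]]:
    """Hypothetical helper: validate the corpus name, then call the tagger."""
    if corpus not in SUPPORTED_CORPORA:
        raise ValueError(
            f"unknown corpus {corpus!r}; expected one of {SUPPORTED_CORPORA}"
        )
    # "perceptron" is the default engine shown in the signature above.
    return tagger(words, engine="perceptron", corpus=corpus)
```

With LaoNLP installed, the call would look like `tag_with_corpus(sent, "yunshan_cup_2020", pos_tag)`, where `pos_tag` is imported from `laonlp.tag`.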
Example
from laonlp.tokenize import word_tokenize
from laonlp.tag import pos_tag

# Tokenize the sentence, then tag the tokens with the default engine and corpus.
sent = word_tokenize("ພາສາລາວໃນປັດຈຸບັນ.")
pos_tag(sent)
# output: [('ພາສາລາວ', 'N'), ('ໃນ', 'PRE'), ('ປັດຈຸບັນ', 'ADJ'), ('.', 'PUNCT')]
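
The result is a plain list of (word, tag) tuples, so it can be processed with ordinary Python. A minimal sketch, using the output above as literal data so that it runs without LaoNLP installed. `words_by_tag` is a hypothetical helper name, not a LaoNLP function.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Tuple

# The output shown in the example above, copied as literal data.
tagged: List[Tuple[str, str]] = [
    ('ພາສາລາວ', 'N'), ('ໃນ', 'PRE'), ('ປັດຈຸບັນ', 'ADJ'), ('.', 'PUNCT')
]


def words_by_tag(pairs: List[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Hypothetical helper: group words under their tag, keeping input order."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for word, tag in pairs:
        groups[tag].append(word)
    return dict(groups)


# How often each tag occurs in the sentence.
tag_counts = Counter(tag for _, tag in tagged)

# Drop punctuation tokens and keep only the words.
content_words = [word for word, tag in tagged if tag != "PUNCT"]
```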