My fork of MALLET (MAchine Learning for LanguagE Toolkit), written in Java.
Updated to the latest version: 2.0.7 (2011-09-22).
The original Mercurial repository's history is preserved using hg-git.
-
Improved documentation
-
Pipes that derive features from single tokens by applying transformations. See the package cc.mallet.pipe.tsf.transform.
For example, TokenTransform can convert a token to lower case or map it to a morphological shape, such as U27k9 -> A11a1 (capital, digit, digit, lowercase, digit).
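
The two transformations mentioned above can be sketched in plain Java. This is only an illustration of the idea and does not use the classes in cc.mallet.pipe.tsf.transform: the class name ShapeSketch and the methods lower and shape are hypothetical, and the rule of leaving other characters unchanged is an assumption.

```java
import java.util.Locale;

// Hypothetical illustration only; not part of this fork's API.
class ShapeSketch {

    // Lower-case transformation, locale-independent.
    static String lower(String token) {
        return token.toLowerCase(Locale.ROOT);
    }

    // Shape transformation: upper case -> 'A', lower case -> 'a', digit -> '1'.
    // Any other character is copied unchanged (an assumption of this sketch).
    static String shape(String token) {
        StringBuilder sb = new StringBuilder(token.length());
        for (int i = 0; i < token.length(); i++) {
            char c = token.charAt(i);
            if (Character.isUpperCase(c)) {
                sb.append('A');
            } else if (Character.isLowerCase(c)) {
                sb.append('a');
            } else if (Character.isDigit(c)) {
                sb.append('1');
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(lower("U27k9")); // prints u27k9
        System.out.println(shape("U27k9")); // prints A11a1
    }
}
```

In a pipe, the transformed string would typically be added as a feature of the token rather than replacing its text, so that several transformations can contribute features to the same token.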