simplemma-1.0.0
Extensive refactoring by @juanjoDiaz:
- Series of modular classes
- Different lemmatization strategies available
- Customization of dictionary loading and handling (`DictionaryFactory`)
- `LanguageDetector` class with extended options
- See readme and detailed documentation
Breaking changes:
- The `extensive` argument is now `greedy`
- The `langdetect` submodule is now `language_detector`:
  `from simplemma.langdetect import ...` → `from simplemma.language_detector import ...`
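
For code that has to run against both older releases and 1.0.0, the two renames above can be bridged with a small shim. The following is only a sketch: `upgrade_kwargs` and `import_detector_module` are hypothetical helper names, not part of simplemma, and the sketch assumes nothing beyond the two renames listed in these notes.

```python
import importlib
from types import ModuleType
from typing import Any, Dict, Optional


def upgrade_kwargs(kwargs: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: rename the pre-1.0 `extensive` keyword to `greedy`."""
    upgraded = dict(kwargs)  # work on a copy, leave the caller's dict untouched
    if "extensive" in upgraded:
        if "greedy" in upgraded:
            raise TypeError("pass either 'extensive' or 'greedy', not both")
        upgraded["greedy"] = upgraded.pop("extensive")
    return upgraded


def import_detector_module() -> Optional[ModuleType]:
    """Hypothetical helper: import the detector submodule under the new name,
    falling back to the old one. Returns None if neither can be imported."""
    for name in ("simplemma.language_detector", "simplemma.langdetect"):
        try:
            return importlib.import_module(name)
        except ImportError:
            continue
    return None


if __name__ == "__main__":
    print(upgrade_kwargs({"lang": "de", "extensive": True}))
    detector = import_detector_module()
    print("detector module:", detector.__name__ if detector else "not installed")
```

The upgraded dictionary can then be unpacked into whichever call previously took `extensive`. New code that only targets 1.0.0 and later does not need the shim and can use `greedy` and `simplemma.language_detector` directly.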
Fixes and improvements:
- `is_known()` function now restored to its state in v0.9.0 (full dictionary)
- More languages and better rules (with @juanjoDiaz)
- Use binary strings in dictionaries to save memory
- Dictionary sort before compression by @1over137
Documentation:
- Classes and general doc pages by @juanjoDiaz
- Section on classes in the readme by @osma