simplemma-1.0.0

@adbar released this 31 May 10:21
6860df6

Extensive refactoring by @juanjoDiaz:

  • Series of modular classes
  • Different lemmatization strategies available
  • Customization of dictionary loading and handling (DictionaryFactory)
  • LanguageDetector class with extended options
  • See readme and detailed documentation

Breaking changes:

  • The extensive argument has been renamed to greedy
  • The langdetect submodule has been renamed to language_detector
    from simplemma.langdetect import ... → from simplemma.language_detector import ...
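
The two renames above can be applied to existing code mechanically. The following is a minimal sketch, not part of simplemma: `migrate_source` is a hypothetical helper, and the sample call to `lang_detector` with `extensive=True` is only an illustration of pre-1.0 usage. The keyword rename is purely textual, so it would also touch an unrelated `extensive=` keyword in the same file; review the result before committing.

```python
import re

# Old submodule path -> new submodule path (see the breaking changes above).
_IMPORT_PATTERN = re.compile(r"\bsimplemma\.langdetect\b")
# Old keyword argument -> new keyword argument; (?!=) avoids matching "==".
_KWARG_PATTERN = re.compile(r"\bextensive(\s*)=(?!=)")


def migrate_source(source: str) -> str:
    """Rewrite pre-1.0 simplemma usages in a source string (hypothetical helper)."""
    source = _IMPORT_PATTERN.sub("simplemma.language_detector", source)
    source = _KWARG_PATTERN.sub(r"greedy\1=", source)
    return source


# Illustrative pre-1.0 snippet; the exact call is an assumption.
old_code = (
    "from simplemma.langdetect import lang_detector\n"
    'scores = lang_detector(text, lang=("de", "en"), extensive=True)\n'
)
print(migrate_source(old_code))
```

The helper only works on strings, so it can be wrapped in a loop over project files or run on a single module at a time.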

Fixes and improvements:

  • is_known() function restored to its v0.9.0 behavior (full dictionary)
  • More languages and better rules (with @juanjoDiaz)
  • Use binary strings in dictionaries to save memory
  • Dictionary sort before compression by @1over137
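
As a rough illustration of why byte strings can save memory, the sketch below compares the in-memory size of a few words stored as `str` and as UTF-8 encoded `bytes`. It does not reproduce simplemma's dictionary loading code; the word list is made up, and the measurement relies on `sys.getsizeof`, whose results are specific to CPython and vary between versions.

```python
import sys

# Made-up sample entries; real dictionaries hold far more items.
words = ["Häuser", "houses", "maisons", "дома"]

# Size of each word as a Python str object.
str_total = sum(sys.getsizeof(w) for w in words)
# Size of each word as a UTF-8 encoded bytes object.
bytes_total = sum(sys.getsizeof(w.encode("utf-8")) for w in words)

print(f"str objects:   {str_total} bytes")
print(f"bytes objects: {bytes_total} bytes")
```

On CPython, `bytes` objects carry less per-object overhead than `str` objects, so the difference adds up across the many entries of a dictionary.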

Documentation:

  • Classes and general doc pages by @juanjoDiaz
  • Section on classes in the readme by @osma