simplemma-0.4.0

@adbar released this 19 Oct 16:44
  • new languages: Armenian, Greek, Macedonian, Norwegian (Bokmål), and Polish
  • language data reviewed for: Dutch, Finnish, German, Hungarian, Latin, Russian, and Swedish
  • Urdu removed from the language list due to issues with the data
  • add support for Python 3.10 and drop support for Python 3.4
  • improved decomposition and tokenization algorithms