John Snow Labs Spark-NLP 3.1.0: Over 2600+ new models and pipelines in 200+ languages, new DistilBERT, RoBERTa, and XLM-RoBERTa transformers, support for external Transformers, and lots more! #5671
maziyarpanahi
Overview
We are very excited to release Spark NLP 🚀 3.1.0! This is one of our biggest releases, with many new models and pipelines and the groundwork for future features that we are proud to share with our community.
Spark NLP 3.1.0 comes with more than 2600 new pretrained models and pipelines in more than 200 languages, new DistilBERT, RoBERTa, and XLM-RoBERTa annotators, support for HuggingFace 🤗 (Autoencoding) models in Spark NLP, and extended support for new Databricks and EMR instances.
As always, we would like to thank our community for their feedback, questions, and feature requests.
Major features and improvements
… bert-base-uncased, runs 60% faster while preserving over 95% of BERT’s performance
… saved_model feature in HuggingFace within a few lines of code and import any BERT, DistilBERT, RoBERTa, and XLM-RoBERTa models to Spark NLP. We will work on the remaining annotators and extend this support to the rest with each release. For more information, please visit this discussion
… Tokenizer or RegexTokenizer and generates token pieces, encodes, and decodes the results
Models and Pipelines
Spark NLP 3.1.0 comes with more than 2600 new pretrained models and pipelines in more than 200 languages, available for Windows, Linux, and macOS users.
Featured Transformers
[Table of featured transformer models; only the language column survives: nl, de, de, it, it, tr, tr, zh, zh, en, en, xx, en, en, en, xx, xx]
Featured Translation Models
[Table of 13 featured translation models; only the language column survives, and every row reads xx]
Transformers in Spark NLP
Import hundreds of models in different languages to Spark NLP
The complete list of all 3700+ models & pipelines in 200+ languages is available on Models Hub.
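The HuggingFace import mentioned under Major features could look roughly like the sketch below. It is an illustration under stated assumptions, not the project's documented procedure: the helper names, the example model name bert-base-cased, and the output paths are hypothetical, and the step that copies vocab.txt into the SavedModel's assets folder is an assumption about what the importer expects. It needs the transformers, tensorflow, and spark-nlp Python packages; those imports sit inside the functions so the path helper works on its own.

```python
import os
import shutil


def saved_model_dir(export_dir: str) -> str:
    """Folder where transformers writes the TF SavedModel (version folder "1")."""
    return os.path.join(export_dir, "saved_model", "1")


def export_from_huggingface(model_name: str, export_dir: str) -> str:
    """Export a HuggingFace BERT checkpoint as a TensorFlow SavedModel."""
    # Imported lazily: requires `transformers` and `tensorflow`.
    from transformers import BertTokenizer, TFBertModel

    # saved_model=True asks transformers to also write a TF SavedModel.
    TFBertModel.from_pretrained(model_name).save_pretrained(
        export_dir, saved_model=True
    )

    # Assumption: the importer looks for vocab.txt in the assets folder.
    tokenizer_dir = export_dir + "_tokenizer"
    BertTokenizer.from_pretrained(model_name).save_pretrained(tokenizer_dir)
    assets = os.path.join(saved_model_dir(export_dir), "assets")
    os.makedirs(assets, exist_ok=True)
    shutil.copy(os.path.join(tokenizer_dir, "vocab.txt"), assets)
    return saved_model_dir(export_dir)


def import_into_spark_nlp(model_dir: str, spark):
    """Load the exported SavedModel as a Spark NLP BertEmbeddings annotator."""
    # Imported lazily: requires the `spark-nlp` package.
    from sparknlp.annotator import BertEmbeddings

    return (
        BertEmbeddings.loadSavedModel(model_dir, spark)
        .setInputCols(["sentence", "token"])
        .setOutputCol("embeddings")
    )


# Example usage (downloads the model, so it is left commented out):
#   import sparknlp
#   spark = sparknlp.start()
#   path = export_from_huggingface("bert-base-cased", "./bert-base-cased")
#   embeddings = import_into_spark_nlp(path, spark)
#   embeddings.write().overwrite().save("./bert-base-cased_spark_nlp")
```

Saving the imported annotator once means later jobs can load it from disk instead of repeating the export.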
Backward compatibility
… 3.1.x release. You can either use MarianTransformer.pretrained(MODEL_NAME), which automatically downloads the compatible model, or visit Models Hub to download the compatible models for offline use via MarianTransformer.load(PATH).
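The two loading routes above can be sketched as a small helper. This is a minimal illustration: load_marian is a hypothetical name, the default language code xx and the column names sentence and translation are assumptions, and the spark-nlp import is deferred so the argument check runs without a Spark installation.

```python
def load_marian(model_name=None, path=None, lang="xx"):
    """Return a MarianTransformer loaded by Models Hub name or from a local path."""
    # Exactly one source must be given: a model name (online) or a path (offline).
    if (model_name is None) == (path is None):
        raise ValueError("pass exactly one of model_name or path")

    # Imported lazily: requires the `spark-nlp` package and a Spark session.
    from sparknlp.annotator import MarianTransformer

    if model_name is not None:
        # Online route: downloads the model compatible with the installed release.
        marian = MarianTransformer.pretrained(model_name, lang)
    else:
        # Offline route: loads a model already downloaded from Models Hub.
        marian = MarianTransformer.load(path)

    return marian.setInputCols(["sentence"]).setOutputCol("translation")


# Example usage (needs a running Spark NLP session, so it is commented out):
#   marian = load_marian(model_name="opus_mt_en_fr")
#   marian = load_marian(path="/models/opus_mt_en_fr")  # hypothetical path
```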
Documentation
Installation
Python
# PyPI
pip install spark-nlp==3.1.0
Spark Packages
spark-nlp on Apache Spark 3.0.x and 3.1.x (Scala 2.12 only):
GPU
spark-nlp on Apache Spark 2.4.x (Scala 2.11 only):
GPU
spark-nlp on Apache Spark 2.3.x (Scala 2.11 only):
GPU
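The package coordinates for each Spark version are missing from the list above. As a hedged sketch, the helper below builds one under two assumptions that should be checked against Maven Central before use: the Maven group is com.johnsnowlabs.nlp, and the artifact names follow the same pattern as the FAT JAR file names listed further down. The function name package_coordinate is hypothetical.

```python
SPARK_NLP_VERSION = "3.1.0"

# (artifact suffix, Scala version) per Spark line.
# Assumption: suffixes mirror the FAT JAR names in this announcement.
_SPARK_LINES = {
    "3.0": ("", "2.12"),
    "3.1": ("", "2.12"),
    "2.4": ("-spark24", "2.11"),
    "2.3": ("-spark23", "2.11"),
}


def package_coordinate(spark_version: str, gpu: bool = False) -> str:
    """Build a group:artifact:version coordinate for a given Spark version."""
    major_minor = ".".join(spark_version.split(".")[:2])
    if major_minor not in _SPARK_LINES:
        raise ValueError(f"unsupported Spark version: {spark_version}")
    suffix, scala = _SPARK_LINES[major_minor]
    artifact = "spark-nlp" + ("-gpu" if gpu else "") + suffix
    return f"com.johnsnowlabs.nlp:{artifact}_{scala}:{SPARK_NLP_VERSION}"
```

The resulting string is the kind of value passed to `spark-shell --packages` or to the `spark.jars.packages` setting.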
Maven
spark-nlp on Apache Spark 3.0.x and 3.1.x:
spark-nlp-gpu:
spark-nlp on Apache Spark 2.4.x:
spark-nlp-gpu:
spark-nlp on Apache Spark 2.3.x:
spark-nlp-gpu:
FAT JARs
CPU on Apache Spark 3.x: https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/jars/spark-nlp-assembly-3.1.0.jar
GPU on Apache Spark 3.x: https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/jars/spark-nlp-gpu-assembly-3.1.0.jar
CPU on Apache Spark 2.4.x: https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/jars/spark-nlp-spark24-assembly-3.1.0.jar
GPU on Apache Spark 2.4.x: https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/jars/spark-nlp-gpu-spark24-assembly-3.1.0.jar
CPU on Apache Spark 2.3.x: https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/jars/spark-nlp-spark23-assembly-3.1.0.jar
GPU on Apache Spark 2.3.x: https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/jars/spark-nlp-gpu-spark23-assembly-3.1.0.jar
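The six URLs above follow one naming pattern, which the sketch below captures so a deployment script can pick the right JAR. The function name fat_jar_url is hypothetical, and treating "Spark 3.x" as versions 3.0 and 3.1 is an assumption based on the versions named in the Installation section.

```python
BASE_URL = "https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/jars"

# File-name suffix per Spark line, as seen in the URLs listed above.
_JAR_SUFFIX = {"3.0": "", "3.1": "", "2.4": "-spark24", "2.3": "-spark23"}


def fat_jar_url(spark_version: str, gpu: bool = False, version: str = "3.1.0") -> str:
    """Return the FAT JAR URL for a Spark version and CPU/GPU build."""
    major_minor = ".".join(spark_version.split(".")[:2])
    if major_minor not in _JAR_SUFFIX:
        raise ValueError(f"unsupported Spark version: {spark_version}")
    name = (
        "spark-nlp"
        + ("-gpu" if gpu else "")
        + _JAR_SUFFIX[major_minor]
        + f"-assembly-{version}.jar"
    )
    return f"{BASE_URL}/{name}"
```

A downloaded JAR can then be supplied through Spark's `spark.jars` setting for offline clusters.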
This discussion was created from the release John Snow Labs Spark-NLP 3.1.0: Over 2600+ new models and pipelines in 200+ languages, new DistilBERT, RoBERTa, and XLM-RoBERTa transformers, support for external Transformers, and lots more!.