spark nlp ner XlmRoBertaForTokenClassification performance improvement #13475
LucaPifferettiPrivate started this conversation in General
-
Hi, so I would go like this:
This webinar covers exactly this topic: https://www.johnsnowlabs.com/watch-webinar-speed-optimization-benchmarks-in-spark-nlp-3-making-the-most-of-modern-hardware/
-
Hi everyone!
I'm using the NER model XlmRoBertaForTokenClassification to find person names in a column of messages.
The problem is that the model is really slow: it takes 35 minutes to process 100K messages.
I have this configuration:
spark driver cores = 2
spark driver memory = 48Gb
spark executors = 8
spark executors cores = 8
spark executors memory = 32Gb
Looking at the Spark UI, I found that the stage involving the NER model has a single task that takes 30 minutes. To improve performance I would need to use all the executors, but the problem seems to be related to the model.
Has anyone had the same problem?
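One possible reading of the single 30-minute task is that the input DataFrame has only one partition when it reaches the NER stage, so only one of the 8 × 8 = 64 executor cores does any work. This is an assumption and is not confirmed anywhere in this thread. A minimal sketch of spreading the rows across all cores before inference is shown below; the helper names and the `tasks_per_core` factor are hypothetical, and `df` is assumed to be a Spark DataFrame (only its `repartition` method is used).

```python
def total_executor_cores(num_executors: int, cores_per_executor: int) -> int:
    """Number of task slots available across all executors."""
    return num_executors * cores_per_executor


def repartition_for_inference(df, num_executors=8, cores_per_executor=8,
                              tasks_per_core=2):
    """Repartition df so every executor core gets some rows to process.

    Defaults mirror the configuration in the post (8 executors, 8 cores each).
    tasks_per_core is a hypothetical tuning knob, not a Spark setting.
    """
    target = total_executor_cores(num_executors, cores_per_executor) * tasks_per_core
    # DataFrame.repartition(n) shuffles the rows into n partitions.
    return df.repartition(target)
```

It could be called on the messages DataFrame just before the pipeline's `transform`, for example `messages_df = repartition_for_inference(messages_df)`, where `messages_df` is a placeholder name. Checking `messages_df.rdd.getNumPartitions()` before and after would show whether the partition count was actually the bottleneck.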