
DJL support for embedding models using sentence-transformers #2755

Closed

pchamart opened this issue Aug 17, 2023 · 5 comments

Comments

pchamart commented Aug 17, 2023

Hi,

Per this doc, it seems that only the tasks below are supported.

[screenshot: table of supported tasks from the DJL docs]

Are there any plans to include the feature-extraction task as well in the future?

It would be great if we could use text embedding models (both bi- and cross-encoders) from Hugging Face, e.g.

  • sentence-transformers/all-MiniLM-L6-v2 (bi-encoder) and
  • cross-encoders/mmarco-mMiniLMv2-L12-H384-v1 (cross-encoder).

Thanks!

@frankfliu
Contributor

@pchamart What you need is the text-embedding task.


@pchamart
Author

pchamart commented Aug 17, 2023

Thanks @frankfliu

We're using the DJL DLC on Amazon SageMaker for inference. In serving.properties, if we specify translatorFactory=TextEmbedding, would that suffice?

engine=MPI
translatorFactory=TextEmbedding
option.model_id=sentence-transformers/all-MiniLM-L6-v2
option.trust_remote_code=true
option.tensor_parallel_degree=1
...

Also, can you please confirm whether you plan to support cross-encoders as well in the future, e.g. cross-encoders/mmarco-mMiniLMv2-L12-H384-v1 (cross-encoder)?

@frankfliu
Contributor

@pchamart

We do have this model in our model zoo, but it is a text-classification model, not a text-embedding model.

If you want to serve the sentence-transformers/all-MiniLM-L6-v2 model, we strongly recommend the Java engine instead of the Python engine; it gives at least 2x the throughput.

You can also use a CPU container for this model: https://github.com/aws/deep-learning-containers/blob/master/available_images.md#djl-cpu-full-inference-containers

import sagemaker
from sagemaker import Model

# role and region are assumed to be defined in the notebook session
image_uri = f"763104351884.dkr.ecr.{region}.amazonaws.com/djl-inference:0.23.0-cpu-full"

# load the model directly from the DJL model zoo
env = {
    "SERVING_LOAD_MODELS":
        "djl://ai.djl.huggingface.pytorch/sentence-transformers/all-MiniLM-L6-v2"
}
endpoint_name = sagemaker.utils.name_from_base("textembedding")

model = Model(
    image_uri=image_uri,
    env=env,
    role=role,
)

# deploy the endpoint
model.deploy(
    initial_instance_count=1,
    instance_type="ml.m5.xlarge",
    endpoint_name=endpoint_name,
)

@frankfliu
Contributor

Feel free to reopen this issue if you still have questions.
