Embedding model support with openai spec #305

Open
riyajatar37003 opened this issue Sep 30, 2024 · 15 comments
Labels: enhancement (New feature or request), help wanted (Extra attention is needed), question (Further information is requested)

Comments

@riyajatar37003

Hi,
can I deploy my own custom embedding model with LitServe? Is there any documentation on this?

riyajatar37003 added the enhancement label on Sep 30, 2024
@grumpyp
Contributor

grumpyp commented Sep 30, 2024

There's a great tutorial on how to do this from one of the maintainers, @aniketmaurya:

https://lightning.ai/lightning-ai/studios/deploy-text-embedding-api-with-litserve

You'd basically just wrap the model in the setup method and then use it in the predict method.

@riyajatar37003
Author

It's not OpenAI compatible; that's what I'm looking for.

@riyajatar37003
Author

```python
from sentence_transformers import SentenceTransformer
import litserve as ls


class EmbeddingAPI(ls.LitAPI):
    def setup(self, device):
        self.instruction = "Represent this sentence for searching relevant passages: "
        self.model = SentenceTransformer('BAAI/bge-large-en-v1.5', device=device)

    def decode_request(self, request):
        return request["input"]

    def predict(self, query):
        return self.model.encode([self.instruction + query], normalize_embeddings=True)

    def encode_response(self, output):
        return {"embedding": output[0].tolist()}


if __name__ == "__main__":
    api = EmbeddingAPI()
    server = ls.LitServer(api, api_path='/embeddings')
    server.run(port=8000)
```

```python
import litellm

response = litellm.embedding(
    model="openai/mymodel",            # "openai/" prefix tells litellm to route to an OpenAI-compatible endpoint
    api_key="No-key",                  # API key for your OpenAI-compatible endpoint
    api_base="http://127.0.0.1:8000",  # base URL of your custom OpenAI-compatible endpoint
    input="good morning from litellm",
)

print(response)
```

@bhimrazy
Contributor

Hi @riyajatar37003, this studio might be helpful for your use case: https://lightning.ai/bhimrajyadav/studios/deploy-openai-like-embedding-api-with-litserve-on-studios.

It also includes additional features like support for different models. Feel free to modify it to suit your needs.

@riyajatar37003
Author

Thanks, that part is clear, but I am trying to use it with the LiteLLM proxy server in the following way:

```python
import litellm

response = litellm.embedding(
    model="openai/mymodel",             # "openai/" prefix tells litellm to route to an OpenAI-compatible endpoint
    api_key="No-key",                   # API key for your OpenAI-compatible endpoint
    api_base="http://127.0.0.1:8000/",  # base URL of your custom OpenAI-compatible endpoint
    input="good morning from litellm",
)
```

but I am getting this error:

```
NotFoundError: litellm.NotFoundError: NotFoundError: OpenAIException - Error code: 404 - {'detail': 'Not Found'}
```

@aniketmaurya
Collaborator

@riyajatar37003 currently LitServe doesn't have a built-in way to serve an OpenAI-compatible embedding model. It can be implemented using the OpenAISpec class.

Would love to see a contribution if you are interested 😄
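In the meantime, here is a minimal sketch of what an OpenAI-compatible embeddings endpoint could look like with a plain LitAPI, while native spec support doesn't exist. The response shape follows the OpenAI embeddings API schema; the model name and normalization are carried over from the snippet above, and the `/v1/embeddings` route and class name are assumptions rather than an official LitServe recipe:

```python
from sentence_transformers import SentenceTransformer
import litserve as ls


class OpenAIEmbeddingAPI(ls.LitAPI):
    def setup(self, device):
        self.model = SentenceTransformer('BAAI/bge-large-en-v1.5', device=device)

    def decode_request(self, request):
        # OpenAI's /v1/embeddings accepts a string or a list of strings under "input"
        inputs = request["input"]
        return [inputs] if isinstance(inputs, str) else inputs

    def predict(self, inputs):
        return self.model.encode(inputs, normalize_embeddings=True)

    def encode_response(self, output):
        # Mirror the OpenAI embeddings response schema
        return {
            "object": "list",
            "data": [
                {"object": "embedding", "index": i, "embedding": emb.tolist()}
                for i, emb in enumerate(output)
            ],
            "model": "BAAI/bge-large-en-v1.5",
        }


if __name__ == "__main__":
    # Mount the route where OpenAI-style clients expect it
    server = ls.LitServer(OpenAIEmbeddingAPI(), api_path='/v1/embeddings')
    server.run(port=8000)
```

For 404s like the one above, a mismatch between the path the client requests and the server's `api_path` is a likely first thing to check; the OpenAI SDK, for example, posts to `<base_url>/embeddings`.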

@Demirrr

Demirrr commented Oct 1, 2024

Stating that LitServe has OpenAI compatibility is an overstatement, isn't it? @aniketmaurya

https://github.com/Lightning-AI/LitServe?tab=readme-ov-file#features
https://lightning.ai/docs/litserve/features/open-ai-spec#openai-api

vLLM is a perfect example of OpenAI compatibility:
https://docs.vllm.ai/en/latest/serving/openai_compatible_server.html

@riyajatar37003
Author

riyajatar37003 commented Oct 1, 2024 via email

@aniketmaurya
Collaborator

@riyajatar37003 no, it doesn't - #305 (comment)

@aniketmaurya
Collaborator

aniketmaurya commented Oct 1, 2024

> Stating that LitServe has OpenAI compatibility is an overstatement, isn't it? @aniketmaurya
>
> Lightning-AI/LitServe#features lightning.ai/docs/litserve/features/open-ai-spec#openai-api
>
> vLLM is a perfect example of OpenAI compatibility: docs.vllm.ai/en/latest/serving/openai_compatible_server.html

@Demirrr I don't see anything that can be done with vLLM but not with LitServe's OpenAISpec. Maybe I'm missing something. What are you trying to do with LitServe that you're unable to do?

@Demirrr

Demirrr commented Oct 1, 2024

1. vLLM supports embedding models; LitServe currently does not, as you have also pointed out.
2. vLLM supports the guided_choice option and a few more useful options (https://docs.vllm.ai/en/latest/serving/openai_compatible_server.html#extra-parameters-for-completions-api); e.g., the following cannot be done in LitServe:

```python
from openai import OpenAI

# Point the client at a running vLLM OpenAI-compatible server
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

completion = client.chat.completions.create(
    model="NousResearch/Meta-Llama-3-8B-Instruct",
    messages=[
        {"role": "user", "content": "Classify this sentiment: vLLM is wonderful!"}
    ],
    extra_body={
        "guided_choice": ["positive", "negative"]
    },
)
```

3. Autoscaling, multi-machine inference, and authentication features come for free in vLLM.

Currently, I am unable to see the advantages of using LitServe over vLLM. Still, please correct me if any of the points above are wrong.

@riyajatar37003
Author

riyajatar37003 commented Oct 1, 2024 via email

@Demirrr

Demirrr commented Oct 1, 2024

@riyajatar37003
Author

riyajatar37003 commented Oct 1, 2024 via email

@aniketmaurya
Collaborator

@Demirrr LitServe is a generic model serving library, not only for LLMs.

At the moment, it provides OpenAISpec support for the chat completions API only. To serve an embedding model with an OpenAI-compatible API, you can use a Pydantic request model to build an OpenAI-compatible format.

Other features like guided choice can also be implemented by customizing the decode_request method.

Autoscaling and authentication are free in LitServe too. Please feel free to refer to the docs (lightning.ai/litserve) and let us know if you have any feedback!
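As a rough illustration of the Pydantic approach (the request fields mirror the OpenAI embeddings schema; the class names and skeleton methods are illustrative assumptions, not LitServe-provided APIs):

```python
from typing import List, Optional, Union

from pydantic import BaseModel
import litserve as ls


# Modeled on the OpenAI /v1/embeddings request body; nothing here is
# provided by LitServe itself.
class EmbeddingRequest(BaseModel):
    input: Union[str, List[str]]
    model: str
    encoding_format: Optional[str] = "float"


class ValidatedEmbeddingAPI(ls.LitAPI):
    def setup(self, device):
        ...  # load the embedding model here

    def decode_request(self, request):
        # Validate the raw JSON body against the OpenAI-style schema;
        # a malformed request fails here instead of deep inside predict()
        req = EmbeddingRequest(**request)
        inputs = req.input
        return [inputs] if isinstance(inputs, str) else inputs

    def predict(self, inputs):
        ...  # run the model on the batch of strings

    def encode_response(self, output):
        ...  # wrap the vectors in the OpenAI response shape

```

Guided choice could follow the same pattern in a chat endpoint: add an optional `guided_choice: Optional[List[str]]` field to the request model and have predict constrain its output to those choices.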

aniketmaurya added the help wanted and question labels on Oct 4, 2024