Embedding model support with openai spec #305

Open
riyajatar37003 opened this issue Sep 30, 2024 · 15 comments
Labels: enhancement (New feature or request), help wanted (Extra attention is needed), question (Further information is requested)

Comments

@riyajatar37003

Hi,
can I deploy my own custom embedding model with LitServe? Is there any documentation on this?

riyajatar37003 added the enhancement label on Sep 30, 2024
@grumpyp
Contributor

grumpyp commented Sep 30, 2024

There's a great tutorial on how to do this from one of the maintainers, @aniketmaurya:

https://lightning.ai/lightning-ai/studios/deploy-text-embedding-api-with-litserve

You'd basically just wrap the model in the setup method and then use it in the predict method.

@riyajatar37003
Author

It's not OpenAI compatible; that's what I'm looking for.

@riyajatar37003
Author

```python
from sentence_transformers import SentenceTransformer
import litserve as ls


class EmbeddingAPI(ls.LitAPI):
    def setup(self, device):
        self.instruction = "Represent this sentence for searching relevant passages: "
        self.model = SentenceTransformer('BAAI/bge-large-en-v1.5', device=device)

    def decode_request(self, request):
        return request["input"]

    def predict(self, query):
        return self.model.encode([self.instruction + query], normalize_embeddings=True)

    def encode_response(self, output):
        return {"embedding": output[0].tolist()}


if __name__ == "__main__":
    api = EmbeddingAPI()
    server = ls.LitServer(api, api_path='/embeddings')
    server.run(port=8000)
```

```python
import litellm

response = litellm.embedding(
    model="openai/mymodel",            # "openai/" prefix tells litellm to route to an OpenAI-compatible endpoint
    api_key="No-key",                  # API key for your OpenAI-compatible endpoint
    api_base="http://127.0.0.1:8000",  # base URL of your custom OpenAI-compatible endpoint
    input="good morning from litellm",
)

print(response)
```

@bhimrazy
Contributor

Hi @riyajatar37003, this studio might be helpful for your use case: https://lightning.ai/bhimrajyadav/studios/deploy-openai-like-embedding-api-with-litserve-on-studios.

It also includes additional features like support for different models. Feel free to modify it to suit your needs.

@riyajatar37003
Author

Thanks, that part is clear, but I am trying to use it with the LiteLLM proxy server in the following way:

```python
import litellm

response = litellm.embedding(
    model="openai/mymodel",             # "openai/" prefix tells litellm to route to an OpenAI-compatible endpoint
    api_key="No-key",                   # API key for your OpenAI-compatible endpoint
    api_base="http://127.0.0.1:8000/",  # base URL of your custom OpenAI-compatible endpoint
    input="good morning from litellm",
)
```

but I am getting this error:

```
NotFoundError: litellm.NotFoundError: NotFoundError: OpenAIException - Error code: 404 - {'detail': 'Not Found'}
```

@aniketmaurya
Collaborator

@riyajatar37003 currently LitServe doesn't have a built-in way to serve an OpenAI-compatible embedding model. It can be implemented using the OpenAISpec class.

Would love to see a contribution if you are interested 😄
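In the meantime, here is a minimal sketch of what an OpenAI-compatible embeddings endpoint could look like with a plain LitAPI, while native spec support doesn't exist. The response shape follows the OpenAI embeddings API schema; the model name and normalization are carried over from the snippet above, and the `/v1/embeddings` route and class name are assumptions rather than an official LitServe recipe:

```python
from sentence_transformers import SentenceTransformer
import litserve as ls


class OpenAIEmbeddingAPI(ls.LitAPI):
    def setup(self, device):
        self.model = SentenceTransformer('BAAI/bge-large-en-v1.5', device=device)

    def decode_request(self, request):
        # OpenAI's /v1/embeddings accepts a string or a list of strings under "input"
        inputs = request["input"]
        return [inputs] if isinstance(inputs, str) else inputs

    def predict(self, inputs):
        return self.model.encode(inputs, normalize_embeddings=True)

    def encode_response(self, output):
        # Mirror the OpenAI embeddings response schema
        return {
            "object": "list",
            "data": [
                {"object": "embedding", "index": i, "embedding": emb.tolist()}
                for i, emb in enumerate(output)
            ],
            "model": "BAAI/bge-large-en-v1.5",
        }


if __name__ == "__main__":
    # Mount the route where OpenAI-style clients expect it
    server = ls.LitServer(OpenAIEmbeddingAPI(), api_path='/v1/embeddings')
    server.run(port=8000)
```

For 404s like the one above, a mismatch between the path the client requests and the server's `api_path` is a likely first thing to check; the OpenAI SDK, for example, posts to `<base_url>/embeddings`.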

@Demirrr

Demirrr commented Oct 1, 2024

Stating that LitServe has OpenAI compatibility is an overstatement, isn't it? @aniketmaurya

https://github.com/Lightning-AI/LitServe?tab=readme-ov-file#features
https://lightning.ai/docs/litserve/features/open-ai-spec#openai-api

vLLM is a perfect example of OpenAI compatibility:
https://docs.vllm.ai/en/latest/serving/openai_compatible_server.html

@riyajatar37003
Author

riyajatar37003 commented Oct 1, 2024 via email

@aniketmaurya
Collaborator

@riyajatar37003 no, it doesn't - #305 (comment)

@aniketmaurya
Collaborator

aniketmaurya commented Oct 1, 2024

> Stating that LitServe has OpenAI compatibility is an overstatement, isn't it? @aniketmaurya
>
> Lightning-AI/LitServe#features lightning.ai/docs/litserve/features/open-ai-spec#openai-api
>
> vLLM is a perfect example of OpenAI compatibility: docs.vllm.ai/en/latest/serving/openai_compatible_server.html

@Demirrr I don't see anything that can be done with vLLM but not with LitServe's OpenAISpec. Maybe I'm missing something. What are you trying to do with LitServe that you're unable to do?

@Demirrr

Demirrr commented Oct 1, 2024

1. vLLM supports embedding models; LitServe currently does not, as you have also pointed out.
2. vLLM supports the guided_choice option and a few more useful options (https://docs.vllm.ai/en/latest/serving/openai_compatible_server.html#extra-parameters-for-completions-api); e.g., the following cannot be done in LitServe:

```python
from openai import OpenAI

# Point the client at a running vLLM OpenAI-compatible server
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

completion = client.chat.completions.create(
    model="NousResearch/Meta-Llama-3-8B-Instruct",
    messages=[
        {"role": "user", "content": "Classify this sentiment: vLLM is wonderful!"}
    ],
    extra_body={
        "guided_choice": ["positive", "negative"]
    },
)
```

3. Autoscaling, multi-machine inference, and authentication features come for free in vLLM.

Currently, I am unable to see the advantages of using LitServe over vLLM. Still, please correct me if any of the points above are wrong.

@riyajatar37003
Author

riyajatar37003 commented Oct 1, 2024 via email

@Demirrr

Demirrr commented Oct 1, 2024

@riyajatar37003
Author

riyajatar37003 commented Oct 1, 2024 via email

@aniketmaurya
Collaborator

@Demirrr LitServe is a generic model serving library, not only for LLMs.

At the moment, it provides OpenAISpec support for the chat completions API only. To serve an embedding model with an OpenAI-compatible API, you can use a Pydantic request model to build an OpenAI-compatible format.

Other features like guided choice can also be implemented by customizing the decode_request method.

Autoscaling and authentication are free in LitServe too. Please feel free to refer to the docs (lightning.ai/litserve) and let us know if you have any feedback!
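As a rough illustration of the Pydantic approach (the request fields mirror the OpenAI embeddings schema; the class names and skeleton methods are illustrative assumptions, not LitServe-provided APIs):

```python
from typing import List, Optional, Union

from pydantic import BaseModel
import litserve as ls


# Modeled on the OpenAI /v1/embeddings request body; nothing here is
# provided by LitServe itself.
class EmbeddingRequest(BaseModel):
    input: Union[str, List[str]]
    model: str
    encoding_format: Optional[str] = "float"


class ValidatedEmbeddingAPI(ls.LitAPI):
    def setup(self, device):
        ...  # load the embedding model here

    def decode_request(self, request):
        # Validate the raw JSON body against the OpenAI-style schema;
        # a malformed request fails here instead of deep inside predict()
        req = EmbeddingRequest(**request)
        inputs = req.input
        return [inputs] if isinstance(inputs, str) else inputs

    def predict(self, inputs):
        ...  # run the model on the batch of strings

    def encode_response(self, output):
        ...  # wrap the vectors in the OpenAI response shape

```

Guided choice could follow the same pattern in a chat endpoint: add an optional `guided_choice: Optional[List[str]]` field to the request model and have predict constrain its output to those choices.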

aniketmaurya added the help wanted and question labels on Oct 4, 2024