Currently, we provide two ways to implement the embedding service:
-
Build the embedding model locally from the server, which is faster, but takes up memory on the local server.
-
Build it based on the TEI endpoint, which provides more flexibility, but may bring some network latency.
For both of the implementations, you need to install requirements first.
pip install -r requirements.txt
You can select one of following ways to start the embedding service:
First, you need to start a TEI service.
your_port=8090
model="BAAI/bge-large-en-v1.5"
docker run -p $your_port:80 -v ./data:/data --name tei_server -e http_proxy=$http_proxy -e https_proxy=$https_proxy --pull always ghcr.io/huggingface/text-embeddings-inference:cpu-1.5 --model-id $model
Then you need to test your TEI service using the following commands:
curl localhost:$your_port/embed \
-X POST \
-d '{"inputs":"What is Deep Learning?"}' \
-H 'Content-Type: application/json'
Start the embedding service with the TEI_EMBEDDING_ENDPOINT.
export TEI_EMBEDDING_ENDPOINT="http://localhost:$yourport"
export TEI_EMBEDDING_MODEL_NAME="BAAI/bge-large-en-v1.5"
python embedding_tei.py
python local_embedding.py
First, you need to start a TEI service.
your_port=8090
model="BAAI/bge-large-en-v1.5"
docker run -p $your_port:80 -v ./data:/data --name tei_server -e http_proxy=$http_proxy -e https_proxy=$https_proxy --pull always ghcr.io/huggingface/text-embeddings-inference:cpu-1.5 --model-id $model
Then you need to test your TEI service using the following commands:
curl localhost:$your_port/embed \
-X POST \
-d '{"inputs":"What is Deep Learning?"}' \
-H 'Content-Type: application/json'
Export the TEI_EMBEDDING_ENDPOINT
for later usage:
export TEI_EMBEDDING_ENDPOINT="http://localhost:$yourport"
export TEI_EMBEDDING_MODEL_NAME="BAAI/bge-large-en-v1.5"
cd ../../../../
docker build -t opea/embedding-tei-llama-index:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/embeddings/tei/llama_index/Dockerfile .
docker run -d --name="embedding-tei-llama-index-server" -p 6000:6000 --ipc=host -e http_proxy=$http_proxy -e https_proxy=$https_proxy -e TEI_EMBEDDING_ENDPOINT=$TEI_EMBEDDING_ENDPOINT -e TEI_EMBEDDING_MODEL_NAME=$TEI_EMBEDDING_MODEL_NAME opea/embedding-tei-llama-index:latest
cd docker
docker compose -f docker_compose_embedding.yaml up -d
curl http://localhost:6000/v1/health_check\
-X GET \
-H 'Content-Type: application/json'
curl http://localhost:6000/v1/embeddings\
-X POST \
-d '{"text":"Hello, world!"}' \
-H 'Content-Type: application/json'