Hello,

I pulled docker image 0.6.0 and tried to run the two demo commands. First:

docker run -it --rm --gpus all \
  -v $PWD:/project ghcr.io/els-rd/transformer-deploy:0.6.0 \
  bash -c "cd /project && \
    convert_model -m \"philschmid/MiniLM-L6-H384-uncased-sst2\" \
    --backend tensorrt onnx \
    --seq-len 16 128 128"

This failed with the error: convert_model not found
Then I tried the second command, which uses Triton Inference Server. The service started fine, but the query command returned an error:

curl -X POST http://localhost:8000/v2/models/transformer_onnx_inference/versions/1/infer \
  --data-binary "@demo/infinity/query_body.bin" \
  --header "Inference-Header-Content-Length: 161"
{"error":"Request for unknown model: 'transformer_tensorrt_inference' is not found"}
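One way to debug an "unknown model" error like this is to ask the server which model names it actually registered, and compare them against the name in the /infer URL. Triton's HTTP API exposes a repository index endpoint (POST /v2/repository/index) that returns this list as JSON. A minimal sketch of checking it; the response body below is illustrative, not captured from this setup:

```python
import json

def model_names(index_json: str):
    """Extract model names from a /v2/repository/index JSON response."""
    return [entry["name"] for entry in json.loads(index_json)]

# Illustrative response; the real output would come from something like:
#   curl -X POST http://localhost:8000/v2/repository/index
sample = ('[{"name": "transformer_onnx_model", "state": "READY"}, '
          '{"name": "transformer_onnx_inference", "state": "READY"}]')

print(model_names(sample))
```

If the name the curl command targets does not appear in that list, the mismatch (e.g. transformer_onnx_inference vs. transformer_tensorrt_inference) points at the model repository contents rather than the request itself.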
Does the infinity demo download Hugging Face models and convert them to Triton format?
thanks