Hello,

I pulled docker image 0.6.0 and tried to run the two demo commands. First:

docker run -it --rm --gpus all \
  -v $PWD:/project ghcr.io/els-rd/transformer-deploy:0.6.0 \
  bash -c "cd /project && \
    convert_model -m \"philschmid/MiniLM-L6-H384-uncased-sst2\" \
    --backend tensorrt onnx \
    --seq-len 16 128 128"

This failed with the error: convert_model not found
Then I tried the second command, which uses Triton Inference Server. The service started fine, but the query command returned an error:

curl -X POST http://localhost:8000/v2/models/transformer_onnx_inference/versions/1/infer \
  --data-binary "@demo/infinity/query_body.bin" \
  --header "Inference-Header-Content-Length: 161"
{"error":"Request for unknown model: 'transformer_tensorrt_inference' is not found"}
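One way to debug an "unknown model" error like this is to ask the server which model names it actually registered, and compare them against the name in the /infer URL. Triton's HTTP API exposes a repository index endpoint (POST /v2/repository/index) that returns this list as JSON. A minimal sketch of checking it; the response body below is illustrative, not captured from this setup:

```python
import json

def model_names(index_json: str):
    """Extract model names from a /v2/repository/index JSON response."""
    return [entry["name"] for entry in json.loads(index_json)]

# Illustrative response; the real output would come from something like:
#   curl -X POST http://localhost:8000/v2/repository/index
sample = ('[{"name": "transformer_onnx_model", "state": "READY"}, '
          '{"name": "transformer_onnx_inference", "state": "READY"}]')

print(model_names(sample))
```

If the name the curl command targets does not appear in that list, the mismatch (e.g. transformer_onnx_inference vs. transformer_tensorrt_inference) points at the model repository contents rather than the request itself.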
Does the infinity demo download Hugging Face models and convert them to Triton format?
thanks