Unable to host .onnx model on Triton server #179

Open
riyaj8888 opened this issue Jul 19, 2023 · 0 comments
I am converting a .pt model to .onnx by loading the model on the cuda:0 device, and hosting it on cuda:1 by setting gpus: [1] inside the config.pbtxt file, but I am getting the following error:

```
onnx runtime error 6: Non-zero status code returned while running Einsum node. Name:'/model/layer.0/rel_attn/Einsum_8' Status Message: /workspace/onnxruntime/onnxruntime/core/providers/cpu/math/einsum_utils/einsum_auxiliary_ops.cc:298 std::unique_ptr<onnxruntime::Tensor> onnxruntime::EinsumOp::Transpose(const onnxruntime::Tensor&, const onnxruntime::TensorShape&, const gsl::span<const long unsigned int>&, onnxruntime::AllocatorPtr, void*, const Transpose&) Einsum op: Transpose failed: CUDA failure 1: invalid argument ; GPU=1 ; hostname=2a71d799b143 ; expr=cudaMemcpyAsync(output.MutableDataRaw(), input.DataRaw(), input.Shape().Size() * input.DataType()->Size(), cudaMemcpyDeviceToDevice, stream);
```

But when I set gpus: [0], it runs smoothly without any error. Is this normal behavior?
Why does it happen in the first place?
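The GPU placement described above corresponds to a config.pbtxt along these lines; the model name is a placeholder, not the reporter's actual configuration:

```protobuf
# Sketch of a Triton model configuration pinning instances to GPU 1.
# "my_model" is a placeholder name.
name: "my_model"
platform: "onnxruntime_onnx"
instance_group [
  {
    count: 1
    kind: KIND_GPU
    gpus: [ 1 ]    # place this model's instances on GPU 1 only
  }
]
```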

Thanks
