INT8/Quantization in torch-TensorRT 1.4 #2086
chichun-charlie-liu started this conversation in General
Replies: 1 comment
-
We are planning to improve Dynamo's INT8 support and maintain the same level as the TorchScript path. This is scheduled for the next release.
-
Hello,
It appears that INT8 is not ready in the newly released torch-TRT 1.4: the new dynamo.compile() checks the precision and rejects anything other than FP32 and FP16. Digging a bit deeper, though, there seem to be some INT8/quantization components similar to those from v1.3?
I'm just curious whether you could elaborate a little on the INT8 implementation plan or status and, if possible, share a schedule for a release that enables INT8.
Thanks a lot!