INT8/Quantization in torch-TensorRT 1.4 #2086
chichun-charlie-liu started this conversation in General
Replies: 1 comment
-
We are planning to improve Dynamo's INT8 support and maintain the same level as the TorchScript path. This is scheduled for the next release.
-
Hello,
It appears that INT8 is not ready in the newly released torch-TRT 1.4: the new dynamo.compile() checks the precision and rejects anything other than FP32 and FP16. Digging a bit deeper, though, there seem to be some INT8/quantization components similar to those from v1.3?
I'm just curious whether you could elaborate a little on the INT8 implementation plan or status and, if possible, share a schedule for a release that enables INT8.
Thanks a lot!