
Add more information to quantized linear module and added some logs #782

Merged: @jerryzh168 merged 5 commits into pytorch:main from print-quantized-module on Sep 4, 2024

Conversation

@jerryzh168 (Contributor) commented on Aug 30, 2024

Summary:
Fixes #771

Test Plan:
python test/dtypes/test_affine_quantized_tensor.py -k test_print_quantized_module

Example output:

Linear(in_features=128, out_features=256, weight=AffineQuantizedTensor(shape=torch.Size([256, 128]), block_size=(1, 128), device=cuda:0, layout_type=PlainLayoutType(), layout_tensor_dtype=torch.int8, quant_min=None, quant_max=None))
Linear(in_features=128, out_features=256, weight=LinearActivationQuantizedTensor(activation=<function _int8_asymm_per_token_quant at 0x7feb1d146820>, weight=AffineQuantizedTensor(shape=torch.Size([256, 128]), block_size=(1, 32), device=cuda:0, layout_type=PlainLayoutType(), layout_tensor_dtype=torch.int8, quant_min=-8, quant_max=7)))
Linear(in_features=128, out_features=256, weight=LinearActivationQuantizedTensor(activation=<function _int8_symm_per_token_reduced_range_quant at 0x7feb1d146af0>, weight=AffineQuantizedTensor(shape=torch.Size([256, 128]), block_size=(1, 128), device=cuda:0, layout_type=PlainLayoutType(), layout_tensor_dtype=torch.int8, quant_min=None, quant_max=None)))
Linear(in_features=128, out_features=256, weight=AffineQuantizedTensor(shape=torch.Size([256, 128]), block_size=(1, 32), device=cuda:0, layout_type=TensorCoreTiledLayoutType(inner_k_tiles=8), layout_tensor_dtype=torch.int32, quant_min=0, quant_max=15))
Linear(in_features=128, out_features=256, weight=LinearActivationQuantizedTensor(activation=<function _int8_symm_per_token_reduced_range_quant at 0x7feb1d146af0>, weight=AffineQuantizedTensor(shape=torch.Size([256, 128]), block_size=(1, 128), device=cuda:0, layout_type=SemiSparseLayoutType(), layout_tensor_dtype=torch.int8, quant_min=None, quant_max=None)))
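For readers who want to reproduce a line like the first one above, here is a minimal sketch. It assumes torchao's quantize_ entry point and the int8_weight_only config; exact import paths may vary by torchao version.

import torch
from torchao.quantization import quantize_, int8_weight_only

# Build a bf16 Linear on GPU, then swap its weight in place
# for an AffineQuantizedTensor subclass.
lin = torch.nn.Linear(128, 256, dtype=torch.bfloat16, device="cuda")
quantize_(lin, int8_weight_only())

# With this PR, printing the module surfaces the quantized weight's details
# (tensor subclass, block_size, layout type, layout dtype) rather than the
# bare Linear repr.
print(lin)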


pytorch-bot (bot) commented on Aug 30, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/782

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit f7e0bbd with merge base 6080986:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot added the CLA Signed label on Aug 30, 2024
def insert_subclass(lin):
-    lin.weight = torch.nn.Parameter(constructor(lin.weight), requires_grad=False)
+    lin.weight = torch.nn.Parameter(constructor(lin.weight, **kwargs), requires_grad=False)
+    lin.extra_repr = types.MethodType(_linear_extra_repr, lin)
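For context, here is a plausible sketch of the _linear_extra_repr helper bound above. The body is inferred from the example output in the PR description and is an assumption, not the exact torchao source:

def _linear_extra_repr(self):
    # Replacement for nn.Linear.extra_repr: report in/out features from the
    # weight's shape and include the full repr of the (tensor-subclass) weight,
    # so the quantization details show up in print(module).
    return f"in_features={self.weight.shape[1]}, out_features={self.weight.shape[0]}, weight={repr(self.weight)}"

Since types.MethodType binds the function to the instance, self here is the patched Linear module itself.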
A reviewer (Contributor) commented on the insert_subclass diff:

optional: is there anything we can do to preserve any preexisting custom repr?

@jerryzh168 (author) replied on Sep 3, 2024:
actually there will be a recursive call here, since we are overriding extra_repr()
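To illustrate the recursion jerryzh168 describes, here is a standalone sketch (not the PR's code): once the instance attribute is rebound, self.extra_repr points at the new function, so delegating to it to preserve the old repr would call itself.

import types
import torch

def _patched_extra_repr(self):
    # Calling self.extra_repr() here to "preserve" the preexisting repr would
    # invoke this very function again, ending in RecursionError:
    # old = self.extra_repr()
    return f"in_features={self.in_features}, out_features={self.out_features}, weight={type(self.weight).__name__}"

lin = torch.nn.Linear(4, 8)
lin.extra_repr = types.MethodType(_patched_extra_repr, lin)
print(lin)  # Linear(in_features=4, out_features=8, weight=Parameter)

One way around it would be to capture orig = lin.extra_repr before rebinding and close over it; the PR keeps the simpler full override.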

@vkuzo (Contributor) left a review:
looks great, thanks for working on this!

@jerryzh168 merged commit 0987dd6 into pytorch:main on Sep 4, 2024
17 checks passed
@jerryzh168 deleted the print-quantized-module branch on Sep 4, 2024 at 00:26
jerryzh168 added a commit to jerryzh168/ao that referenced this pull request on Sep 4, 2024.
Linked issue: #771 (printing a quantized model is not informative)