[AFQ] Optimize tensor_flatten for runtime #1114
Open
Stack from ghstack (oldest at bottom):
`tensor_flatten` is called at runtime when inputs are tensor subclasses, e.g. `AffineQuantizedTensor` (AQT).
When using `float8_dynamic_activation_float8_weight` quantization, the activations are also AQTs. There was an external complaint that `unwrap_tensor_subclasses` sometimes takes longer than the execution of the compiled region itself.
Profiling shows that attribute access inside `tensor_flatten` goes through `__torch_function__` handling (AQT defines it).
Removing this `__torch_function__` dispatch for each `getattr` at runtime makes `tensor_flatten` roughly 9x faster in my measurements.
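A minimal sketch of the idea (the subclass, its fields, and the placement of the guard are illustrative, not the actual torchao change): on a subclass that defines `__torch_function__`, tensor properties such as `.shape` and `.dtype` dispatch through that handler on every access, so wrapping the body of `__tensor_flatten__` in `torch._C.DisableTorchFunctionSubclass()` skips the per-`getattr` round-trip:

```python
import torch

class MySubclass(torch.Tensor):
    """Toy stand-in for AffineQuantizedTensor (hypothetical, for
    illustration). Because the class defines __torch_function__,
    accessing Tensor properties like .shape or .dtype on an instance
    is normally routed through that handler."""

    @staticmethod
    def __new__(cls, int_data, scale):
        return torch.Tensor._make_wrapper_subclass(
            cls, int_data.shape, dtype=scale.dtype, device=int_data.device
        )

    def __init__(self, int_data, scale):
        self.int_data = int_data
        self.scale = scale

    @classmethod
    def __torch_function__(cls, func, types, args=(), kwargs=None):
        kwargs = kwargs or {}
        with torch._C.DisableTorchFunctionSubclass():
            return func(*args, **kwargs)

    def __tensor_flatten__(self):
        # self.dtype / self.shape are Tensor properties, so each access
        # below would otherwise dispatch through __torch_function__.
        # Disabling that dispatch for the duration of flattening removes
        # the per-getattr overhead on the runtime hot path.
        with torch._C.DisableTorchFunctionSubclass():
            return ["int_data", "scale"], [self.dtype, self.shape]

    @staticmethod
    def __tensor_unflatten__(inner_tensors, ctx, outer_size, outer_stride):
        return MySubclass(inner_tensors["int_data"], inner_tensors["scale"])

# Demo: flattening no longer triggers __torch_function__ per attribute.
t = MySubclass(torch.zeros(4, dtype=torch.int8), torch.ones(()))
attrs, ctx = t.__tensor_flatten__()
```

Whether the guard lives inside `__tensor_flatten__` or the hot attributes are cached some other way is an implementation choice; the point is that the per-`getattr` `__torch_function__` round-trip disappears from the runtime path that `unwrap_tensor_subclasses` exercises.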