Add support for reducing across individual dimensions for 2D matrices using the sum Triton kernel (#2295)

Summary:
Pull Request resolved: #2295

Support reducing a 2-dimensional matrix across one dimension, where the `BLOCK_SIZE` in the reduced dimension is larger than the dimension size. This kernel performs a simplified reduction which assumes that the entire reduction dimension of the tensor fits in a thread block. The implementation handles toggling between block sizes for the `M` and `N` dimensions depending on the reduction dimension. For example, this kernel will reduce across the 0-th dimension for a (M, N) = (16, 16) matrix where `BLOCK_SIZE_M >= 16` and `BLOCK_SIZE_N` is autotuned.

Add a `best_config` metric to find the best `BLOCK_SIZE` for the non-reduction dimension and `num_warps` given some input size.

Reviewed By: jbschlosser

Differential Revision: D58261858

fbshipit-source-id: 8995c91c54e9792b52f4608446e8e940027a604d
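The reduction scheme described in the summary can be illustrated with a minimal pure-Python sketch. This is not the kernel from this commit: `block_sum`, `next_power_of_2`, and `block_size_other` are hypothetical names chosen for illustration, and the loops only emulate what a Triton program instance would do with a masked load. The sketch assumes the block in the reduced dimension is rounded up to a power of two (so it is at least as large as the dimension), that out-of-bounds positions contribute zero, and that the non-reduced dimension is tiled with a separately chosen block size.

```python
def next_power_of_2(n):
    """Smallest power of two that is >= n (n must be >= 1)."""
    return 1 << (n - 1).bit_length()


def block_sum(x, dim, block_size_other=4):
    """Sum a 2D list-of-lists `x` across `dim` (0 or 1), tile by tile.

    Hypothetical emulation: the block in the reduced dimension covers the
    whole dimension, so each output element is produced in a single pass.
    """
    M, N = len(x), len(x[0])
    red_size = M if dim == 0 else N   # length of the reduced dimension
    out_size = N if dim == 0 else M   # length of the kept dimension

    # Assumption: block size in the reduced dimension is padded up to a
    # power of two, so block_red >= red_size always holds.
    block_red = next_power_of_2(red_size)

    out = [0.0] * out_size
    # Each outer iteration plays the role of one program instance that
    # handles `block_size_other` entries of the kept dimension.
    for start in range(0, out_size, block_size_other):
        for j in range(start, start + block_size_other):
            if j >= out_size:
                continue  # mask: tile runs past the kept dimension
            acc = 0.0
            for i in range(block_red):
                in_bounds = i < red_size  # mask: padding in the reduced dim
                if dim == 0:
                    value = x[i][j] if in_bounds else 0.0
                else:
                    value = x[j][i] if in_bounds else 0.0
                acc += value
            out[j] = acc
    return out
```

For instance, `block_sum([[1, 2, 3], [4, 5, 6]], dim=0)` returns `[5.0, 7.0, 9.0]`. With `dim=1` the reduced dimension has length 3, the block is padded to 4, and the extra position is masked to zero. Swapping which dimension gets the padded block and which gets the tunable one is the "toggling" the summary refers to; in this sketch it is handled by the `dim` branches.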
1 parent c13df57, commit 3ecaae9
Showing 2 changed files with 167 additions and 36 deletions.