Pull requests: NVIDIA/Fuser
Pull requests list
Split Hopper MMA by warp-tile before instruction tile
#3642
opened Dec 24, 2024 by
jacobhinkle
Draft
Ring Allgather + GEMM Overlap HostIR Implementation
Multi-GPU
#3626
opened Dec 20, 2024 by
nsarka
cacheInputs propagates allocation only for matmul schedulers.
#3621
opened Dec 19, 2024 by
wujingyue
Add support for smem_epilogue when mma output is not cast to half
#3620
opened Dec 19, 2024 by
protonu
Support outer reduction scheduler with SOL autotuning
Autotune
Generate heuristics through machine learning models.
Lower distributed matmul to pipelined algorithm for fine-grained overlap
Multi-GPU
#3606
opened Dec 18, 2024 by
samnordmann
2 tasks done
[wgmma] Insert commit_group and wait_group after mma_async
Matmuls
#3573
opened Dec 11, 2024 by
jacobhinkle
Draft