Testbed

  • RTX 4090
  • CUDA 12.1
  • CUTLASS 3.4.1
  • cuBLAS 12.01
  • Warm-up: 100 runs (untimed)
  • Execution: 100 timed runs
  • Data types: fp32 + fp16
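The warm-up and execution counts above suggest a measurement loop along the following lines. This is only a minimal host-side sketch, not the harness actually used for these results: `benchmark_ms` is a hypothetical helper name, and the use of `std::chrono` and of a mean over the timed runs are assumptions.

```cpp
#include <chrono>
#include <functional>

// Hypothetical helper (not from this repository): runs `fn` `warmup` times
// without timing, then `iters` times under a timer, and returns the mean
// wall-clock time per call in milliseconds.
double benchmark_ms(const std::function<void()>& fn,
                    int warmup = 100, int iters = 100) {
    // Warm-up phase: results are discarded so that one-time costs
    // (lazy initialization, cache fills, clock ramp-up) stay out of the timing.
    for (int i = 0; i < warmup; ++i) {
        fn();
    }

    // Timed phase: measure all iterations together and average.
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iters; ++i) {
        fn();
    }
    const auto stop = std::chrono::steady_clock::now();

    const std::chrono::duration<double, std::milli> elapsed = stop - start;
    return elapsed.count() / iters;
}
```

For GPU kernels, launches are asynchronous, so a host timer like this is only meaningful if `fn` synchronizes with the device. A common alternative is to bracket the timed loop with `cudaEventRecord` and read the result with `cudaEventElapsedTime` after `cudaEventSynchronize`.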

Performance