SDPA supports this, and it can be faster in some scenarios. To keep parity and iterate faster on speed improvements, it may be better to add a backend here via https://github.com/NVIDIA/cudnn-frontend, which also has a Python API. This would also let folks try cuDNN attention improvements without waiting for updated binaries to land in PT or updating their PT version to the absolute latest one.
🚀 Feature
Motivation
Pitch
Alternatives
Additional context
Flagging @eqy in case they are interested in helping out.