I'm building from source. I already had flash-attn installed, which took a long time to build. While installing xformers, I noticed it builds flash-attn again, which again takes a long time. Can I just use xformers without flash-attn?
Is this behavior new in recent versions? If I don't want flash-attn bundled into xformers, which version should I roll back to?
These days in AI, people rapidly test many different libraries and repos every day. Please take that into account; we can't afford heavy build times in daily work. I understand long builds may be acceptable for industry deployment, but xformers is meant for research, right? Please avoid heavy builds as much as possible.
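For reference, this is roughly how I'm invoking the build. `MAX_JOBS` and `TORCH_CUDA_ARCH_LIST` are standard PyTorch extension variables that cut compile time; the `XFORMERS_DISABLE_FLASH_ATTN` name is only my guess at a toggle — I haven't confirmed the real flag (if any) in setup.py:

```python
import os
import subprocess

# Standard PyTorch cpp_extension knobs: limit parallel compile jobs
# and build only for my GPU's architecture to reduce build time.
env = dict(os.environ, MAX_JOBS="8", TORCH_CUDA_ARCH_LIST="8.6")

# ASSUMPTION: hypothetical variable name -- check xformers' setup.py
# for whatever flag (if any) actually skips the flash-attn build.
env["XFORMERS_DISABLE_FLASH_ATTN"] = "1"

subprocess.run(
    ["pip", "install", "-v", "--no-build-isolation", "-e", "."],
    env=env,
    check=True,
)
```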
We have some logic here to avoid building FlashAttention 2 when an existing compatible copy is provided by PyTorch. However, we currently build FlashAttention 3 unconditionally.
If you could debug that logic and see why it doesn't work for you, we'll happily fix it!
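As a quick sanity check of what your environment already provides (this only inspects PyTorch's own SDPA backend toggle, so treat it as a rough signal rather than the exact condition our build logic tests):

```python
import torch

# Version info the compatibility check depends on
print("torch:", torch.__version__, "| CUDA:", torch.version.cuda)

# True if this PyTorch build has its flash-attention SDPA backend enabled
print("flash SDPA enabled:", torch.backends.cuda.flash_sdp_enabled())
```

After installing, `python -m xformers.info` also lists which attention operators xformers can actually dispatch to.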
bertmaher pushed a commit to bertmaher/xformers that referenced this issue on Dec 20, 2024:
* Add torch compile support for BlockDiagonalMask
* Update AttentionBias.to method to match torch.Tensor
* Comment from bottler
* Construct biases by default on CUDA device, do not convert to CUDA when calling mem-eff - error instead
* Update decoder.py and rope_padded
* Fix mypy
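For anyone following along, a minimal sketch of the `BlockDiagonalMask` usage these changes touch (the shapes, dtype, and device here are illustrative, not required by the API):

```python
import torch
from xformers.ops import memory_efficient_attention
from xformers.ops.fmha.attn_bias import BlockDiagonalMask

# Pack two variable-length sequences (3 and 5 tokens) into one "batch"
mask = BlockDiagonalMask.from_seqlens([3, 5])

# Packed layout: [1, total_tokens, num_heads, head_dim]
q = torch.randn(1, 8, 4, 64, device="cuda", dtype=torch.float16)
k = torch.randn_like(q)
v = torch.randn_like(q)

out = memory_efficient_attention(q, k, v, attn_bias=mask)

# With the torch.compile support added above, wrapping the call should
# now trace through the mask instead of graph-breaking on it.
fn = torch.compile(lambda q, k, v: memory_efficient_attention(q, k, v, attn_bias=mask))
out_compiled = fn(q, k, v)
```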