
build without flash attn? #1145

Open
sipie800 opened this issue Nov 7, 2024 · 1 comment

Comments


sipie800 commented Nov 7, 2024

I'm building from source. I already had flash-attn installed, and it took a long time to build. While installing xformers, I noticed it was building flash-attn again, which again took a long time. Can I just use xformers without flash-attn? Is this behavior a feature of newer versions? If I don't want flash-attn in xformers, which version should I roll back to?

Nowadays in AI, people rapidly test tons of different libraries and repos every day. Please take that into consideration; we can't afford heavy build times in daily work. I believe heavy builds are acceptable for industry deployment, but xformers is meant for research, right? Please avoid heavy builds as much as possible.

lw (Contributor) commented Nov 7, 2024

We have some logic here to avoid building FlashAttention 2 if an existing compatible copy is provided by PyTorch. However, we always build FlashAttention 3 unconditionally.

If you could debug that logic and see why it doesn't work for you, we'll happily fix it!
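
For illustration, here is a minimal sketch of what such a build-time check could look like. This is not the actual xformers setup.py logic; the function names, the PyTorch version cutoff, and the choice of probes are assumptions made for the example.

```python
# Sketch only (assumption, not the real xformers setup.py): decide at build
# time whether FlashAttention 2 must be compiled from source or whether the
# kernels that ship with PyTorch can be reused instead.
import torch
from packaging.version import Version


def pytorch_ships_flash_attention() -> bool:
    """Heuristic (assumed cutoff): recent PyTorch releases bundle
    FlashAttention-based scaled_dot_product_attention kernels."""
    return Version(torch.__version__.split("+")[0]) >= Version("2.2.0")


def flash_sdp_enabled() -> bool:
    """Runtime probe: is the flash backend of scaled_dot_product_attention
    enabled in this PyTorch build? Older PyTorch lacks this API."""
    try:
        return torch.backends.cuda.flash_sdp_enabled()
    except AttributeError:
        return False


if pytorch_ships_flash_attention() and flash_sdp_enabled():
    print("Reusing PyTorch's FlashAttention kernels; skipping the source build.")
else:
    print("Compiling FlashAttention 2 from source (slow).")
```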

bertmaher pushed a commit to bertmaher/xformers that referenced this issue Dec 20, 2024
* Add torch compile support for BlockDiagonalMask

* Update AttentionBias.to method to match torch.Tensor

* Comment from bottler

* Construct biases by default on CUDA device, do not convert to CUDA when calling mem-eff - error instead

* Update decoder.py and rope_padded

* Fix mypy