
build without flash attn? #1145

Open
sipie800 opened this issue Nov 7, 2024 · 1 comment

Comments


sipie800 commented Nov 7, 2024

I'm building from source. I already had flash-attn installed, and it took a long time to build. While installing xformers, I noticed it was building flash-attn again, which again took a long time. Can I just use xformers without flash-attn? Is this behavior a feature of newer versions? If I don't want flash-attn in xformers, which version should I roll back to?

Nowadays in AI, people rapidly test tons of different libraries and repos every day. Please take that into consideration; we can't afford heavy build times in daily work. I believe heavy builds are acceptable for industry deployment, but xformers is meant for research, right? Please avoid heavy builds as much as possible.

lw (Contributor) commented Nov 7, 2024

We have some logic here to avoid building FlashAttention 2 if an existing compatible copy is provided by PyTorch. However, we always build FlashAttention 3 unconditionally.

If you could debug that logic and see why it doesn't work for you, we'll happily fix it!
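
For illustration, here is a minimal sketch of what such a build-time check could look like. This is not the actual xformers setup.py logic; the function names, the PyTorch version cutoff, and the choice of probes are assumptions made for the example.

```python
# Sketch only (assumption, not the real xformers setup.py): decide at build
# time whether FlashAttention 2 must be compiled from source or whether the
# kernels that ship with PyTorch can be reused instead.
import torch
from packaging.version import Version


def pytorch_ships_flash_attention() -> bool:
    """Heuristic (assumed cutoff): recent PyTorch releases bundle
    FlashAttention-based scaled_dot_product_attention kernels."""
    return Version(torch.__version__.split("+")[0]) >= Version("2.2.0")


def flash_sdp_enabled() -> bool:
    """Runtime probe: is the flash backend of scaled_dot_product_attention
    enabled in this PyTorch build? Older PyTorch lacks this API."""
    try:
        return torch.backends.cuda.flash_sdp_enabled()
    except AttributeError:
        return False


if pytorch_ships_flash_attention() and flash_sdp_enabled():
    print("Reusing PyTorch's FlashAttention kernels; skipping the source build.")
else:
    print("Compiling FlashAttention 2 from source (slow).")
```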

bertmaher pushed a commit to bertmaher/xformers that referenced this issue Dec 20, 2024
* Add torch compile support for BlockDiagonalMask

* Update AttentionBias.to method to match torch.Tensor

* Comment from bottler

* Construct biases by default on CUDA device, do not convert to CUDA when calling mem-eff - error instead

* Update decoder.py and rope_padded

* Fix mypy