[Feature]: tensor parallelism support for bnb quantization (via IBM's fork) #767

Open
BlairSadewitz opened this issue Sep 28, 2024 · 3 comments

Comments

BlairSadewitz commented Sep 28, 2024

🚀 The feature, motivation and pitch

I don't know whether it's feasible or worthwhile to merge IBM's tensor-parallelism support for bnb quantization here, since the trees may have diverged too much, but cherry-picking commits from projects I don't fully understand is somehow a pastime of mine, so ...

Alternatives

I could always use one of the other 8.4234234*10^23 quantization methods, but, hey, variety is the spice of life--or something.

Additional context

It doesn't work for pre-quantized models. 🎉~
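
For illustration, a minimal sketch of what the requested combination could look like from the user side, assuming Aphrodite keeps a vLLM-style `LLM(...)` entry point; the import path and the `quantization`/`tensor_parallel_size` parameter names are assumptions, not the merged API:

```python
# Hypothetical usage sketch (names assumed, not the merged API): load a
# full-precision checkpoint, quantize it with bitsandbytes on load, and
# shard the quantized weights across two GPUs with tensor parallelism.
from aphrodite import LLM, SamplingParams  # assumed vLLM-style entry point

llm = LLM(
    model="meta-llama/Llama-2-13b-hf",  # full-precision weights; as noted above,
                                        # pre-quantized bnb checkpoints don't work
    quantization="bitsandbytes",        # assumed flag for on-load bnb quantization
    tensor_parallel_size=2,             # split each layer across 2 GPUs
)

outputs = llm.generate(["Hello, world"], SamplingParams(max_tokens=16))
print(outputs[0].outputs[0].text)
```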

@AlpinDale
Member

Perhaps. I'll have to look into it; bnb hasn't been a priority.

@BlairSadewitz
Author

Yeah, I hear you. I'm gonna file a better PR in a second, though, so ... ;-)

@AlpinDale
Member

FYI, I'm working on new kernels to massively speed up bnb quants and add TP support for them. You might want to hold off for now, or help out with that upcoming PR if you're comfortable with CUDA.
