Hello,

llama.cpp recently added support for AArch64-specific GGUF model types and AArch64-specific matmul kernels; here is the merged PR: ggerganov/llama.cpp#5780

Namely, the Q4_0_8_8, Q4_0_4_8, and the more generic Q4_0_4_4 GGUF model formats.
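For context, a rough sketch of what these formats look like, going by the llama.cpp PR: Q4_0_4_4 takes four ordinary Q4_0 blocks (one per row) and interleaves their quant bytes in 4-byte groups, so a 4x4 tile of weights can be loaded with a single SIMD load. The Rust below mirrors ggml's `block_q4_0x4` and `make_block_q4_0x4`, but the names and the standalone layout are mine, not code from either repo:

```rust
/// Q4_0: 32 weights per block -> one f16 scale plus 16 bytes of packed
/// 4-bit quants (ggml's block_q4_0, 18 bytes total).
const QK4_0: usize = 32;

#[repr(C)]
#[derive(Clone, Copy)]
struct BlockQ4_0 {
    d: u16,              // f16 scale, stored as raw bits
    qs: [u8; QK4_0 / 2], // 32 x 4-bit quants, two per byte
}

/// Q4_0_4_4: four Q4_0 blocks from four consecutive rows, with their quant
/// bytes interleaved in 4-byte groups (mirrors ggml's block_q4_0x4).
#[repr(C)]
#[derive(Clone, Copy)]
struct BlockQ4_0x4 {
    d: [u16; 4],         // one f16 scale per source row
    qs: [u8; QK4_0 * 2], // 4 x 16 quant bytes, interleaved
}

/// Repack four per-row Q4_0 blocks into the interleaved layout; a sketch of
/// what llama.cpp's make_block_q4_0x4 does with a 4-byte interleave width.
fn pack_q4_0x4(rows: &[BlockQ4_0; 4]) -> BlockQ4_0x4 {
    const INTERLEAVE: usize = 4; // bytes taken from one row at a time
    let mut out = BlockQ4_0x4 { d: [0; 4], qs: [0; QK4_0 * 2] };
    for (i, b) in rows.iter().enumerate() {
        out.d[i] = b.d;
    }
    for i in 0..QK4_0 * 2 {
        let src_row = (i % (4 * INTERLEAVE)) / INTERLEAVE;
        let src_off = (i / (4 * INTERLEAVE)) * INTERLEAVE + (i % INTERLEAVE);
        out.qs[i] = rows[src_row].qs[src_off];
    }
    out
}

fn main() {
    // Sanity-check the sizes: 18 bytes per Q4_0 block, 72 per group of four.
    assert_eq!(std::mem::size_of::<BlockQ4_0>(), 18);
    assert_eq!(std::mem::size_of::<BlockQ4_0x4>(), 72);
    let rows = [BlockQ4_0 { d: 0, qs: [0; QK4_0 / 2] }; 4];
    let _packed = pack_q4_0x4(&rows);
}
```

Q4_0_4_8 and Q4_0_8_8 follow the same pattern with an 8-byte interleave width (and eight rows for the latter), as far as I can tell from the PR.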
smpurkis changed the title from "Add kernel support for AArch64 specific GGUF files" to "Add kernel support for AArch64 specific GGUF files, i.e. Q4_0_*_*" on Sep 27, 2024
@EricLBuehler I looked through the code and saw that Candle is used for quantized tensors, so I've started working on adding the datatype to Candle: huggingface/candle#2605

I could use some guidance on whether that is the right place to add it.
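For anyone reviewing that Candle PR, this is roughly what the decode path for the new dtype has to do: undo the interleave, then apply the standard Q4_0 decode (w = d * (q - 8)). A sketch only, reusing the layout from my comment above and the half crate (which Candle already depends on), not actual Candle code:

```rust
use half::f16; // the `half` crate, already a Candle dependency

// Same layout as in the sketch above.
const QK4_0: usize = 32;

#[repr(C)]
struct BlockQ4_0x4 {
    d: [u16; 4],         // one f16 scale per source row, raw bits
    qs: [u8; QK4_0 * 2], // 4 x 16 quant bytes, interleaved in 4-byte groups
}

/// Dequantize one interleaved block back to four rows of 32 f32 weights:
/// de-interleave the bytes into per-row quant buffers, then apply the usual
/// Q4_0 decode, with low nibbles holding elements 0..16 of a row and high
/// nibbles elements 16..32, both biased by 8.
fn dequantize_q4_0x4(blk: &BlockQ4_0x4) -> [[f32; QK4_0]; 4] {
    const INTERLEAVE: usize = 4;
    let mut qs = [[0u8; QK4_0 / 2]; 4];
    for i in 0..QK4_0 * 2 {
        let row = (i % (4 * INTERLEAVE)) / INTERLEAVE;
        let off = (i / (4 * INTERLEAVE)) * INTERLEAVE + (i % INTERLEAVE);
        qs[row][off] = blk.qs[i];
    }
    let mut out = [[0f32; QK4_0]; 4];
    for row in 0..4 {
        let d = f16::from_bits(blk.d[row]).to_f32();
        for j in 0..QK4_0 / 2 {
            let byte = qs[row][j];
            out[row][j] = d * ((byte & 0x0f) as f32 - 8.0);
            out[row][j + QK4_0 / 2] = d * ((byte >> 4) as f32 - 8.0);
        }
    }
    out
}

fn main() {
    let blk = BlockQ4_0x4 { d: [0; 4], qs: [0; QK4_0 * 2] };
    let _rows = dequantize_q4_0x4(&blk);
}
```

That only pins down the data layout; the AArch64 matmul kernels in the llama.cpp PR consume the interleaved bytes directly without this unpacking step.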