Intrinsic Support for BF16 Extension #223
Comments
Maybe we can reuse most RVV intrinsics unless we want to generate the new Zvfbfmin/Zvfbfwma instructions. I checked the LLVM implementation of the vector BF16 widening multiply-accumulate for AArch64 SVE: it uses llvm.aarch64.sve.fmlalb.nxv4f32 for fmlalb and llvm.aarch64.sve.bfmlalb for bfmlalb. So what is certain for RISC-V right now is that we need to define a new intrinsic for vfwmaccbf16. That is, where FP16 uses llvm.riscv.vfwmacc.nxv8f32.nxv8f16, the BF16 format needs a separate llvm.riscv.vfwmaccbf16.
We definitely need to introduce new types like vbfloat16m1_t and a corresponding RVV C intrinsic API.
That's an LLVM implementation detail, which should not be specified in the intrinsic API spec.
But I don't think we need to add a bfloat16 type to all the RVV floating-point intrinsics if we define a function to convert bf16 to fp32/fp16. Z(v)fbfmin has the corresponding instructions.
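For reference, the per-element effect of such a conversion can be modeled in portable scalar C. This is only a sketch: the helper names `bf16_to_f32` and `f32_to_bf16` are made up for illustration and are not part of the RVV intrinsic API, and the narrowing direction assumes round-to-nearest-even with a quiet NaN returned for NaN inputs.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: widen one bf16 value (raw 16 bits) to fp32.
 * bf16 has the same sign and exponent layout as IEEE binary32 and keeps
 * only the top 7 fraction bits, so widening is a 16-bit left shift. */
static float bf16_to_f32(uint16_t h) {
    uint32_t bits = (uint32_t)h << 16;
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Hypothetical helper: narrow fp32 to bf16.
 * Assumes round-to-nearest, ties-to-even; NaN inputs give a quiet NaN. */
static uint16_t f32_to_bf16(float f) {
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    if ((bits & 0x7fffffffu) > 0x7f800000u) {
        return 0x7fc0; /* quiet NaN */
    }
    bits += 0x7fffu + ((bits >> 16) & 1u); /* rounding bias, ties to even */
    return (uint16_t)(bits >> 16);
}
```

Widening is always exact; only the narrowing direction can lose precision.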
At least we should define intrinsics for the convert instructions, define _riscv_vfwmaccbf16[vv|vf]_bf16* for Zvfbfwma, and add some type utility functions such as reinterpret.
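To make the intended semantics concrete, here is a scalar reference model of a BF16 widening multiply-accumulate, i.e. what a vfwmaccbf16 vector-vector form would compute per element. It is a sketch in portable C, not the intrinsic itself: `vfwmaccbf16_vv_ref` and `bf16_to_f32` are hypothetical names, and masking and tail policy are ignored.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: widen raw bf16 bits to fp32 (16-bit left shift). */
static float bf16_to_f32(uint16_t h) {
    uint32_t bits = (uint32_t)h << 16;
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Hypothetical scalar model: acc[i] += a[i] * b[i] for i < vl, where a and b
 * hold raw bf16 bits and acc is fp32. The product of two bf16 values has at
 * most a 16-bit significand, so it is exact in fp32 unless it overflows or
 * underflows; a separate multiply and add then matches a fused one. */
static void vfwmaccbf16_vv_ref(float *acc, const uint16_t *a,
                               const uint16_t *b, size_t vl) {
    for (size_t i = 0; i < vl; ++i) {
        acc[i] += bf16_to_f32(a[i]) * bf16_to_f32(b[i]);
    }
}
```

A vector-scalar form would be the same loop with a single bf16 scalar in place of `a[i]`.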
So I'll add bf16-format intrinsics for the following functions with the float16 type.
As for __riscv_vfwcvt_f_f_v_f32 and __riscv_vfncvt_f_f_w_f16, I prefer to use a new format based on vfwcvtbf16.f.f.v and vfncvtbf16.f.f.w from the new Zvfbfmin extension, so I didn't include them.
Would it make sense to introduce BF16 load/store intrinsics that do a 16-bit integer load followed by a reinterpret cast? This would simplify the user interface considerably.
Thank you for your suggestion. I'll add load/store intrinsics in my PR.
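As a rough illustration of that idea in portable scalar C (not the RVV API), a BF16 load can be nothing more than a 16-bit integer load whose bits are relabeled as BF16, with no value conversion. The type `bf16_t` and the functions `bf16_load` and `bf16_store` below are hypothetical names used only for this sketch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical BF16 element type: the same 16 bits as uint16_t, but a
 * distinct C type so it cannot be mixed with integers by accident. */
typedef struct { uint16_t bits; } bf16_t;

/* Hypothetical load: a 16-bit integer load, then a bit-for-bit reinterpret. */
static void bf16_load(bf16_t *dst, const uint16_t *src, size_t n) {
    for (size_t i = 0; i < n; ++i) {
        dst[i].bits = src[i];
    }
}

/* Hypothetical store: reinterpret back to 16-bit integers and store. */
static void bf16_store(uint16_t *dst, const bf16_t *src, size_t n) {
    for (size_t i = 0; i < n; ++i) {
        dst[i] = src[i].bits;
    }
}
```

Because no bits change, a load followed by a store returns the original buffer exactly.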
The BF16 Extension has recently been proposed with three extra Instruction Set Extensions.
https://github.com/riscv/riscv-bfloat16
I'm wondering how we plan to address existing rvv intrinsics. Do we need to add new intrinsics tailored for bf16 datatype? If so, do we need to give a BF16 version for all the intrinsics with floating-point types? Maybe we can raise an issue for discussion.