Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
[RISC-V] Drop undesirable two instruction macc alternatives
So I was looking at sub_dct a little while ago and was surprised to see us emit two instructions out of a single pattern. We generally try to avoid that -- it's not always possible, but as a general rule of thumb it should be avoided. Specifically I saw: > vmv1r.v v4,v2 # 138 [c=4 l=4] *pred_mul_plusrvvm1hi_undef/5 > vmacc.vv v4,v8,v1 When we emit multiple instructions out of a single pattern we can't build a good schedule as we can't really describe the two instructions well and we can't split them up -- they move as an atomic unit. These cases can also raise correctness issues if the pattern doesn't properly account for both instructions in its length computation. Note the length, 4 bytes. So this is both a performance and latent correctness issue. It appears that these alternatives are meant to deal with the case when we have three source inputs and a non-matching output. The author did put in "?" to slightly disparage these alternatives, but a "!" would have been better. The best solution is to just remove those alternatives and let the allocator manage the matching operand issue. That's precisely what this patch does. For the various integer multiply-add/multiply-accumulate patterns we drop the alternatives which don't require a match between the output and one of the inputs. That fixes the correctness issue and should shave a cycle or two off our sub_dct code. Essentially the move bubbles up into an empty slot and we can schedule around the vmacc sensibly. Interestingly enough this fixes a scan-assembler test in my tester for both rv32 and rv64. > Tests that now work, but didn't before (10 tests): > > unix/-march=rv32gcv: gcc: gcc.target/riscv/rvv/autovec/ternop/ternop_nofm-3.c scan-assembler-times \\tvmacc\\.vv 8 > unix/-march=rv32gcv: gcc: gcc.target/riscv/rvv/autovec/ternop/ternop_nofm-3.c scan-assembler-times \\tvmacc\\.vv 8 > unix/-march=rv32gcv: gcc: gcc.target/riscv/rvv/autovec/ternop/ternop_nofm-3.c scan-assembler-times \\tvmacc\\.vv 8 > unix/-march=rv32gcv: gcc: gcc.target/riscv/rvv/autovec/ternop/ternop_nofm-3.c scan-assembler-times \\tvmacc\\.vv 8 > unix/-march=rv32gcv: gcc: gcc.target/riscv/rvv/autovec/ternop/ternop_nofm-3.c scan-assembler-times \\tvmacc\\.vv 8 > unix/-march=rv32gcv: gcc: gcc.target/riscv/rvv/autovec/ternop/ternop_nofm-3.c scan-assembler-times \\tvmacc\\.vv 8 > unix/-march=rv32gcv: gcc: gcc.target/riscv/rvv/autovec/ternop/ternop_nofm-3.c scan-assembler-times \\tvmacc\\.vv 8 > unix/-march=rv32gcv: gcc: gcc.target/riscv/rvv/autovec/ternop/ternop_nofm-3.c scan-assembler-times \\tvmacc\\.vv 8 > unix/-march=rv32gcv: gcc: gcc.target/riscv/rvv/autovec/ternop/ternop_nofm-3.c scan-assembler-times \\tvmacc\\.vv 8 > unix/-march=rv32gcv: gcc: gcc.target/riscv/rvv/autovec/ternop/ternop_nofm-3.c scan-assembler-times \\tvmacc\\.vv 8 My BPI is already in a bootstrap test, so this patch won't hit the BPI for bootstrapping until Wednesday, meaning no data until Thursday. Will wait for the pre-commit tester though. gcc/ * config/riscv/vector.md (pred_mul_plus<mode>_undef): Drop alternatives where output doesn't have to match input. (pred_madd<mode>, pred_macc<mode>): Likewise. (pred_madd<mode>_scalar, pred_macc<mode>_scalar): Likewise. (pred_madd<mode>_exended_scalar): Likewise. (pred_macc<mode>_exended_scalar): Likewise. (pred_minus_mul<mode>_undef): Likewise. (pred_nmsub<mode>, pred_nmsac<mode>): Likewise. (pred_nmsub<mode>_scalar, pred_nmsac<mode>_scalar): Likewise. (pred_nmsub<mode>_exended_scalar): Likewise. (pred_nmsac<mode>_exended_scalar): Likewise.
- Loading branch information