author: Richard Sandiford <richard.sandiford@linaro.org> 2018-07-12 13:01:48 +0000
committer: Richard Sandiford <rsandifo@gcc.gnu.org> 2018-07-12 13:01:48 +0000
commit: 0936858f081b77319f8f6e5825dc86d2861d0445
tree: 87c7aa8363d38fe0e4022d33f8a25f14b6dc8ff8 /gcc/internal-fn.h
parent: b41d1f6ed753bf7ae7e68f745e50c26ee65b5711
Support fused multiply-adds in fully-masked reductions

This patch adds support for fusing a conditional add or subtract
with a multiplication, so that we can use fused multiply-add and
multiply-subtract operations for fully-masked reductions.  E.g.
for SVE we vectorise:

  double res = 0.0;
  for (int i = 0; i < n; ++i)
    res += a[i] * b[i];

using a fully-masked loop in which the loop body has the form:

  res_1 = PHI<0(preheader), res_2(latch)>;
  avec = .MASK_LOAD (loop_mask, a);
  bvec = .MASK_LOAD (loop_mask, b);
  prod = avec * bvec;
  res_2 = .COND_ADD (loop_mask, res_1, prod, res_1);

where the last statement does the equivalent of:

  res_2 = loop_mask ? res_1 + prod : res_1;

(operating elementwise).  The point of the patch is to convert the last
two statements into:

  res_2 = .COND_FMA (loop_mask, avec, bvec, res_1, res_1);

which is equivalent to:

  res_2 = loop_mask ? fma (avec, bvec, res_1) : res_1;

(again operating elementwise).

2018-07-12  Richard Sandiford  <richard.sandiford@linaro.org>
            Alan Hayward  <alan.hayward@arm.com>
            David Sherwood  <david.sherwood@arm.com>

gcc/
        * internal-fn.h (can_interpret_as_conditional_op_p): Declare.
        * internal-fn.c (can_interpret_as_conditional_op_p): New function.
        * tree-ssa-math-opts.c (convert_mult_to_fma_1): Handle conditional
        plus and minus and convert them into IFN_COND_FMA-based sequences.
        (convert_mult_to_fma): Handle conditional plus and minus.

gcc/testsuite/
        * gcc.dg/vect/vect-fma-2.c: New test.
        * gcc.target/aarch64/sve/reduc_4.c: Likewise.
        * gcc.target/aarch64/sve/reduc_6.c: Likewise.
        * gcc.target/aarch64/sve/reduc_7.c: Likewise.

Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>

From-SVN: r262588
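
The elementwise semantics described in the commit message can be modelled in plain C. The following is only an illustrative sketch, not code from the patch: the function names `cond_mul_add` and `cond_fma`, the `bool` array standing in for `loop_mask`, and the per-lane accumulator array are all assumptions made for the example. The multiply-add is written as `a * b + c` to keep the sketch free of libm; `fma ()` from `<math.h>` would give the single rounding of a true fused operation.

```c
#include <stdbool.h>
#include <stddef.h>

/* Unfused form (hypothetical model): prod = a * b followed by
   .COND_ADD (mask, acc, prod, acc).  Inactive lanes keep acc.  */
static void
cond_mul_add (const bool *mask, const double *a, const double *b,
              double *acc, size_t n)
{
  for (size_t i = 0; i < n; ++i)
    {
      double prod = a[i] * b[i];
      acc[i] = mask[i] ? acc[i] + prod : acc[i];
    }
}

/* Fused form (hypothetical model): .COND_FMA (mask, a, b, acc, acc).
   The multiply and add are a single selected operation per lane;
   a real FMA would also round only once.  */
static void
cond_fma (const bool *mask, const double *a, const double *b,
          double *acc, size_t n)
{
  for (size_t i = 0; i < n; ++i)
    acc[i] = mask[i] ? a[i] * b[i] + acc[i] : acc[i];
}
```

For inputs whose products and sums are exactly representable, both functions leave the same values in `acc`. In general the two forms can differ in the last bit, since a fused multiply-add skips the intermediate rounding of the product.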
Diffstat (limited to 'gcc/internal-fn.h')
-rw-r--r-- gcc/internal-fn.h | 3
1 files changed, 3 insertions, 0 deletions
diff --git a/gcc/internal-fn.h b/gcc/internal-fn.h
index 7105c3bbff8..2296ca0c539 100644
--- a/gcc/internal-fn.h
+++ b/gcc/internal-fn.h
@@ -196,6 +196,9 @@ extern internal_fn get_conditional_internal_fn (tree_code);
 extern internal_fn get_conditional_internal_fn (internal_fn);
 extern tree_code conditional_internal_fn_code (internal_fn);
 extern internal_fn get_unconditional_internal_fn (internal_fn);
+extern bool can_interpret_as_conditional_op_p (gimple *, tree *,
+					       tree_code *, tree (&)[3],
+					       tree *);
 extern bool internal_load_fn_p (internal_fn);
 extern bool internal_store_fn_p (internal_fn);