author | Richard Sandiford <richard.sandiford@linaro.org> | 2018-01-13 17:58:33 +0000 |
---|---|---|
committer | Richard Sandiford <rsandifo@gcc.gnu.org> | 2018-01-13 17:58:33 +0000 |
commit | f1739b4829105fa95d6ff6244632d5977169277f | |
tree | 6fa54454f34ebc19c98bb815526c6e51c8bca72d /gcc/optabs.def | |
parent | 018b2744fc7a4fe6fea1a078eae69c5465585668 | |
SLP reductions with variable-length vectors

Two things stopped us using SLP reductions with variable-length vectors:

(1) We didn't have a way of constructing the initial vector.
    This patch does it by creating a vector full of the neutral
    identity value and then using a shift-and-insert function
    to insert any non-identity inputs into the low-numbered elements.
    (The non-identity values are needed for double reductions.)
    Alternatively, for unchained MIN/MAX reductions that have no neutral
    value, we instead use the same duplicate-and-interleave approach as
    for SLP constant and external definitions (added by a previous
    patch).

(2) The epilogue for constant-length vectors would extract the vector
    elements associated with each SLP statement and do scalar arithmetic
    on these individual elements.  For variable-length vectors, the patch
    instead creates a reduction vector for each SLP statement, replacing
    the elements for other SLP statements with the identity value.
    It then uses a hardware reduction instruction on each vector.
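
As a rough illustration of points (1) and (2), the following scalar C model mimics both steps on a plain array. It is only a sketch under stated assumptions, not the code in tree-vect-loop.c. The helper names (shl_insert, build_initial_vector, reduce_for_stmt) are invented for this example, the reduction is assumed to be a PLUS whose neutral value the caller passes in, and a fixed-size array stands in for a variable-length vector. It further assumes that shift-and-insert moves every element one lane away from element 0 and writes the new scalar into element 0, and that SLP statement i owns elements i, i + group_size, i + 2 * group_size and so on.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed behaviour of shift-and-insert: move every element one lane
   away from element 0, then store VALUE in element 0.  The old last
   element is dropped.  */
static void shl_insert(int *vec, size_t nelts, int value)
{
  assert(nelts > 0);
  for (size_t i = nelts - 1; i > 0; i--)
    vec[i] = vec[i - 1];
  vec[0] = value;
}

/* Point (1): fill the vector with the neutral value, then shift the
   GROUP_SIZE initial values in, last one first, so that inits[0]
   ends up in element 0, inits[1] in element 1, and so on.  */
static void build_initial_vector(int *vec, size_t nelts, int neutral,
                                 const int *inits, size_t group_size)
{
  assert(group_size <= nelts);
  for (size_t i = 0; i < nelts; i++)
    vec[i] = neutral;
  for (size_t k = group_size; k > 0; k--)
    shl_insert(vec, nelts, inits[k - 1]);
}

/* Point (2): form the reduction vector for statement STMT by replacing
   the elements that belong to other statements with the neutral value,
   then reduce the whole vector.  The loop plays the role of a single
   hardware add-reduction over the masked vector.  */
static int reduce_for_stmt(const int *vec, size_t nelts, int neutral,
                           size_t group_size, size_t stmt)
{
  /* Hypothetical interleaved layout: needs a whole number of groups.  */
  assert(group_size > 0 && nelts % group_size == 0);
  int sum = 0;
  for (size_t i = 0; i < nelts; i++)
    sum += (i % group_size == stmt) ? vec[i] : neutral;
  return sum;
}
```

With eight elements, two SLP statements and initial values 10 and 20, build_initial_vector with neutral value 0 produces {10, 20, 0, 0, 0, 0, 0, 0}. Calling reduce_for_stmt once per statement then sums the even-numbered and odd-numbered elements separately, without ever extracting individual elements.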
2018-01-13 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* doc/md.texi (vec_shl_insert_@var{m}): New optab.
* internal-fn.def (VEC_SHL_INSERT): New internal function.
* optabs.def (vec_shl_insert_optab): New optab.
* tree-vectorizer.h (can_duplicate_and_interleave_p): Declare.
(duplicate_and_interleave): Likewise.
* tree-vect-loop.c: Include internal-fn.h.
(neutral_op_for_slp_reduction): New function, split out from
get_initial_defs_for_reduction.
(get_initial_def_for_reduction): Handle option 2 for variable-length
vectors by loading the neutral value into a vector and then shifting
the initial value into element 0.
(get_initial_defs_for_reduction): Replace the code argument with
the neutral value calculated by neutral_op_for_slp_reduction.
Use gimple_build_vector for constant-length vectors.
Use IFN_VEC_SHL_INSERT for variable-length vectors if all
but the first group_size elements have a neutral value.
Use duplicate_and_interleave otherwise.
(vect_create_epilog_for_reduction): Take a neutral_op parameter.
Update call to get_initial_defs_for_reduction. Handle SLP
reductions for variable-length vectors by creating one vector
result for each scalar result, with the elements associated
with other scalar results stubbed out with the neutral value.
(vectorizable_reduction): Call neutral_op_for_slp_reduction.
Require IFN_VEC_SHL_INSERT for double reductions on
variable-length vectors, or SLP reductions that have
a neutral value. Require can_duplicate_and_interleave_p
support for variable-length unchained SLP reductions if there
is no neutral value, such as for MIN/MAX reductions. Also require
the number of vector elements to be a multiple of the number of
SLP statements when doing variable-length unchained SLP reductions.
Update call to vect_create_epilog_for_reduction.
* tree-vect-slp.c (can_duplicate_and_interleave_p): Make public
and remove initial values.
(duplicate_and_interleave): Make public.
* config/aarch64/aarch64.md (UNSPEC_INSR): New unspec.
* config/aarch64/aarch64-sve.md (vec_shl_insert_<mode>): New insn.
gcc/testsuite/
* gcc.dg/vect/pr37027.c: Remove XFAIL for variable-length vectors.
* gcc.dg/vect/pr67790.c: Likewise.
* gcc.dg/vect/slp-reduc-1.c: Likewise.
* gcc.dg/vect/slp-reduc-2.c: Likewise.
* gcc.dg/vect/slp-reduc-3.c: Likewise.
* gcc.dg/vect/slp-reduc-5.c: Likewise.
* gcc.target/aarch64/sve/slp_5.c: New test.
* gcc.target/aarch64/sve/slp_5_run.c: Likewise.
* gcc.target/aarch64/sve/slp_6.c: Likewise.
* gcc.target/aarch64/sve/slp_6_run.c: Likewise.
* gcc.target/aarch64/sve/slp_7.c: Likewise.
* gcc.target/aarch64/sve/slp_7_run.c: Likewise.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r256623
Diffstat (limited to 'gcc/optabs.def')
-rw-r--r-- | gcc/optabs.def | 1 |
1 file changed, 1 insertion, 0 deletions
diff --git a/gcc/optabs.def b/gcc/optabs.def
index c22708b6943..ec5f5f544ea 100644
--- a/gcc/optabs.def
+++ b/gcc/optabs.def
@@ -368,3 +368,4 @@ OPTAB_D (set_thread_pointer_optab, "set_thread_pointer$I$a")
 
 OPTAB_DC (vec_duplicate_optab, "vec_duplicate$a", VEC_DUPLICATE)
 OPTAB_DC (vec_series_optab, "vec_series$a", VEC_SERIES)
+OPTAB_D (vec_shl_insert_optab, "vec_shl_insert_$a")