summaryrefslogtreecommitdiff
path: root/gcc/tree-vect-slp.c
AgeCommit message (Collapse)Author
2020-05-08move permutation validity checkRichard Biener
This delays the SLP permutation check to vectorizable_load and optimizes permutations only after all SLP instances have been generated and the vectorization factor is determined. 2020-05-08 Richard Biener <rguenther@suse.de> * tree-vectorizer.h (vec_info::slp_loads): New. (vect_optimize_slp): Declare. * tree-vect-slp.c (vect_attempt_slp_rearrange_stmts): Do nothing when there are no loads. (vect_gather_slp_loads): Gather loads into a vector. (vect_supported_load_permutation_p): Remove. (vect_analyze_slp_instance): Do not verify permutation validity here. (vect_analyze_slp): Optimize permutations of reductions after all SLP instances have been gathered and gather all loads. (vect_optimize_slp): New function split out from vect_supported_load_permutation_p. Elide some permutations. (vect_slp_analyze_bb_1): Call vect_optimize_slp. * tree-vect-loop.c (vect_analyze_loop_2): Likewise. * tree-vect-stmts.c (vectorizable_load): Check whether the load can be permuted. When generating code assert we can. * gcc.dg/vect/bb-slp-pr68892.c: Adjust for not supported SLP permutations becoming builds from scalars. * gcc.dg/vect/bb-slp-pr78205.c: Likewise. * gcc.dg/vect/bb-slp-34.c: Likewise.
2020-05-06Prepare removal of SLP_INSTANCE_GROUP_SIZERichard Biener
This removes trivial instances of SLP_INSTANCE_GROUP_SIZE and refrains from using a "SLP instance" which nowadays is just one of the possibly many entries into the SLP graph. 2020-05-06 Richard Biener <rguenther@suse.de> * tree-vectorizer.h (vect_transform_slp_perm_load): Adjust. * tree-vect-data-refs.c (vect_slp_analyze_node_dependences): Remove slp_instance parameter, just iterate over all scalar stmts. (vect_slp_analyze_instance_dependence): Adjust and likewise. * tree-vect-slp.c (vect_bb_slp_scalar_cost): Remove unused BB parameter. (vect_schedule_slp): Just iterate over all scalar stmts. (vect_supported_load_permutation_p): Adjust. (vect_transform_slp_perm_load): Remove slp_instance parameter, instead use the number of lanes in the node as group size. * tree-vect-stmts.c (vect_model_load_cost): Get vectorization factor instead of slp_instance as parameter. (vectorizable_load): Adjust.
2020-05-05rewrite hybrid SLP detectionRichard Biener
This rewrites hybrid SLP detection to be simpler and cope with group size changes in the SLP graph. In particular detection works starting from non-SLP stmts following use->def chains rather than walking the SLP graph and following def->use chains. It's all temporary of course since non-SLP and thus hybrid SLP will go away. 2020-05-05 Richard Biener <rguenther@suse.de> * tree-vect-slp.c (struct vdhs_data): New. (vect_detect_hybrid_slp): New walker. (vect_detect_hybrid_slp): Rewrite.
2020-05-05add vec_info * parameters where neededRichard Biener
Soonish we'll get SLP nodes which have no corresponding scalar stmt and thus not stmt_vec_info and thus no way to get back to the associated vec_info. This patch makes the vec_info available as part of the APIs instead of putting in that back-pointer into the leaf data structures. 2020-05-05 Richard Biener <rguenther@suse.de> * tree-vectorizer.h (_stmt_vec_info::vinfo): Remove. (STMT_VINFO_LOOP_VINFO): Likewise. (STMT_VINFO_BB_VINFO): Likewise. * tree-vect-data-refs.c: Adjust for the above, adding vec_info * parameters and adjusting calls. * tree-vect-loop-manip.c: Likewise. * tree-vect-loop.c: Likewise. * tree-vect-patterns.c: Likewise. * tree-vect-slp.c: Likewise. * tree-vect-stmts.c: Likewise. * tree-vectorizer.c: Likewise. * target.def (add_stmt_cost): Add vec_info * parameter. * target.h (stmt_in_inner_loop_p): Likewise. * targhooks.c (default_add_stmt_cost): Adjust. * doc/tm.texi: Re-generate. * config/aarch64/aarch64.c (aarch64_extending_load_p): Add vec_info * parameter and adjust. (aarch64_sve_adjust_stmt_cost): Likewise. (aarch64_add_stmt_cost): Likewise. * config/arm/arm.c (arm_add_stmt_cost): Likewise. * config/i386/i386.c (ix86_add_stmt_cost): Likewise. * config/rs6000/rs6000.c (rs6000_add_stmt_cost): Likewise.
2020-03-23tree-optimization/94261 - avoid IL adjustments in SLP analysisRichard Biener
The remaining IL adjustment done by SLP analysis turns out harmful since we share them in the now multiple analyses states. It turns out we do not actually need those apart from the case where we reorg scalar stmts during re-arrangement when optimizing load permutations in SLP reductions. But that isn't needed either now since we only need to permute non-isomorphic parts which now reside in separate SLP nodes who are all leafs. 2020-03-23 Richard Biener <rguenther@suse.de> PR tree-optimization/94261 * tree-vect-slp.c (vect_get_and_check_slp_defs): Remove IL operand swapping code. (vect_slp_rearrange_stmts): Do not arrange isomorphic nodes that would need operation code adjustments.
2020-03-20adjust SLP tree dumpingRichard Biener
This also dumps the root node we eventually smuggle in. 2020-03-20 Richard Biener <rguenther@suse.de> * tree-vect-slp.c (vect_analyze_slp_instance): Dump SLP tree from the possibly modified root.
2020-03-20fix CTOR vectorizationRichard Biener
We failed to handle pattern stmts appropriately. 2020-03-20 Richard Biener <rguenther@suse.de> * tree-vect-slp.c (vect_analyze_slp_instance): Push the stmts to vectorize for CTOR defs.
2020-02-27tree-optimization/93953 - avoid reference into hash-mapRichard Biener
When possibly expanding a hash-map avoid keeping a reference to an entry. 2020-02-27 Richard Biener <rguenther@suse.de> PR tree-optimization/93953 * tree-vect-slp.c (slp_copy_subtree): Avoid keeping a reference to the hash-map entry. * gcc.dg/pr93953.c: New testcase.
2020-02-26dump load permutations and refcount per SLP nodeRichard Biener
This adjusts dumping as proved useful in debugging. 2020-02-26 Richard Biener <rguenther@suse.de> * tree-vect-slp.c (vect_print_slp_tree): Also dump ref count and load permutation.
2020-02-25tree-optimization/93868 copy SLP tree before re-arranging stmtsRichard Biener
This avoids altering possibly shared SLP subtrees when attempting to get rid of permutations in SLP reductions by copying the SLP subtree before re-arranging stmts in it. 2020-02-25 Richard Biener <rguenther@suse.de> PR tree-optimization/93868 * tree-vect-slp.c (slp_copy_subtree): New function. (vect_attempt_slp_rearrange_stmts): Copy the SLP tree before re-arranging stmts in it. * gcc.dg/torture/pr93868.c: New testcase.
2020-01-29tree-optimization/93428 - avoid load permutation vector clobberingRichard Biener
With SLP now being a graph with shared nodes across instances we have to make sure to compute the load permutation of nodes once, not overwriting the result of earlier analysis. 2020-01-28 Richard Biener <rguenther@suse.de> PR tree-optimization/93428 * tree-vect-slp.c (vect_build_slp_tree_2): Compute the load permutation when the load node is created. (vect_analyze_slp_instance): Re-use it here. * gcc.dg/torture/pr93428.c: New testcase.
2020-01-27tree-optimization/93397 delay converted reduction chain adjustmentRichard Biener
The following delays adjusting the SLP graph for converted reduction chains to a point where the SLP build no longer can fail since we otherwise fail to undo marking the conversion as a group. 2020-01-27 Richard Biener <rguenther@suse.de> PR tree-optimization/93397 * tree-vect-slp.c (vect_analyze_slp_instance): Delay converted reduction chain SLP graph adjustment. * gcc.dg/torture/pr93397.c: New testcase.
2020-01-15Fix type mismatch in SLPed constructorsRichard Sandiford
Having the "same" vector types with different modes means that we can end up vectorising a constructor with a different mode from the lhs. This patch adds a VIEW_CONVERT_EXPR in that case. This showed up on existing tests when testing with fixed-length -msve-vector-bits=128. 2020-01-15 Richard Sandiford <richard.sandiford@arm.com> gcc/ * tree-vect-slp.c (vectorize_slp_instance_root_stmt): Use a VIEW_CONVERT_EXPR if the vectorized constructor has a diffeent type from the lhs.
2020-01-14hash-table.h: support non-zero empty values in empty_slow (v2)David Malcolm
gcc/cp/ChangeLog: * cp-gimplify.c (source_location_table_entry_hash::empty_zero_p): New static constant. * cp-tree.h (named_decl_hash::empty_zero_p): Likewise. (struct named_label_hash::empty_zero_p): Likewise. * decl2.c (mangled_decl_hash::empty_zero_p): Likewise. gcc/ChangeLog: * attribs.c (excl_hash_traits::empty_zero_p): New static constant. * gcov.c (function_start_pair_hash::empty_zero_p): Likewise. * graphite.c (struct sese_scev_hash::empty_zero_p): Likewise. * hash-map-tests.c (selftest::test_nonzero_empty_key): New selftest. (selftest::hash_map_tests_c_tests): Call it. * hash-map-traits.h (simple_hashmap_traits::empty_zero_p): New static constant, using the value of = H::empty_zero_p. (unbounded_hashmap_traits::empty_zero_p): Likewise, using the value from default_hash_traits <Value>. * hash-map.h (hash_map::empty_zero_p): Likewise, using the value from Traits. * hash-set-tests.c (value_hash_traits::empty_zero_p): Likewise. * hash-table.h (hash_table::alloc_entries): Guard the loop of calls to mark_empty with !Descriptor::empty_zero_p. (hash_table::empty_slow): Conditionalize the memset call with a check that Descriptor::empty_zero_p; otherwise, loop through the entries calling mark_empty on them. * hash-traits.h (int_hash::empty_zero_p): New static constant. (pointer_hash::empty_zero_p): Likewise. (pair_hash::empty_zero_p): Likewise. * ipa-devirt.c (default_hash_traits <type_pair>::empty_zero_p): Likewise. * ipa-prop.c (ipa_bit_ggc_hash_traits::empty_zero_p): Likewise. (ipa_vr_ggc_hash_traits::empty_zero_p): Likewise. * profile.c (location_triplet_hash::empty_zero_p): Likewise. * sanopt.c (sanopt_tree_triplet_hash::empty_zero_p): Likewise. (sanopt_tree_couple_hash::empty_zero_p): Likewise. * tree-hasher.h (int_tree_hasher::empty_zero_p): Likewise. * tree-ssa-sccvn.c (vn_ssa_aux_hasher::empty_zero_p): Likewise. * tree-vect-slp.c (bst_traits::empty_zero_p): Likewise. * tree-vectorizer.h (default_hash_traits<scalar_cond_masked_key>::empty_zero_p): Likewise.
2020-01-06Require equal shift amounts for IFN_DIV_POW2Richard Sandiford
IFN_DIV_POW2 currently requires all elements to be shifted by the same amount, in a similar way as for WIDEN_LSHIFT_EXPR. This patch enforces that when building the SLP tree. If in future targets want to support IFN_DIV_POW2 without this restriction, we'll probably need the kind of vector-vector/ vector-scalar split that we already have for normal shifts. 2020-01-06 Richard Sandiford <richard.sandiford@arm.com> gcc/ * tree-vect-slp.c (vect_build_slp_tree_1): Require all shifts in an IFN_DIV_POW2 node to be equal. gcc/testsuite/ * gcc.target/aarch64/sve/asrdiv_1.c: Remove trailing %s. * gcc.target/aarch64/sve/asrdiv_2.c: New test. * gcc.target/aarch64/sve/asrdiv_3.c: Likewise. From-SVN: r279908
2020-01-01Update copyright years.Jakub Jelinek
From-SVN: r279813
2019-11-29re PR fortran/91003 (ICE when compiling LAPACK (CGEGV) with optimization)Richard Biener
2019-11-29 Richard Biener <rguenther@suse.de> PR tree-optimization/91003 * tree-vect-slp.c (vect_mask_constant_operand_p): Pass in the operand number, avoid handling the non-condition operands of COND_EXPRs as comparisons. (vect_get_constant_vectors): Pass down the operand number. (vect_get_slp_defs): Likewise. * gfortran.dg/pr91003.f90: New testcase. From-SVN: r278860
2019-11-29Don't defer choice of vector type for bools (PR 92596)Richard Sandiford
Now that stmt_vec_info records the choice between vector mask types and normal nonmask types, we can use that information in vect_get_vector_types_for_stmt instead of deferring the choice of vector type till later. vect_get_mask_type_for_stmt used to check whether the boolean inputs to an operation: (a) consistently used mask types or consistently used nonmask types; and (b) agreed on the number of elements. (b) shouldn't be a problem when (a) is met. If the operation consistently uses mask types, tree-vect-patterns.c will have corrected any mismatches in mask precision. (This is because we only use mask types for a small well-known set of operations and tree-vect-patterns.c knows how to handle any that could have different mask precisions.) And if the operation consistently uses normal nonmask types, there's no reason why booleans should need extra vector compatibility checks compared to ordinary integers. So the potential difficulties all seem to come from (a). Now that we've chosen the result type ahead of time, we also have to consider whether the outputs and inputs consistently use mask types. Taking each vectorizable_* routine in turn: - vectorizable_call vect_get_vector_types_for_stmt only handled booleans specially for gassigns, so vect_get_mask_type_for_stmt never had chance to handle calls. I'm not sure we support any calls that operate on booleans, but as things stand, a boolean result would always have a nonmask type. Presumably any vector argument would also need to use nonmask types, unless it corresponds to internal_fn_mask_index (which is already a special case). For safety, I've added a check for mask/nonmask combinations here even though we didn't check this previously. - vectorizable_simd_clone_call Again, vect_get_mask_type_for_stmt never had chance to handle calls. The result of the call will always be a nonmask type and the patch for PR 92710 rejects mask arguments. So all booleans should consistently use nonmask types here. - vectorizable_conversion The function already rejects any conversion between booleans in which one type isn't a mask type. - vectorizable_operation This function definitely needs a consistency check, e.g. to handle & and | in which one operand is loaded from memory and the other is a comparison result. Ideally we'd handle this via pattern stmts instead (like we do for the all-mask case), but that's future work. - vectorizable_assignment VECT_SCALAR_BOOLEAN_TYPE_P requires single-bit precision, so the current code already rejects problematic cases. - vectorizable_load Loads always produce nonmask types and there are no relevant inputs to check against. - vectorizable_store vect_check_store_rhs already rejects mask/nonmask combinations via useless_type_conversion_p. - vectorizable_reduction - vectorizable_lc_phi PHIs always have nonmask types. After the change above, attempts to combine the PHI result with a mask type would be rejected by vectorizable_operation. (Again, it would be better to handle this using pattern stmts.) - vectorizable_induction We don't generate inductions for booleans. - vectorizable_shift The function already rejects boolean shifts via type_has_mode_precision_p. - vectorizable_condition The function already rejects mismatches via useless_type_conversion_p. - vectorizable_comparison The function already rejects comparisons between mask and nonmask types. The result is always a mask type. 2019-11-29 Richard Sandiford <richard.sandiford@arm.com> gcc/ PR tree-optimization/92596 * tree-vect-stmts.c (vectorizable_call): Punt on hybrid mask/nonmask operations. (vectorizable_operation): Likewise, instead of relying on vect_get_mask_type_for_stmt to do this. (vect_get_vector_types_for_stmt): Always return a vector type immediately, rather than deferring the choice for boolean results. Use a vector mask type instead of a normal vector if vect_use_mask_type_p. (vect_get_mask_type_for_stmt): Delete. * tree-vect-loop.c (vect_determine_vf_for_stmt_1): Remove mask_producers argument and special boolean_type_node handling. (vect_determine_vf_for_stmt): Remove mask_producers argument and update calls to vect_determine_vf_for_stmt_1. Remove doubled call. (vect_determine_vectorization_factor): Update call accordingly. * tree-vect-slp.c (vect_build_slp_tree_1): Remove special boolean_type_node handling. (vect_slp_analyze_node_operations_1): Likewise. gcc/testsuite/ PR tree-optimization/92596 * gcc.dg/vect/bb-slp-pr92596.c: New test. * gcc.dg/vect/bb-slp-43.c: Likewise. From-SVN: r278851
2019-11-29Make vect_get_mask_type_for_stmt take a group sizeRichard Sandiford
This patch makes vect_get_mask_type_for_stmt and get_mask_type_for_scalar_type take a group size instead of the SLP node, so that later patches can call it before an SLP node has been built. 2019-11-29 Richard Sandiford <richard.sandiford@arm.com> gcc/ * tree-vectorizer.h (get_mask_type_for_scalar_type): Replace the slp_tree parameter with a group size parameter. (vect_get_mask_type_for_stmt): Likewise. * tree-vect-stmts.c (get_mask_type_for_scalar_type): Likewise. (vect_get_mask_type_for_stmt): Likewise. * tree-vect-slp.c (vect_slp_analyze_node_operations_1): Update call accordingly. From-SVN: r278849
2019-11-26re PR tree-optimization/92645 (Hand written vector code is 450 times slower ↵Richard Biener
when compiled with GCC compared to Clang) 2019-11-26 Richard Biener <rguenther@suse.de> PR tree-optimization/92645 * tree-vect-slp.c (vect_build_slp_tree_2): For unary ops do not build the operation from scalars if the operand is. * gcc.target/i386/pr92645.c: New testcase. From-SVN: r278719
2019-11-25tree-vect-slp.c (vect_detect_hybrid_slp_stmts): Add assertion.Richard Biener
2019-11-25 Richard Biener <rguenther@suse.de> * tree-vect-slp.c (vect_detect_hybrid_slp_stmts): Add assertion. (vect_detect_hybrid_slp): Swap lane and instance iteration, properly re-building the visited hash-map for each lane. From-SVN: r278679
2019-11-21re PR tree-optimization/92596 (ICE in exact_div, at poly-int.h:2162 since ↵Richard Biener
r278406) 2019-11-21 Richard Biener <rguenther@suse.de> PR tree-optimization/92596 * tree-vect-slp.c (vect_build_slp_tree): Fix pasto. * gcc.dg/torture/pr92596-1.c: New testcase. From-SVN: r278555
2019-11-20Restore stmt def types after scheduling two-operation SLPRichard Sandiford
2019-11-20 Richard Sandiford <richard.sandiford@arm.com> gcc/ * tree-vect-slp.c (vect_schedule_slp_instance): Restore stmt def types for two-operation SLP. From-SVN: r278533
2019-11-20tree-vect-slp.c (vect_analyze_slp_instance): Dump constructors we are ↵Richard Biener
actually analyzing. 2019-11-20 Richard Biener <rguenther@suse.de> * tree-vect-slp.c (vect_analyze_slp_instance): Dump constructors we are actually analyzing. (vect_slp_check_for_constructors): Do not vectorize uniform constuctors, do not dump here. * gcc.dg/vect/bb-slp-42.c: Adjust. * gcc.dg/vect/bb-slp-40.c: Likewise. From-SVN: r278495
2019-11-20re PR tree-optimization/92537 (ICE in vect_slp_analyze_node_operations, at ↵Richard Biener
tree-vect-slp.c:2775) 2019-11-20 Richard Biener <rguenther@suse.de> PR tree-optimization/92537 * tree-vect-slp.c (vect_analyze_slp_instance): Move CTOR vectorization validity check... (vect_slp_analyze_operations): ... here. * gfortran.dg/pr92537.f90: New testcase. From-SVN: r278494
2019-11-18re PR tree-optimization/92516 (ICE in vect_schedule_slp_instance, at ↵Richard Biener
tree-vect-slp.c:4095 since r278246) 2019-11-18 Richard Biener <rguenther@suse.de> PR tree-optimization/92516 * tree-vect-slp.c (vect_analyze_slp_instance): Add bst_map argument, hoist bst_map creation/destruction to ... (vect_analyze_slp): ... here, forming a true graph with SLP instances being the entries. (vect_detect_hybrid_slp_stmts): Remove wrapper. (vect_detect_hybrid_slp): Use one visited set for all graph entries. (vect_slp_analyze_node_operations): Simplify visited/lvisited to hash-sets of slp_tree. (vect_slp_analyze_operations): Likewise. (vect_bb_slp_scalar_cost): Remove wrapper. (vect_bb_vectorization_profitable_p): Use one visited set for all graph entries. (vect_schedule_slp_instance): Elide bst_map use. (vect_schedule_slp): Likewise. * g++.dg/vect/slp-pr92516.cc: New testcase. 2019-11-18 Richard Biener <rguenther@suse.de> * tree-vect-slp.c (vect_analyze_slp_instance): When a CTOR was vectorized with just external refs fail. * gcc.dg/vect/vect-ctor-1.c: New testcase. From-SVN: r278406
2019-11-16Extend can_duplicate_and_interleave_p to mixed-size vectorsRichard Sandiford
This patch makes can_duplicate_and_interleave_p cope with mixtures of vector sizes, by using queries based on get_vectype_for_scalar_type instead of directly querying GET_MODE_SIZE (vinfo->vector_mode). int_mode_for_size is now the first check we do for a candidate mode, so it seemed better to restrict it to MAX_FIXED_MODE_SIZE. This avoids unnecessary work and avoids trying to create scalar types that the target might not support. 2019-11-16 Richard Sandiford <richard.sandiford@arm.com> gcc/ * tree-vectorizer.h (can_duplicate_and_interleave_p): Take an element type rather than an element mode. * tree-vect-slp.c (can_duplicate_and_interleave_p): Likewise. Use get_vectype_for_scalar_type to query the natural types for a given element type rather than basing everything on GET_MODE_SIZE (vinfo->vector_mode). Limit int_mode_for_size query to MAX_FIXED_MODE_SIZE. (duplicate_and_interleave): Update call accordingly. * tree-vect-loop.c (vectorizable_reduction): Likewise. From-SVN: r278335
2019-11-16Apply maximum nunits for BB SLPRichard Sandiford
The BB vectoriser picked vector types in the same way as the loop vectoriser: it picked a vector mode/size for the region and then based all the vector types off that choice. This meant we could end up trying to use vector types that had too many elements for the group size. The main part of this patch is therefore about passing the SLP group size down to routines like get_vectype_for_scalar_type and ensuring that each vector type in the SLP tree is chosen wrt the group size. That part in itself is pretty easy and mechanical. The main warts are: (1) We normally pick a STMT_VINFO_VECTYPE for data references at an early stage (vect_analyze_data_refs). However, nothing in the BB vectoriser relied on this, or on the min_vf calculated from it. I couldn't see anything other than vect_recog_bool_pattern that tried to access the vector type before the SLP tree is built. (2) It's possible for the same statement to be used in groups of different sizes. Taking the group size into account meant that we could try to pick different vector types for the same statement. This problem should go away with the move to doing everything on SLP trees, where presumably we would attach the vector type to the SLP node rather than the stmt_vec_info. Until then, the patch just uses a first-come, first-served approach. (3) A similar problem exists for grouped data references, where different statements in the same dataref group could be used in SLP nodes that have different group sizes. The patch copes with that by making sure that all vector types in a dataref group remain consistent. The patch means that: void f (int *x, short *y) { x[0] += y[0]; x[1] += y[1]; x[2] += y[2]; x[3] += y[3]; } now produces: ldr q0, [x0] ldr d1, [x1] saddw v0.4s, v0.4s, v1.4h str q0, [x0] ret instead of: ldrsh w2, [x1] ldrsh w3, [x1, 2] fmov s0, w2 ldrsh w2, [x1, 4] ldrsh w1, [x1, 6] ins v0.s[1], w3 ldr q1, [x0] ins v0.s[2], w2 ins v0.s[3], w1 add v0.4s, v0.4s, v1.4s str q0, [x0] ret Unfortunately it also means we start to vectorise gcc.target/i386/pr84101.c for -m32. That seems like a target cost issue though; see PR92265 for details. 2019-11-16 Richard Sandiford <richard.sandiford@arm.com> gcc/ * tree-vectorizer.h (vect_get_vector_types_for_stmt): Take an optional maximum nunits. (get_vectype_for_scalar_type): Likewise. Also declare a form that takes an slp_tree. (get_mask_type_for_scalar_type): Take an optional slp_tree. (vect_get_mask_type_for_stmt): Likewise. * tree-vect-data-refs.c (vect_analyze_data_refs): Don't store the vector type in STMT_VINFO_VECTYPE for BB vectorization. * tree-vect-patterns.c (vect_recog_bool_pattern): Use vect_get_vector_types_for_stmt instead of STMT_VINFO_VECTYPE to get an assumed vector type for data references. * tree-vect-slp.c (vect_update_shared_vectype): New function. (vect_update_all_shared_vectypes): Likewise. (vect_build_slp_tree_1): Pass the group size to vect_get_vector_types_for_stmt. Use vect_update_shared_vectype for BB vectorization. (vect_build_slp_tree_2): Call vect_update_all_shared_vectypes before building the vectof from scalars. (vect_analyze_slp_instance): Pass the group size to get_vectype_for_scalar_type. (vect_slp_analyze_node_operations_1): Don't recompute the vector types for BB vectorization here; just handle the case in which we deferred the choice for booleans. (vect_get_constant_vectors): Pass the slp_tree to get_vectype_for_scalar_type. * tree-vect-stmts.c (vect_prologue_cost_for_slp_op): Likewise. (vectorizable_call): Likewise. (vectorizable_simd_clone_call): Likewise. (vectorizable_conversion): Likewise. (vectorizable_shift): Likewise. (vectorizable_operation): Likewise. (vectorizable_comparison): Likewise. (vect_is_simple_cond): Take the slp_tree as argument and pass it to get_vectype_for_scalar_type. (vectorizable_condition): Update call accordingly. (get_vectype_for_scalar_type): Take a group_size argument. For BB vectorization, limit the the vector to that number of elements. Also define an overload that takes an slp_tree. (get_mask_type_for_scalar_type): Add an slp_tree argument and pass it to get_vectype_for_scalar_type. (vect_get_vector_types_for_stmt): Add a group_size argument and pass it to get_vectype_for_scalar_type. Don't use the cached vector type for BB vectorization if a group size is given. Handle data references in that case. (vect_get_mask_type_for_stmt): Take an slp_tree argument and pass it to get_mask_type_for_scalar_type. gcc/testsuite/ * gcc.dg/vect/bb-slp-4.c: Expect the block to be vectorized with -fno-vect-cost-model. * gcc.dg/vect/bb-slp-bool-1.c: New test. * gcc.target/aarch64/vect_mixed_sizes_14.c: Likewise. * gcc.target/i386/pr84101.c: XFAIL for -m32. From-SVN: r278334
2019-11-14Consider building nodes from scalars in vect_slp_analyze_node_operationsRichard Sandiford
If the statements in an SLP node aren't similar enough to be vectorised, or aren't something the vectoriser has code to handle, the BB vectoriser tries building the vector from scalars instead. This patch does the same thing if we're able to build a viable-looking tree but fail later during the analysis phase, e.g. because the target doesn't support a particular vector operation. This is needed to avoid regressions with a later patch. 2019-11-14 Richard Sandiford <richard.sandiford@arm.com> gcc/ * tree-vect-slp.c (vect_contains_pattern_stmt_p): New function. (vect_slp_convert_to_external): Likewise. (vect_slp_analyze_node_operations): If analysis fails, try building the node from scalars instead. gcc/testsuite/ * gcc.dg/vect/bb-slp-div-2.c: New test. From-SVN: r278246
2019-11-14Avoid retrying with the same vector modesRichard Sandiford
A later patch makes the AArch64 port add four entries to autovectorize_vector_modes. Each entry describes a different vector mode assignment for vector code that mixes 8-bit, 16-bit, 32-bit and 64-bit elements. But if (as usual) the vector code has fewer element sizes than that, we could end up trying the same combination of vector modes multiple times. This patch adds a check to prevent that. 2019-11-14 Richard Sandiford <richard.sandiford@arm.com> gcc/ * tree-vectorizer.h (vec_info::mode_set): New typedef. (vec_info::used_vector_mode): New member variable. (vect_chooses_same_modes_p): Declare. * tree-vect-stmts.c (get_vectype_for_scalar_type): Record each chosen vector mode in vec_info::used_vector_mode. (vect_chooses_same_modes_p): New function. * tree-vect-loop.c (vect_analyze_loop): Use it to avoid trying the same vector statements multiple times. * tree-vect-slp.c (vect_slp_bb_region): Likewise. From-SVN: r278242
2019-11-14Support vectorisation with mixed vector sizesRichard Sandiford
After previous patches, it's now possible to make the vectoriser support multiple vector sizes in the same vector region, using related_vector_mode to pick the right vector mode for a given element mode. No port yet takes advantage of this, but I have a follow-on patch for AArch64. This patch also seemed like a good opportunity to add some more dump messages: one to make it clear which vector size/mode was being used when analysis passed or failed, and another to say when we've decided to skip a redundant vector size/mode. 2019-11-14 Richard Sandiford <richard.sandiford@arm.com> gcc/ * machmode.h (opt_machine_mode::operator==): New function. (opt_machine_mode::operator!=): Likewise. * tree-vectorizer.h (vec_info::vector_mode): Update comment. (get_related_vectype_for_scalar_type): Delete. (get_vectype_for_scalar_type_and_size): Declare. * tree-vect-slp.c (vect_slp_bb_region): Print dump messages to say whether analysis passed or failed, and with what vector modes. Use related_vector_mode to check whether trying a particular vector mode would be redundant with the autodetected mode, and print a dump message if we decide to skip it. * tree-vect-loop.c (vect_analyze_loop): Likewise. (vect_create_epilog_for_reduction): Use get_related_vectype_for_scalar_type instead of get_vectype_for_scalar_type_and_size. * tree-vect-stmts.c (get_vectype_for_scalar_type_and_size): Replace with... (get_related_vectype_for_scalar_type): ...this new function. Take a starting/"prevailing" vector mode rather than a vector size. Take an optional nunits argument, with the same meaning as for related_vector_mode. Use related_vector_mode when not auto-detecting a mode, falling back to mode_for_vector if no target mode exists. (get_vectype_for_scalar_type): Update accordingly. (get_same_sized_vectype): Likewise. * tree-vectorizer.c (get_vec_alignment_for_array_type): Likewise. From-SVN: r278240
2019-11-14Replace vec_info::vector_size with vec_info::vector_modeRichard Sandiford
This patch replaces vec_info::vector_size with vec_info::vector_mode, but for now continues to use it as a way of specifying a single vector size. This makes it easier for later patches to use related_vector_mode instead. 2019-11-14 Richard Sandiford <richard.sandiford@arm.com> gcc/ * tree-vectorizer.h (vec_info::vector_size): Replace with... (vec_info::vector_mode): ...this new field. * tree-vect-loop.c (vect_update_vf_for_slp): Update accordingly. (vect_analyze_loop, vect_transform_loop): Likewise. * tree-vect-loop-manip.c (vect_do_peeling): Likewise. * tree-vect-slp.c (can_duplicate_and_interleave_p): Likewise. (vect_make_slp_decision, vect_slp_bb_region): Likewise. * tree-vect-stmts.c (get_vectype_for_scalar_type): Likewise. * tree-vectorizer.c (try_vectorize_loop_1): Likewise. gcc/testsuite/ * gcc.dg/vect/vect-tail-nomask-1.c: Update expected epilogue vectorization message. From-SVN: r278237
2019-11-14Replace autovectorize_vector_sizes with autovectorize_vector_modesRichard Sandiford
This is another patch in the series to remove the assumption that all modes involved in vectorisation have to be the same size. Rather than have the target provide a list of vector sizes, it makes the target provide a list of vector "approaches", with each approach represented by a mode. A later patch will pass this mode to targetm.vectorize.related_mode to get the vector mode for a given element mode. Until then, the modes simply act as an alternative way of specifying the vector size. 2019-11-14 Richard Sandiford <richard.sandiford@arm.com> gcc/ * target.h (vector_sizes, auto_vector_sizes): Delete. (vector_modes, auto_vector_modes): New typedefs. * target.def (autovectorize_vector_sizes): Replace with... (autovectorize_vector_modes): ...this new hook. * doc/tm.texi.in (TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_SIZES): Replace with... (TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_MODES): ...this new hook. * doc/tm.texi: Regenerate. * targhooks.h (default_autovectorize_vector_sizes): Delete. (default_autovectorize_vector_modes): New function. * targhooks.c (default_autovectorize_vector_sizes): Delete. (default_autovectorize_vector_modes): New function. * omp-general.c (omp_max_vf): Use autovectorize_vector_modes instead of autovectorize_vector_sizes. Use the number of units in the mode to calculate the maximum VF. * omp-low.c (omp_clause_aligned_alignment): Use autovectorize_vector_modes instead of autovectorize_vector_sizes. Use a loop based on related_mode to iterate through all supported vector modes for a given scalar mode. * optabs-query.c (can_vec_mask_load_store_p): Use autovectorize_vector_modes instead of autovectorize_vector_sizes. * tree-vect-loop.c (vect_analyze_loop, vect_transform_loop): Likewise. * tree-vect-slp.c (vect_slp_bb_region): Likewise. * config/aarch64/aarch64.c (aarch64_autovectorize_vector_sizes): Replace with... (aarch64_autovectorize_vector_modes): ...this new function. (TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_SIZES): Delete. (TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_MODES): Define. * config/arc/arc.c (arc_autovectorize_vector_sizes): Replace with... (arc_autovectorize_vector_modes): ...this new function. (TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_SIZES): Delete. (TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_MODES): Define. * config/arm/arm.c (arm_autovectorize_vector_sizes): Replace with... (arm_autovectorize_vector_modes): ...this new function. (TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_SIZES): Delete. (TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_MODES): Define. * config/i386/i386.c (ix86_autovectorize_vector_sizes): Replace with... (ix86_autovectorize_vector_modes): ...this new function. (TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_SIZES): Delete. (TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_MODES): Define. * config/mips/mips.c (mips_autovectorize_vector_sizes): Replace with... (mips_autovectorize_vector_modes): ...this new function. (TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_SIZES): Delete. (TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_MODES): Define. From-SVN: r278236
2019-11-14Remove build_{same_sized_,}truth_vector_typeRichard Sandiford
build_same_sized_truth_vector_type was confusingly named, since for SVE and AVX512 the returned vector isn't the same byte size (although it does have the same number of elements). What it really returns is the "truth" vector type for a given data vector type. The more general truth_type_for provides the same thing when passed a vector and IMO has a more descriptive name, so this patch replaces all uses of build_same_sized_truth_vector_type with that. It does the same for a call to build_truth_vector_type, leaving truth_type_for itself as the only remaining caller. It's then more natural to pass build_truth_vector_type the original vector type rather than its size and nunits, especially since the given size isn't the size of the returned vector. This in turn allows a future patch to simplify the interface of get_mask_mode. Doing this also fixes a bug in which truth_type_for would pass a size of zero for BLKmode vector types. 2019-11-14 Richard Sandiford <richard.sandiford@arm.com> gcc/ * tree.h (build_truth_vector_type): Delete. (build_same_sized_truth_vector_type): Likewise. * tree.c (build_truth_vector_type): Rename to... (build_truth_vector_type_for): ...this. Make static and take a vector type as argument. (truth_type_for): Update accordingly. (build_same_sized_truth_vector_type): Delete. * tree-vect-generic.c (expand_vector_divmod): Use truth_type_for instead of build_same_sized_truth_vector_type. * tree-vect-loop.c (vect_create_epilog_for_reduction): Likewise. (vect_record_loop_mask, vect_get_loop_mask): Likewise. * tree-vect-patterns.c (build_mask_conversion): Likeise. * tree-vect-slp.c (vect_get_constant_vectors): Likewise. * tree-vect-stmts.c (vect_get_vec_def_for_operand): Likewise. (vect_build_gather_load_calls, vectorizable_call): Likewise. (scan_store_can_perm_p, vectorizable_scan_store): Likewise. (vectorizable_store, vectorizable_condition): Likewise. (get_mask_type_for_scalar_type, get_same_sized_vectype): Likewise. (vect_get_mask_type_for_stmt): Use truth_type_for instead of build_truth_vector_type. * config/aarch64/aarch64-sve-builtins.cc (gimple_folder::convert_pred): Use truth_type_for instead of build_same_sized_truth_vector_type. * config/rs6000/rs6000-call.c (fold_build_vec_cmp): Likewise. gcc/c/ * c-typeck.c (build_conditional_expr): Use truth_type_for instead of build_same_sized_truth_vector_type. (build_vec_cmp): Likewise. gcc/cp/ * call.c (build_conditional_expr_1): Use truth_type_for instead of build_same_sized_truth_vector_type. * typeck.c (build_vec_cmp): Likewise. gcc/d/ * d-codegen.cc (build_boolop): Use truth_type_for instead of build_same_sized_truth_vector_type. From-SVN: r278232
2019-11-13Don't assign a cost to vectorizable_assignmentRichard Sandiford
vectorizable_assignment handles true SSA-to-SSA copies (which hopefully we don't see in practice) and no-op conversions that are required to maintain correct gimple, such as changes between signed and unsigned types. These cases shouldn't generate any code and so shouldn't count against either the scalar or vector costs. Later patches test this, but it seemed worth splitting out. 2019-11-13 Richard Sandiford <richard.sandiford@arm.com> gcc/ * tree-vectorizer.h (vect_nop_conversion_p): Declare. * tree-vect-stmts.c (vect_nop_conversion_p): New function. (vectorizable_assignment): Don't add a cost for nop conversions. * tree-vect-loop.c (vect_compute_single_scalar_iteration_cost): Likewise. * tree-vect-slp.c (vect_bb_slp_scalar_cost): Likewise. From-SVN: r278122
2019-11-12Remove gcc/params.* files.Martin Liska
2019-11-12 Martin Liska <mliska@suse.cz> * Makefile.in: Remove PARAMS_H and params.list and params.options. * params-enum.h: Remove. * params-list.h: Remove. * params-options.h: Remove. * params.c: Remove. * params.def: Remove. * params.h: Remove. * asan.c: Do not include params.h. * auto-profile.c: Likewise. * bb-reorder.c: Likewise. * builtins.c: Likewise. * cfgcleanup.c: Likewise. * cfgexpand.c: Likewise. * cfgloopanal.c: Likewise. * cgraph.c: Likewise. * combine.c: Likewise. * common/config/aarch64/aarch64-common.c: Likewise. * common/config/gcn/gcn-common.c: Likewise. * common/config/ia64/ia64-common.c: Likewise. * common/config/powerpcspe/powerpcspe-common.c: Likewise. * common/config/rs6000/rs6000-common.c: Likewise. * common/config/sh/sh-common.c: Likewise. * config/aarch64/aarch64.c: Likewise. * config/alpha/alpha.c: Likewise. * config/arm/arm.c: Likewise. * config/avr/avr.c: Likewise. * config/csky/csky.c: Likewise. * config/i386/i386-builtins.c: Likewise. * config/i386/i386-expand.c: Likewise. * config/i386/i386-features.c: Likewise. * config/i386/i386-options.c: Likewise. * config/i386/i386.c: Likewise. * config/ia64/ia64.c: Likewise. * config/rs6000/rs6000-logue.c: Likewise. * config/rs6000/rs6000.c: Likewise. * config/s390/s390.c: Likewise. * config/sparc/sparc.c: Likewise. * config/visium/visium.c: Likewise. * coverage.c: Likewise. * cprop.c: Likewise. * cse.c: Likewise. * cselib.c: Likewise. * dse.c: Likewise. * emit-rtl.c: Likewise. * explow.c: Likewise. * final.c: Likewise. * fold-const.c: Likewise. * gcc.c: Likewise. * gcse.c: Likewise. * ggc-common.c: Likewise. * ggc-page.c: Likewise. * gimple-loop-interchange.cc: Likewise. * gimple-loop-jam.c: Likewise. * gimple-loop-versioning.cc: Likewise. * gimple-ssa-split-paths.c: Likewise. * gimple-ssa-sprintf.c: Likewise. * gimple-ssa-store-merging.c: Likewise. * gimple-ssa-strength-reduction.c: Likewise. * gimple-ssa-warn-alloca.c: Likewise. * gimple-ssa-warn-restrict.c: Likewise. * graphite-isl-ast-to-gimple.c: Likewise. * graphite-optimize-isl.c: Likewise. * graphite-scop-detection.c: Likewise. * graphite-sese-to-poly.c: Likewise. * graphite.c: Likewise. * haifa-sched.c: Likewise. * hsa-gen.c: Likewise. * ifcvt.c: Likewise. * ipa-cp.c: Likewise. * ipa-fnsummary.c: Likewise. * ipa-inline-analysis.c: Likewise. * ipa-inline.c: Likewise. * ipa-polymorphic-call.c: Likewise. * ipa-profile.c: Likewise. * ipa-prop.c: Likewise. * ipa-split.c: Likewise. * ipa-sra.c: Likewise. * ira-build.c: Likewise. * ira-conflicts.c: Likewise. * loop-doloop.c: Likewise. * loop-invariant.c: Likewise. * loop-unroll.c: Likewise. * lra-assigns.c: Likewise. * lra-constraints.c: Likewise. * modulo-sched.c: Likewise. * opt-suggestions.c: Likewise. * opts.c: Likewise. * postreload-gcse.c: Likewise. * predict.c: Likewise. * reload.c: Likewise. * reorg.c: Likewise. * resource.c: Likewise. * sanopt.c: Likewise. * sched-deps.c: Likewise. * sched-ebb.c: Likewise. * sched-rgn.c: Likewise. * sel-sched-ir.c: Likewise. * sel-sched.c: Likewise. * shrink-wrap.c: Likewise. * stmt.c: Likewise. * targhooks.c: Likewise. * toplev.c: Likewise. * tracer.c: Likewise. * trans-mem.c: Likewise. * tree-chrec.c: Likewise. * tree-data-ref.c: Likewise. * tree-if-conv.c: Likewise. * tree-inline.c: Likewise. * tree-loop-distribution.c: Likewise. * tree-parloops.c: Likewise. * tree-predcom.c: Likewise. * tree-profile.c: Likewise. * tree-scalar-evolution.c: Likewise. * tree-sra.c: Likewise. * tree-ssa-ccp.c: Likewise. * tree-ssa-dom.c: Likewise. * tree-ssa-dse.c: Likewise. * tree-ssa-ifcombine.c: Likewise. * tree-ssa-loop-ch.c: Likewise. * tree-ssa-loop-im.c: Likewise. * tree-ssa-loop-ivcanon.c: Likewise. * tree-ssa-loop-ivopts.c: Likewise. * tree-ssa-loop-manip.c: Likewise. * tree-ssa-loop-niter.c: Likewise. * tree-ssa-loop-prefetch.c: Likewise. * tree-ssa-loop-unswitch.c: Likewise. * tree-ssa-math-opts.c: Likewise. * tree-ssa-phiopt.c: Likewise. * tree-ssa-pre.c: Likewise. * tree-ssa-reassoc.c: Likewise. * tree-ssa-sccvn.c: Likewise. * tree-ssa-scopedtables.c: Likewise. * tree-ssa-sink.c: Likewise. * tree-ssa-strlen.c: Likewise. * tree-ssa-structalias.c: Likewise. * tree-ssa-tail-merge.c: Likewise. * tree-ssa-threadbackward.c: Likewise. * tree-ssa-threadedge.c: Likewise. * tree-ssa-uninit.c: Likewise. * tree-switch-conversion.c: Likewise. * tree-vect-data-refs.c: Likewise. * tree-vect-loop.c: Likewise. * tree-vect-slp.c: Likewise. * tree-vrp.c: Likewise. * tree.c: Likewise. * value-prof.c: Likewise. * var-tracking.c: Likewise. 2019-11-12 Martin Liska <mliska@suse.cz> * gimple-parser.c: Do not include params.h. 2019-11-12 Martin Liska <mliska@suse.cz> * name-lookup.c: Do not include params.h. * typeck.c: Likewise. 2019-11-12 Martin Liska <mliska@suse.cz> * lto-common.c: Do not include params.h. * lto-partition.c: Likewise. * lto.c: Likewise. From-SVN: r278086
2019-11-12Apply mechanical replacement (generated patch).Martin Liska
2019-11-12 Martin Liska <mliska@suse.cz> * asan.c (asan_sanitize_stack_p): Replace old parameter syntax with the new one, include opts.h if needed. Use SET_OPTION_IF_UNSET macro. (asan_sanitize_allocas_p): Likewise. (asan_emit_stack_protection): Likewise. (asan_protect_global): Likewise. (instrument_derefs): Likewise. (instrument_builtin_call): Likewise. (asan_expand_mark_ifn): Likewise. * auto-profile.c (auto_profile): Likewise. * bb-reorder.c (copy_bb_p): Likewise. (duplicate_computed_gotos): Likewise. * builtins.c (inline_expand_builtin_string_cmp): Likewise. * cfgcleanup.c (try_crossjump_to_edge): Likewise. (try_crossjump_bb): Likewise. * cfgexpand.c (defer_stack_allocation): Likewise. (stack_protect_classify_type): Likewise. (pass_expand::execute): Likewise. * cfgloopanal.c (expected_loop_iterations_unbounded): Likewise. (estimate_reg_pressure_cost): Likewise. * cgraph.c (cgraph_edge::maybe_hot_p): Likewise. * combine.c (combine_instructions): Likewise. (record_value_for_reg): Likewise. * common/config/aarch64/aarch64-common.c (aarch64_option_validate_param): Likewise. (aarch64_option_default_params): Likewise. * common/config/ia64/ia64-common.c (ia64_option_default_params): Likewise. * common/config/powerpcspe/powerpcspe-common.c (rs6000_option_default_params): Likewise. * common/config/rs6000/rs6000-common.c (rs6000_option_default_params): Likewise. * common/config/sh/sh-common.c (sh_option_default_params): Likewise. * config/aarch64/aarch64.c (aarch64_output_probe_stack_range): Likewise. (aarch64_allocate_and_probe_stack_space): Likewise. (aarch64_expand_epilogue): Likewise. (aarch64_override_options_internal): Likewise. * config/alpha/alpha.c (alpha_option_override): Likewise. * config/arm/arm.c (arm_option_override): Likewise. (arm_valid_target_attribute_p): Likewise. * config/i386/i386-options.c (ix86_option_override_internal): Likewise. * config/i386/i386.c (get_probe_interval): Likewise. (ix86_adjust_stack_and_probe_stack_clash): Likewise. (ix86_max_noce_ifcvt_seq_cost): Likewise. * config/ia64/ia64.c (ia64_adjust_cost): Likewise. * config/rs6000/rs6000-logue.c (get_stack_clash_protection_probe_interval): Likewise. (get_stack_clash_protection_guard_size): Likewise. * config/rs6000/rs6000.c (rs6000_option_override_internal): Likewise. * config/s390/s390.c (allocate_stack_space): Likewise. (s390_emit_prologue): Likewise. (s390_option_override_internal): Likewise. * config/sparc/sparc.c (sparc_option_override): Likewise. * config/visium/visium.c (visium_option_override): Likewise. * coverage.c (get_coverage_counts): Likewise. (coverage_compute_profile_id): Likewise. (coverage_begin_function): Likewise. (coverage_end_function): Likewise. * cse.c (cse_find_path): Likewise. (cse_extended_basic_block): Likewise. (cse_main): Likewise. * cselib.c (cselib_invalidate_mem): Likewise. * dse.c (dse_step1): Likewise. * emit-rtl.c (set_new_first_and_last_insn): Likewise. (get_max_insn_count): Likewise. (make_debug_insn_raw): Likewise. (init_emit): Likewise. * explow.c (compute_stack_clash_protection_loop_data): Likewise. * final.c (compute_alignments): Likewise. * fold-const.c (fold_range_test): Likewise. (fold_truth_andor): Likewise. (tree_single_nonnegative_warnv_p): Likewise. (integer_valued_real_single_p): Likewise. * gcse.c (want_to_gcse_p): Likewise. (prune_insertions_deletions): Likewise. (hoist_code): Likewise. (gcse_or_cprop_is_too_expensive): Likewise. * ggc-common.c: Likewise. * ggc-page.c (ggc_collect): Likewise. * gimple-loop-interchange.cc (MAX_NUM_STMT): Likewise. (MAX_DATAREFS): Likewise. (OUTER_STRIDE_RATIO): Likewise. * gimple-loop-jam.c (tree_loop_unroll_and_jam): Likewise. * gimple-loop-versioning.cc (loop_versioning::max_insns_for_loop): Likewise. * gimple-ssa-split-paths.c (is_feasible_trace): Likewise. * gimple-ssa-store-merging.c (imm_store_chain_info::try_coalesce_bswap): Likewise. (imm_store_chain_info::coalesce_immediate_stores): Likewise. (imm_store_chain_info::output_merged_store): Likewise. (pass_store_merging::process_store): Likewise. * gimple-ssa-strength-reduction.c (find_basis_for_base_expr): Likewise. * graphite-isl-ast-to-gimple.c (class translate_isl_ast_to_gimple): Likewise. (scop_to_isl_ast): Likewise. * graphite-optimize-isl.c (get_schedule_for_node_st): Likewise. (optimize_isl): Likewise. * graphite-scop-detection.c (build_scops): Likewise. * haifa-sched.c (set_modulo_params): Likewise. (rank_for_schedule): Likewise. (model_add_to_worklist): Likewise. (model_promote_insn): Likewise. (model_choose_insn): Likewise. (queue_to_ready): Likewise. (autopref_multipass_dfa_lookahead_guard): Likewise. (schedule_block): Likewise. (sched_init): Likewise. * hsa-gen.c (init_prologue): Likewise. * ifcvt.c (bb_ok_for_noce_convert_multiple_sets): Likewise. (cond_move_process_if_block): Likewise. * ipa-cp.c (ipcp_lattice::add_value): Likewise. (merge_agg_lats_step): Likewise. (devirtualization_time_bonus): Likewise. (hint_time_bonus): Likewise. (incorporate_penalties): Likewise. (good_cloning_opportunity_p): Likewise. (ipcp_propagate_stage): Likewise. * ipa-fnsummary.c (decompose_param_expr): Likewise. (set_switch_stmt_execution_predicate): Likewise. (analyze_function_body): Likewise. (compute_fn_summary): Likewise. * ipa-inline-analysis.c (estimate_growth): Likewise. * ipa-inline.c (caller_growth_limits): Likewise. (inline_insns_single): Likewise. (inline_insns_auto): Likewise. (can_inline_edge_by_limits_p): Likewise. (want_early_inline_function_p): Likewise. (big_speedup_p): Likewise. (want_inline_small_function_p): Likewise. (want_inline_self_recursive_call_p): Likewise. (edge_badness): Likewise. (recursive_inlining): Likewise. (compute_max_insns): Likewise. (early_inliner): Likewise. * ipa-polymorphic-call.c (csftc_abort_walking_p): Likewise. * ipa-profile.c (ipa_profile): Likewise. * ipa-prop.c (determine_known_aggregate_parts): Likewise. (ipa_analyze_node): Likewise. (ipcp_transform_function): Likewise. * ipa-split.c (consider_split): Likewise. * ipa-sra.c (allocate_access): Likewise. (process_scan_results): Likewise. (ipa_sra_summarize_function): Likewise. (pull_accesses_from_callee): Likewise. * ira-build.c (loop_compare_func): Likewise. (mark_loops_for_removal): Likewise. * ira-conflicts.c (build_conflict_bit_table): Likewise. * loop-doloop.c (doloop_optimize): Likewise. * loop-invariant.c (gain_for_invariant): Likewise. (move_loop_invariants): Likewise. * loop-unroll.c (decide_unroll_constant_iterations): Likewise. (decide_unroll_runtime_iterations): Likewise. (decide_unroll_stupid): Likewise. (expand_var_during_unrolling): Likewise. * lra-assigns.c (spill_for): Likewise. * lra-constraints.c (EBB_PROBABILITY_CUTOFF): Likewise. * modulo-sched.c (sms_schedule): Likewise. (DFA_HISTORY): Likewise. * opts.c (default_options_optimization): Likewise. (finish_options): Likewise. (common_handle_option): Likewise. * postreload-gcse.c (eliminate_partially_redundant_load): Likewise. (if): Likewise. * predict.c (get_hot_bb_threshold): Likewise. (maybe_hot_count_p): Likewise. (probably_never_executed): Likewise. (predictable_edge_p): Likewise. (predict_loops): Likewise. (expr_expected_value_1): Likewise. (tree_predict_by_opcode): Likewise. (handle_missing_profiles): Likewise. * reload.c (find_equiv_reg): Likewise. * reorg.c (redundant_insn): Likewise. * resource.c (mark_target_live_regs): Likewise. (incr_ticks_for_insn): Likewise. * sanopt.c (pass_sanopt::execute): Likewise. * sched-deps.c (sched_analyze_1): Likewise. (sched_analyze_2): Likewise. (sched_analyze_insn): Likewise. (deps_analyze_insn): Likewise. * sched-ebb.c (schedule_ebbs): Likewise. * sched-rgn.c (find_single_block_region): Likewise. (too_large): Likewise. (haifa_find_rgns): Likewise. (extend_rgns): Likewise. (new_ready): Likewise. (schedule_region): Likewise. (sched_rgn_init): Likewise. * sel-sched-ir.c (make_region_from_loop): Likewise. * sel-sched-ir.h (MAX_WS): Likewise. * sel-sched.c (process_pipelined_exprs): Likewise. (sel_setup_region_sched_flags): Likewise. * shrink-wrap.c (try_shrink_wrapping): Likewise. * targhooks.c (default_max_noce_ifcvt_seq_cost): Likewise. * toplev.c (print_version): Likewise. (process_options): Likewise. * tracer.c (tail_duplicate): Likewise. * trans-mem.c (tm_log_add): Likewise. * tree-chrec.c (chrec_fold_plus_1): Likewise. * tree-data-ref.c (split_constant_offset): Likewise. (compute_all_dependences): Likewise. * tree-if-conv.c (MAX_PHI_ARG_NUM): Likewise. * tree-inline.c (remap_gimple_stmt): Likewise. * tree-loop-distribution.c (MAX_DATAREFS_NUM): Likewise. * tree-parloops.c (MIN_PER_THREAD): Likewise. (create_parallel_loop): Likewise. * tree-predcom.c (determine_unroll_factor): Likewise. * tree-scalar-evolution.c (instantiate_scev_r): Likewise. * tree-sra.c (analyze_all_variable_accesses): Likewise. * tree-ssa-ccp.c (fold_builtin_alloca_with_align): Likewise. * tree-ssa-dse.c (setup_live_bytes_from_ref): Likewise. (dse_optimize_redundant_stores): Likewise. (dse_classify_store): Likewise. * tree-ssa-ifcombine.c (ifcombine_ifandif): Likewise. * tree-ssa-loop-ch.c (ch_base::copy_headers): Likewise. * tree-ssa-loop-im.c (LIM_EXPENSIVE): Likewise. * tree-ssa-loop-ivcanon.c (try_unroll_loop_completely): Likewise. (try_peel_loop): Likewise. (tree_unroll_loops_completely): Likewise. * tree-ssa-loop-ivopts.c (avg_loop_niter): Likewise. (CONSIDER_ALL_CANDIDATES_BOUND): Likewise. (MAX_CONSIDERED_GROUPS): Likewise. (ALWAYS_PRUNE_CAND_SET_BOUND): Likewise. * tree-ssa-loop-manip.c (can_unroll_loop_p): Likewise. * tree-ssa-loop-niter.c (MAX_ITERATIONS_TO_TRACK): Likewise. * tree-ssa-loop-prefetch.c (PREFETCH_BLOCK): Likewise. (L1_CACHE_SIZE_BYTES): Likewise. (L2_CACHE_SIZE_BYTES): Likewise. (should_issue_prefetch_p): Likewise. (schedule_prefetches): Likewise. (determine_unroll_factor): Likewise. (volume_of_references): Likewise. (add_subscript_strides): Likewise. (self_reuse_distance): Likewise. (mem_ref_count_reasonable_p): Likewise. (insn_to_prefetch_ratio_too_small_p): Likewise. (loop_prefetch_arrays): Likewise. (tree_ssa_prefetch_arrays): Likewise. * tree-ssa-loop-unswitch.c (tree_unswitch_single_loop): Likewise. * tree-ssa-math-opts.c (gimple_expand_builtin_pow): Likewise. (convert_mult_to_fma): Likewise. (math_opts_dom_walker::after_dom_children): Likewise. * tree-ssa-phiopt.c (cond_if_else_store_replacement): Likewise. (hoist_adjacent_loads): Likewise. (gate_hoist_loads): Likewise. * tree-ssa-pre.c (translate_vuse_through_block): Likewise. (compute_partial_antic_aux): Likewise. * tree-ssa-reassoc.c (get_reassociation_width): Likewise. * tree-ssa-sccvn.c (vn_reference_lookup_pieces): Likewise. (vn_reference_lookup): Likewise. (do_rpo_vn): Likewise. * tree-ssa-scopedtables.c (avail_exprs_stack::lookup_avail_expr): Likewise. * tree-ssa-sink.c (select_best_block): Likewise. * tree-ssa-strlen.c (new_stridx): Likewise. (new_addr_stridx): Likewise. (get_range_strlen_dynamic): Likewise. (class ssa_name_limit_t): Likewise. * tree-ssa-structalias.c (push_fields_onto_fieldstack): Likewise. (create_variable_info_for_1): Likewise. (init_alias_vars): Likewise. * tree-ssa-tail-merge.c (find_clusters_1): Likewise. (tail_merge_optimize): Likewise. * tree-ssa-threadbackward.c (thread_jumps::profitable_jump_thread_path): Likewise. (thread_jumps::fsm_find_control_statement_thread_paths): Likewise. (thread_jumps::find_jump_threads_backwards): Likewise. * tree-ssa-threadedge.c (record_temporary_equivalences_from_stmts_at_dest): Likewise. * tree-ssa-uninit.c (compute_control_dep_chain): Likewise. * tree-switch-conversion.c (switch_conversion::check_range): Likewise. (jump_table_cluster::can_be_handled): Likewise. * tree-switch-conversion.h (jump_table_cluster::case_values_threshold): Likewise. (SWITCH_CONVERSION_BRANCH_RATIO): Likewise. (param_switch_conversion_branch_ratio): Likewise. * tree-vect-data-refs.c (vect_mark_for_runtime_alias_test): Likewise. (vect_enhance_data_refs_alignment): Likewise. (vect_prune_runtime_alias_test_list): Likewise. * tree-vect-loop.c (vect_analyze_loop_costing): Likewise. (vect_get_datarefs_in_loop): Likewise. (vect_analyze_loop): Likewise. * tree-vect-slp.c (vect_slp_bb): Likewise. * tree-vectorizer.h: Likewise. * tree-vrp.c (find_switch_asserts): Likewise. (vrp_prop::check_mem_ref): Likewise. * tree.c (wide_int_to_tree_1): Likewise. (cache_integer_cst): Likewise. * var-tracking.c (EXPR_USE_DEPTH): Likewise. (reverse_op): Likewise. (vt_find_locations): Likewise. 2019-11-12 Martin Liska <mliska@suse.cz> * gimple-parser.c (c_parser_parse_gimple_body): Replace old parameter syntax with the new one, include opts.h if needed. Use SET_OPTION_IF_UNSET macro. 2019-11-12 Martin Liska <mliska@suse.cz> * name-lookup.c (namespace_hints::namespace_hints): Replace old parameter syntax with the new one, include opts.h if needed. Use SET_OPTION_IF_UNSET macro. * typeck.c (comptypes): Likewise. 2019-11-12 Martin Liska <mliska@suse.cz> * lto-partition.c (lto_balanced_map): Replace old parameter syntax with the new one, include opts.h if needed. Use SET_OPTION_IF_UNSET macro. * lto.c (do_whole_program_analysis): Likewise. From-SVN: r278085
2019-11-04SLP: Initialize variable to fix bootstrap after r277784.Tamar Christina
This initializes the rstmt variable with NULL and adds an assert to check that a value has been given by one of the if cases before use. This fixes the bootstrap failure due to -Werror. Committed under the gcc obvious rule. gcc/ChangeLog: * tree-vect-slp.c (vectorize_slp_instance_root_stmt): Initialize rstmt. From-SVN: r277788
2019-11-04[SLP] SLP vectorization: vectorize vector constructorsJoel Hutton
gcc/ChangeLog: 2019-11-04 Joel Hutton <Joel.Hutton@arm.com> * expr.c (store_constructor): Modify to handle single element vectors. * tree-vect-slp.c (vect_analyze_slp_instance): Add case for vector constructors. (vect_slp_check_for_constructors): New function. (vect_slp_analyze_bb_1): Call new function to check for vector constructors. (vectorize_slp_instance_root_stmt): New function. (vect_schedule_slp): Call new function to vectorize root stmt of vector constructors. * tree-vectorizer.h (SLP_INSTANCE_ROOT_STMT): New field. gcc/testsuite/ChangeLog: 2019-11-04 Joel Hutton <Joel.Hutton@arm.com> * gcc.dg/vect/bb-slp-40.c: New test. * gcc.dg/vect/bb-slp-41.c: New test. From-SVN: r277784
2019-10-30re PR tree-optimization/65930 (Reduction with sign-change not handled)Richard Biener
2019-10-30 Richard Biener <rguenther@suse.de> PR tree-optimization/65930 * tree-vect-loop.c (vect_is_simple_reduction): For reduction chains also allow a leading and trailing conversion. * tree-vect-slp.c (vect_get_and_check_slp_defs): Handle intermediate reduction chains. (vect_analyze_slp_instance): Likewise. Build a SLP node for a trailing conversion manually. * gcc.dg/vect/pr65930-2.c: New testcase. From-SVN: r277603
2019-10-29re PR tree-optimization/92260 (ICE in exact_div, at poly-int.h:2162)Richard Biener
2019-10-29 Richard Biener <rguenther@suse.de> PR tree-optimization/92260 * tree-vect-slp.c (vect_get_constant_vectors): Special-case lane-reducing ops. * gcc.dg/pr92260.c: New testcase. From-SVN: r277571
2019-10-28re PR tree-optimization/92252 (ICE: Segmentation fault (in ↵Richard Biener
vect_stmt_to_vectorize)) 2019-10-28 Richard Biener <rguenther@suse.de> PR tree-optimization/92252 * tree-vect-slp.c (vect_get_and_check_slp_defs): Adjust STMT_VINFO_REDUC_IDX when swapping operands. * gcc.dg/torture/pr92252.c: New testcase. From-SVN: r277517
2019-10-25re PR tree-optimization/92222 (ice in useless_type_conversion_p, at ↵Richard Biener
gimple-expr.c:86) 2019-10-25 Richard Biener <rguenther@suse.de> PR tree-optimization/92222 * tree-vect-slp.c (_slp_oprnd_info::first_pattern): Remove. (_slp_oprnd_info::second_pattern): Likewise. (_slp_oprnd_info::any_pattern): New. (vect_create_oprnd_info): Adjust. (vect_get_and_check_slp_defs): Compute whether any stmt is in a pattern. (vect_build_slp_tree_2): Avoid building up a node from scalars if any of the operand defs, not just the first, is in a pattern. * gcc.dg/torture/pr92222.c: New testcase. From-SVN: r277448
2019-10-25tree-vect-slp.c (vect_get_and_check_slp_defs): Only fail swapping if we ↵Richard Biener
actually have to modify the IL on a shared stmt. 2019-10-25 Richard Biener <rguenther@suse.de> * tree-vect-slp.c (vect_get_and_check_slp_defs): Only fail swapping if we actually have to modify the IL on a shared stmt. (vect_build_slp_tree_2): Never fail swapping on shared stmts because we no longer modify the IL. From-SVN: r277446
2019-10-24tree-vect-slp.c (vect_get_and_check_slp_defs): For reduction chains try ↵Richard Biener
harder with operand swapping and instead of putting a... 2019-10-24 Richard Biener <rguenther@suse.de> * tree-vect-slp.c (vect_get_and_check_slp_defs): For reduction chains try harder with operand swapping and instead of putting a shifted chain into the reduction operands put a repetition of the final reduction op there as if we'd reassociate the expression. * gcc.dg/vect/slp-reduc-10a.c: New testcase. * gcc.dg/vect/slp-reduc-10b.c: Likewise. * gcc.dg/vect/slp-reduc-10c.c: Likewise. * gcc.dg/vect/slp-reduc-10d.c: Likewise. * gcc.dg/vect/slp-reduc-10e.c: Likewise. From-SVN: r277406
2019-10-24tree-vect-slp.c (vect_analyze_slp): When reduction group SLP discovery fails ↵Richard Biener
try to handle the reduction as part of... 2019-10-24 Richard Biener <rguenther@suse.de> * tree-vect-slp.c (vect_analyze_slp): When reduction group SLP discovery fails try to handle the reduction as part of SLP reduction discovery. * gcc.dg/vect/slp-reduc-9.c: New testcase. From-SVN: r277366
2019-10-23tree-vect-slp.c (vect_build_slp_tree_2): Do not build op from scalars in ↵Richard Biener
case there's a constant operand in its definition. 2019-10-23 Richard Biener <rguenther@suse.de> * tree-vect-slp.c (vect_build_slp_tree_2): Do not build op from scalars in case there's a constant operand in its definition. From-SVN: r277308
2019-10-22re PR tree-optimization/92173 (ICE in optab_for_tree_code, at optabs-tree.c:81)Richard Biener
2019-10-22 Richard Biener <rguenther@suse.de> PR tree-optimization/92173 * tree-vect-loop.c (vectorizable_reduction): If vect_transform_reduction cannot handle code-generation try without the single-def-use-cycle optimization. Pass optab_vector to optab_for_tree_code to get vector shifts as that's what we'd generate. * gcc.dg/torture/pr92173.c: New testcase. From-SVN: r277288
2019-10-22Fix use after free in vector_size changeRichard Sandiford
r277235 was a bit too mechanical and ended up introducing use after free bugs in both loop and SLP vectorisation. 2019-10-22 Richard Sandiford <richard.sandiford@arm.com> gcc/ * tree-vect-slp.c (vect_slp_bb_region): Check whether autodetected_vector_size rather than vector_size is zero. * tree-vect-loop.c (vect_analyze_loop): Likewise. Set autodetected_vector_size immediately after calling vect_analyze_loop_2. Check for a fatal error before advancing next_size. From-SVN: r277282
2019-10-21tree-vectorizer.h (_slp_tree::ops): New member.Richard Biener
2019-10-21 Richard Biener <rguenther@suse.de> * tree-vectorizer.h (_slp_tree::ops): New member. (SLP_TREE_SCALAR_OPS): New. (vect_get_slp_defs): Adjust prototype. * tree-vect-slp.c (vect_free_slp_tree): Release SLP_TREE_SCALAR_OPS. (vect_create_new_slp_node): Initialize it. New overload for initializing by an operands array. (_slp_oprnd_info::ops): New member. (vect_create_oprnd_info): Initialize it. (vect_free_oprnd_info): Release it. (vect_get_and_check_slp_defs): Populate the operands array. Do not swap operands in the IL when not necessary. (vect_build_slp_tree_2): Build SLP nodes for invariant operands. Record SLP_TREE_SCALAR_OPS for all invariant nodes. Also swap operands in the operands array. Do not swap operands in the IL. (vect_slp_rearrange_stmts): Re-arrange SLP_TREE_SCALAR_OPS as well. (vect_gather_slp_loads): Fix. (vect_detect_hybrid_slp_stmts): Likewise. (vect_slp_analyze_node_operations_1): Search for a internal def child for computing reduction SLP_TREE_NUMBER_OF_VEC_STMTS. (vect_slp_analyze_node_operations): Skip ops-only stmts for the def-type push/pop dance. (vect_get_constant_vectors): Compute number_of_vectors here. Use SLP_TREE_SCALAR_OPS and simplify greatly. (vect_get_slp_vect_defs): Use gimple_get_lhs also for PHIs. (vect_get_slp_defs): Simplify greatly. * tree-vect-loop.c (vectorize_fold_left_reduction): Simplify. (vect_transform_reduction): Likewise. * tree-vect-stmts.c (vect_get_vec_defs): Simplify. (vectorizable_call): Likewise. (vectorizable_operation): Likewise. (vectorizable_load): Likewise. (vectorizable_condition): Likewise. (vectorizable_comparison): Likewise. From-SVN: r277241