ampere-computing/gcc-experimental.git

Age	Commit message	Author
2020-05-08	match.pd: A ^ ((A ^ B) & -(C cmp D)) -> (C cmp D) ? B : A simplification [PR94786]	Jakub Jelinek
	We already have x - ((x - y) & -(z < w)) and x + ((y - x) & -(z < w)) simplifications; this one adds x ^ ((x ^ y) & -(z < w)) (not merged using for because of the :c that can be present on bit_xor and can't on minus). 2020-05-08 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/94786 * match.pd (A ^ ((A ^ B) & -(C cmp D)) -> (C cmp D) ? B : A): New simplification. * gcc.dg/tree-ssa/pr94786.c: New test.
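
The equivalence behind this pattern can be checked with a small C sketch. The function names and sample values below are illustrative only and are not taken from the patch or its testcase.

```c
#include <assert.h>

/* Branch-free form matched by the pattern: x ^ ((x ^ y) & -(z < w)).
   -(z < w) is 0 when the comparison is false and -1 (all bits set) when true. */
static int select_xor_mask(int x, int y, int z, int w)
{
    return x ^ ((x ^ y) & -(z < w));
}

/* Form the pattern simplifies to: (z < w) ? y : x. */
static int select_cond(int x, int y, int z, int w)
{
    return (z < w) ? y : x;
}

/* Compare both forms over a handful of sample values. */
static void check_select_xor(void)
{
    static const int v[] = { -7, -1, 0, 1, 42 };
    const int n = (int)(sizeof v / sizeof v[0]);
    for (int a = 0; a < n; a++)
        for (int b = 0; b < n; b++)
            for (int c = 0; c < n; c++)
                for (int d = 0; d < n; d++)
                    assert(select_xor_mask(v[a], v[b], v[c], v[d])
                           == select_cond(v[a], v[b], v[c], v[d]));
}
```

Calling check_select_xor() from a test driver exercises both forms on the same inputs.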
2020-05-08	match.pd: Canonicalize (X + (X >> (prec - 1))) ^ (X >> (prec - 1)) to abs (X) [PR94783]	Jakub Jelinek
	The following patch canonicalizes M = X >> (prec - 1); (X + M) ^ M for signed integral types into ABS_EXPR (X). For X == min it is already UB because M is -1 and min + -1 is UB, so we can use ABS_EXPR rather than say ABSU_EXPR + cast. The backend might then emit the abs code back using the shift and addition and xor if it is the best sequence for the target, but could do something different that is better. 2020-05-08 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/94783 * match.pd ((X + (X >> (prec - 1))) ^ (X >> (prec - 1)) to abs (X)): New simplification. * gcc.dg/tree-ssa/pr94783.c: New test.
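
A minimal C sketch of the idiom being canonicalized follows. It assumes that >> on a negative int is an arithmetic shift, which GCC documents but ISO C leaves implementation-defined. The function names are hypothetical.

```c
#include <assert.h>
#include <limits.h>

/* Shift/add/xor spelling of abs: M = X >> (prec - 1); (X + M) ^ M.
   Assumption: right shift of a negative int replicates the sign bit. */
static int abs_shift(int x)
{
    int m = x >> (sizeof(int) * CHAR_BIT - 1);  /* 0 for x >= 0, -1 for x < 0 */
    return (x + m) ^ m;
}

/* Plain conditional spelling, for comparison. */
static int abs_cond(int x)
{
    return x < 0 ? -x : x;
}

/* INT_MIN is excluded: both spellings overflow there, as the message notes. */
static void check_abs_shift(void)
{
    for (int x = -1000; x <= 1000; x++)
        assert(abs_shift(x) == abs_cond(x));
    assert(abs_shift(INT_MAX) == INT_MAX);
    assert(abs_shift(INT_MIN + 1) == INT_MAX);
}
```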
2020-05-08	match.pd: Optimize ffs of known non-zero arg into ctz + 1 [PR94956]	Jakub Jelinek
	The ffs expanders on several targets (x86, ia64, aarch64 at least) emit a conditional move or similar code to handle the case when the argument is 0, which makes the code longer. If we know from VRP that the argument will not be zero, we can (if the target also has a ctz expander) just use ctz which is undefined at zero and thus the expander doesn't need to deal with that. 2020-05-08 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/94956 * match.pd (FFS): Optimize __builtin_ffs* of non-zero argument into __builtin_ctz* + 1 if direct IFN_CTZ is supported. * gcc.target/i386/pr94956.c: New test.
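
The identity this relies on can be sketched in C with the GCC/Clang builtins __builtin_ffs and __builtin_ctz. The wrapper and check function names are made up for illustration.

```c
#include <assert.h>

/* For a non-zero argument, ffs(x) equals ctz(x) + 1.
   __builtin_ctz is undefined for 0, so callers must guarantee x != 0. */
static int ffs_nonzero(unsigned x)
{
    return __builtin_ctz(x) + 1;
}

/* Compare against __builtin_ffs on non-zero inputs only. */
static void check_ffs_nonzero(void)
{
    for (unsigned x = 1; x <= 65536u; x++)
        assert(__builtin_ffs((int)x) == ffs_nonzero(x));
    for (int k = 0; k < 31; k++)
        assert(__builtin_ffs((int)(1u << k)) == ffs_nonzero(1u << k));
}
```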
2020-05-08	match.pd: Simplify unsigned A - B - 1 >= A to B >= A [PR94913]	Jakub Jelinek
	Implemented thusly. The TYPE_OVERFLOW_WRAPS is there just because the pattern above it has it too, if you want, I can throw it away from both. 2020-05-08 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/94913 * match.pd (A - B + -1 >= A to B >= A): New simplification. (A - B > A to A < B): Don't test TYPE_OVERFLOW_WRAPS which is always true for TYPE_UNSIGNED integral types. * gcc.dg/tree-ssa/pr94913.c: New test.
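
A short C sketch of why the unsigned comparison can be rewritten: with wrapping arithmetic, a - b - 1 lands at or above a exactly when b >= a. The names and sample values are illustrative assumptions.

```c
#include <assert.h>
#include <limits.h>

/* Original form: A - B - 1 >= A, evaluated in wrapping unsigned arithmetic. */
static int cmp_original(unsigned a, unsigned b)
{
    return a - b - 1 >= a;
}

/* Simplified form: B >= A. */
static int cmp_simplified(unsigned a, unsigned b)
{
    return b >= a;
}

/* Compare both forms on boundary and mid-range values. */
static void check_unsigned_sub_cmp(void)
{
    static const unsigned v[] = { 0u, 1u, 2u, 1000u, UINT_MAX / 2,
                                  UINT_MAX - 1, UINT_MAX };
    const int n = (int)(sizeof v / sizeof v[0]);
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            assert(cmp_original(v[i], v[j]) == cmp_simplified(v[i], v[j]));
}
```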
2020-05-06	match.pd: Optimize ~(~X +- Y) into (X -+ Y) [PR94921]	Jakub Jelinek
	According to my verification proglet, this transformation for signed types with undefined overflow neither introduces nor removes any UB cases, so it should be valid even for signed integral types. Not using a for because of the :c on plus which can't be there on minus. 2020-05-06 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/94921 * match.pd (~(~X - Y) -> X + Y, ~(~X + Y) -> X - Y): New simplifications. * gcc.dg/tree-ssa/pr94921.c: New test.
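
The value identities can be checked with a small C sketch. It uses unsigned operands so that wrap-around is well defined; it does not reproduce the author's verification proglet or the signed UB analysis, and all names are hypothetical.

```c
#include <assert.h>
#include <limits.h>

/* ~(~X - Y), which the pattern rewrites to X + Y. */
static unsigned not_sub(unsigned x, unsigned y)
{
    return ~(~x - y);
}

/* ~(~X + Y), which the pattern rewrites to X - Y. */
static unsigned not_add(unsigned x, unsigned y)
{
    return ~(~x + y);
}

/* Since ~x == -x - 1 in modular arithmetic, both identities hold for all inputs. */
static void check_not_plus_minus(void)
{
    static const unsigned v[] = { 0u, 1u, 7u, 1000u, UINT_MAX - 1, UINT_MAX };
    const int n = (int)(sizeof v / sizeof v[0]);
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++) {
            assert(not_sub(v[i], v[j]) == v[i] + v[j]);
            assert(not_add(v[i], v[j]) == v[i] - v[j]);
        }
}
```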
2020-05-05	match.pd: Canonicalize (x + (x << cst)) into (x * cst2) [PR94800]	Jakub Jelinek
	The popcount* testcases show yet another creative way to write popcount, but rather than adjusting the popcount matcher to deal with it, I think we just should canonicalize those (X + (X << C)) to X * (1 + (1 << C)) and (X << C1) + (X << C2) to X * ((1 << C1) + (1 << C2)), because for multiplication we already have simplification rules that can handle nested multiplication (X * CST1 * CST2), while for the shifts and adds we have nothing like that. And the user could have written the multiplication anyway, so if we don't emit the fastest or smallest code for the multiplication by constant, we should improve that. At least on the testcases the emitted code seems reasonable according to cost, except that perhaps we could in some cases try to improve expansion of vector multiplication by uniform constant. 2020-05-05 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/94800 * match.pd (X + (X << C) to X * (1 + (1 << C)), (X << C1) + (X << C2) to X * ((1 << C1) + (1 << C2))): New canonicalizations. * gcc.dg/tree-ssa/pr94800.c: New test. * gcc.dg/tree-ssa/popcount5.c: New test. * gcc.dg/tree-ssa/popcount5l.c: New test. * gcc.dg/tree-ssa/popcount5ll.c: New test.
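
Both canonicalizations can be illustrated with a C sketch on unsigned operands. The function names are assumptions made for this example.

```c
#include <assert.h>
#include <limits.h>

/* X + (X << C) and its canonical form X * (1 + (1 << C)). */
static unsigned shift_add(unsigned x, unsigned c) { return x + (x << c); }
static unsigned mul_one(unsigned x, unsigned c)   { return x * (1u + (1u << c)); }

/* (X << C1) + (X << C2) and its canonical form X * ((1 << C1) + (1 << C2)). */
static unsigned shift_shift(unsigned x, unsigned c1, unsigned c2)
{
    return (x << c1) + (x << c2);
}
static unsigned mul_two(unsigned x, unsigned c1, unsigned c2)
{
    return x * ((1u << c1) + (1u << c2));
}

/* Check every valid shift count for a few operands. */
static void check_shift_to_mul(void)
{
    static const unsigned v[] = { 0u, 1u, 3u, 0x12345678u, UINT_MAX };
    const unsigned prec = sizeof(unsigned) * CHAR_BIT;
    for (int i = 0; i < (int)(sizeof v / sizeof v[0]); i++)
        for (unsigned c1 = 0; c1 < prec; c1++) {
            assert(shift_add(v[i], c1) == mul_one(v[i], c1));
            for (unsigned c2 = 0; c2 < prec; c2++)
                assert(shift_shift(v[i], c1, c2) == mul_two(v[i], c1, c2));
        }
}
```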
2020-05-05	match.pd: Optimize (((type)A * B) >> prec) != 0 into __imag__ .MUL_OVERFLOW [PR94914]	Jakub Jelinek
	On x86 (the only target with umulv4_optab) one can use mull; seto to check for overflow instead of performing wider multiplication and performing comparison on the high bits. 2020-05-05 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/94914 * match.pd ((((type)A * B) >> prec) != 0 to .MUL_OVERFLOW(A, B) != 0): New simplification. * gcc.target/i386/pr94914.c: New test.
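
At the source level, the two ways of testing for unsigned multiplication overflow look roughly like the C sketch below. It uses the GCC/Clang builtin __builtin_mul_overflow as a stand-in for the internal .MUL_OVERFLOW function; the wrapper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Widening form: multiply in 64 bits and test the high 32 bits. */
static int overflow_wide(uint32_t a, uint32_t b)
{
    return (((uint64_t)a * b) >> 32) != 0;
}

/* Overflow-flag form, via the GCC/Clang builtin. */
static int overflow_builtin(uint32_t a, uint32_t b)
{
    uint32_t r;
    return __builtin_mul_overflow(a, b, &r);
}

/* Compare both forms on a few boundary values. */
static void check_mul_overflow(void)
{
    static const uint32_t v[] = { 0u, 1u, 2u, 0xFFFFu, 0x10000u,
                                  0x10001u, 0x80000000u, 0xFFFFFFFFu };
    const int n = (int)(sizeof v / sizeof v[0]);
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            assert(overflow_wide(v[i], v[j]) == overflow_builtin(v[i], v[j]));
}
```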
2020-05-04	match.pd: Optimize (x < 0) != (y < 0) into (x ^ y) < 0 [PR94718]	Jakub Jelinek
	The following patch (on top of the two other PR94718 patches) performs the actual optimization requested in the PR. 2020-05-04 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/94718 * match.pd ((X < 0) != (Y < 0) into (X ^ Y) < 0): New simplification. * gcc.dg/tree-ssa/pr94718-4.c: New test. * gcc.dg/tree-ssa/pr94718-5.c: New test.
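
The optimization rests on the fact that two integers have different signs exactly when the sign bit of their xor is set. A small C sketch, with illustrative names:

```c
#include <assert.h>
#include <limits.h>

/* Original form: compare the two sign tests. */
static int signs_differ_cmp(int x, int y)
{
    return (x < 0) != (y < 0);
}

/* Simplified form: test the sign of the xor. */
static int signs_differ_xor(int x, int y)
{
    return (x ^ y) < 0;
}

/* Compare both forms, including the extreme values. */
static void check_signs_differ(void)
{
    static const int v[] = { INT_MIN, -2, -1, 0, 1, 2, INT_MAX };
    const int n = (int)(sizeof v / sizeof v[0]);
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            assert(signs_differ_cmp(v[i], v[j]) == signs_differ_xor(v[i], v[j]));
}
```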
2020-05-04	match.pd: Decrease number of nop conversions around bitwise ops [PR94718]	Jakub Jelinek
	On the following testcase, there are in .optimized dump 14 nop conversions (from signed to unsigned and back), while this patch decreases that number to just 4; for bitwise ops it really doesn't matter if they are performed in signed or unsigned, so the patch (in GIMPLE only, there are some comments about it being undesirable during GENERIC earlier), if it sees both bitop operands nop converted from the same type, performs the bitop in their non-converted type and converts the result (i.e. 2 conversions into 1); similarly, if a bitop has one operand nop converted from something, the other not, and the result is converted back to the type of the nop converted operand before conversion, it is possible to replace those 2 conversions with just a single conversion of the other operand. 2020-05-04 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/94718 * match.pd (bitop (convert @0) (convert? @1)): For GIMPLE, if we can, replace two nop conversions on bit_{and,ior,xor} argument and result with just one conversion on the result or another argument. * gcc.dg/tree-ssa/pr94718-3.c: New test.
2020-05-04	match.pd: Move (X & C) eqne (Y & C) -> (X ^ Y) & C eqne 0 opt to match.pd [PR94718]	Jakub Jelinek
	This patch moves this optimization from fold-const.c to match.pd where it is actually much shorter to do and lets us optimize even code not seen together in a single expression in the source, as the first step towards fixing the PR. 2020-05-04 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/94718 * fold-const.c (fold_binary_loc): Move (X & C) eqne (Y & C) -> (X ^ Y) & C eqne 0 optimization to ... * match.pd ((X & C) op (Y & C) into (X ^ Y) & C op 0): ... here. * gcc.dg/tree-ssa/pr94718-1.c: New test. * gcc.dg/tree-ssa/pr94718-2.c: New test.
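
In C terms, the moved optimization replaces two masking operations with one. A minimal sketch with hypothetical function names:

```c
#include <assert.h>

/* Original form: mask both operands, then compare. */
static int masked_eq(unsigned x, unsigned y, unsigned c)
{
    return (x & c) == (y & c);
}

/* Simplified form: xor first, mask once, compare against zero. */
static int masked_eq_xor(unsigned x, unsigned y, unsigned c)
{
    return ((x ^ y) & c) == 0;
}

/* Compare both forms over small operands and a few masks. */
static void check_masked_eq(void)
{
    static const unsigned masks[] = { 0u, 1u, 0x0Fu, 0xF0u, 0xFFu, ~0u };
    for (unsigned x = 0; x < 64; x++)
        for (unsigned y = 0; y < 64; y++)
            for (int m = 0; m < (int)(sizeof masks / sizeof masks[0]); m++)
                assert(masked_eq(x, y, masks[m]) == masked_eq_xor(x, y, masks[m]));
}
```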
2020-03-11	fold undefined pointer offsetting	Richard Biener
	This avoids breaking the old broken pointer offsetting via (T)(ptr - ((T)0)->x) which should have used offsetof. Breakage was exposed by the introduction of POINTER_DIFF_EXPR and making PTA not considering that producing a pointer. The mitigation for simple cases is to canonicalize _2 = _1 - 8B; o_9 = (struct obj *) _2; to o_9 = &MEM[_1 + -8B]; eliding one statement and the offending pointer subtraction. 2020-03-11 Richard Biener <rguenther@suse.de> * match.pd ((T *)(ptr - ptr-cst) -> &MEM[ptr + -ptr-cst]): New pattern. * gcc.dg/torture/20200311-1.c: New testcase.
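
For reference, an offsetof-based way to get from a member pointer back to its enclosing object might look like the C sketch below. The struct layout and function names are hypothetical and are not taken from the testcase.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layout, for illustration only. */
struct obj {
    long header;
    int member;
};

/* Recover the enclosing object from a pointer to its member using offsetof,
   instead of subtracting a member address computed from a null pointer. */
static struct obj *obj_from_member(int *m)
{
    return (struct obj *)((char *)m - offsetof(struct obj, member));
}

/* Round-trip check on a local object. */
static void check_obj_from_member(void)
{
    struct obj o = { 1, 2 };
    assert(obj_from_member(&o.member) == &o);
    assert(obj_from_member(&o.member)->header == 1);
}
```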
2020-02-15	match.pd: Disallow side-effects in GENERIC for non-COND_EXPR to COND_EXPR simplifications [PR93744]	Jakub Jelinek
	As the following testcases show (the first one reported, last two found by code inspection), we need to disallow side-effects in simplifications that turn some unconditional expression into conditional one. From my little understanding of genmatch.c, it is able to automatically disallow side effects if the same operand is used multiple times in the match pattern, maybe if it is used multiple times in the replacement pattern, and if it is used in conditional contexts in the match pattern, could it be taught to handle this case too? If yes, perhaps just the first hunk could be usable for 8/9 backports (+ the testcases). 2020-02-15 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/93744 * match.pd (((m1 >/</>=/<= m2) * d -> (m1 >/</>=/<= m2) ? d : 0, A - ((A - B) & -(C cmp D)) -> (C cmp D) ? B : A, A + ((B - A) & -(C cmp D)) -> (C cmp D) ? B : A): For GENERIC, make sure @2 in the first and @1 in the other patterns has no side-effects. * gcc.c-torture/execute/pr93744-1.c: New test. * gcc.c-torture/execute/pr93744-2.c: New test. * gcc.c-torture/execute/pr93744-3.c: New test.
2020-02-04	tree-optimization/93538 - add missing comparison folding case	Richard Biener
	This adds back a folding that worked in GCC 4.5 times by amending the pattern that handles other cases of address vs. SSA name comparisons. 2020-02-04 Richard Biener <rguenther@suse.de> PR tree-optimization/93538 * match.pd (addr EQ/NE ptr): Amend to handle &ptr->x EQ/NE ptr. * gcc.dg/tree-ssa/forwprop-38.c: New testcase.
2020-01-10	PR90838: Support ctz idioms	Wilco Dijkstra
	Support common idioms for count trailing zeroes using an array lookup. The canonical form is array[((x & -x) * C) >> SHIFT] where C is a magic constant which when multiplied by a power of 2 creates a unique value in the top 5 or 6 bits. This is then indexed into a table which maps it to the number of trailing zeroes. When the table is valid, we emit a sequence using the target defined value for ctz (0): int ctz1 (unsigned x) { static const char table[32] = { 0, 1, 28, 2, 29, 14, 24, 3, 30, 22, 20, 15, 25, 17, 4, 8, 31, 27, 13, 23, 21, 19, 16, 7, 26, 12, 18, 6, 11, 5, 10, 9 }; return table[((unsigned)((x & -x) * 0x077CB531U)) >> 27]; } Is optimized to: rbit w0, w0 clz w0, w0 and w0, w0, 31 ret gcc/ PR tree-optimization/90838 * tree-ssa-forwprop.c (check_ctz_array): Add new function. (check_ctz_string): Likewise. (optimize_count_trailing_zeroes): Likewise. (simplify_count_trailing_zeroes): Likewise. (pass_forwprop::execute): Try ctz simplification. * match.pd: Add matching for ctz idioms. testsuite/ PR tree-optimization/90838 * testsuite/gcc.target/aarch64/pr90838.c: New test. From-SVN: r280132
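
The idiom from the message can be written out as a self-contained C sketch and compared against __builtin_ctz (a GCC/Clang builtin). The table and constant are the ones quoted above; the function names are illustrative, and a 32-bit unsigned is assumed.

```c
#include <assert.h>

/* Lookup table quoted in the commit message (de Bruijn sequence 0x077CB531). */
static const char ctz_table[32] = {
    0, 1, 28, 2, 29, 14, 24, 3, 30, 22, 20, 15, 25, 17, 4, 8,
    31, 27, 13, 23, 21, 19, 16, 7, 26, 12, 18, 6, 11, 5, 10, 9
};

/* x & -x isolates the lowest set bit; the multiply moves a unique 5-bit
   pattern into the top bits, which then indexes the table.
   Assumption: unsigned is 32 bits wide. */
static int ctz_debruijn(unsigned x)
{
    return ctz_table[((unsigned)((x & -x) * 0x077CB531U)) >> 27];
}

/* Compare with the builtin on non-zero inputs (the builtin is undefined at 0). */
static void check_ctz_debruijn(void)
{
    for (int k = 0; k < 32; k++)
        assert(ctz_debruijn(1u << k) == k);
    for (unsigned x = 1; x <= 100000u; x++)
        assert(ctz_debruijn(x) == __builtin_ctz(x));
}
```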
2020-01-07	re PR tree-optimization/93118 (>>32<<32 is not always converted into &~0ffffffffull at the tree level)	Jakub Jelinek
	PR tree-optimization/93118 * match.pd ((x >> c) << c -> x & (-1<<c)): Add nop_convert?. Add new simplifier with two intermediate conversions. * gcc.dg/tree-ssa/pr93118.c: New test. From-SVN: r279950
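
A C sketch of the underlying identity on a 64-bit unsigned value follows. The mask is written as ~0 << c on an unsigned type, because left-shifting a negative signed value is undefined in ISO C; the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Shift pair that clears the low c bits. */
static uint64_t clear_low_shift(uint64_t x, unsigned c)
{
    return (x >> c) << c;
}

/* Equivalent single mask. */
static uint64_t clear_low_mask(uint64_t x, unsigned c)
{
    return x & (~UINT64_C(0) << c);
}

/* Check every valid shift count for a few operands. */
static void check_clear_low(void)
{
    static const uint64_t v[] = { 0, 1, UINT64_C(0x123456789ABCDEF0),
                                  UINT64_MAX };
    for (int i = 0; i < (int)(sizeof v / sizeof v[0]); i++)
        for (unsigned c = 0; c < 64; c++)
            assert(clear_low_shift(v[i], c) == clear_low_mask(v[i], c));
}
```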
2020-01-01	Update copyright years.	Jakub Jelinek
	From-SVN: r279813
2020-01-01	re PR tree-optimization/93098 (ICE with negative shifter)	Jakub Jelinek
	PR tree-optimization/93098 * match.pd (popcount): For shift amounts, use integer_onep or wi::to_widest () == cst instead of tree_to_uhwi () == cst tests. Make sure that precision is power of two larger than or equal to 16. Ensure shift is never negative. Use HOST_WIDE_INT_UC macro instead of ULL suffixed constants. Formatting fixes. * gcc.c-torture/compile/pr93098.c: New test. From-SVN: r279809
2019-12-09	re PR tree-optimization/92834 (missed SLP vectorization in LightPixel)	Jakub Jelinek
	PR tree-optimization/92834 * match.pd (A - ((A - B) & -(C cmp D)) -> (C cmp D) ? B : A, A + ((B - A) & -(C cmp D)) -> (C cmp D) ? B : A): New simplifications. * gcc.dg/tree-ssa/pr92834.c: New test. From-SVN: r279113
2019-12-06	match.pd (nop_convert): Remove empty match.	Richard Biener
	2019-12-06 Richard Biener <rguenther@suse.de> * match.pd (nop_convert): Remove empty match. Use nop_convert? everywhere. From-SVN: r279040
2019-12-06	re PR tree-optimization/92819 (Worse code generated on avx2 due to simplify_vector_constructor)	Richard Biener
	2019-12-06 Richard Biener <rguenther@suse.de> PR tree-optimization/92819 * match.pd (VEC_PERM_EXPR -> BIT_INSERT_EXPR): Handle inserts into the last lane. For two-element vectors try inserting into the last lane when inserting into the first fails. * gcc.target/i386/pr92819-1.c: New testcase. * gcc.target/i386/pr92803.c: Adjust. From-SVN: r279033
2019-12-05	re PR tree-optimization/92818 (Typo in vec_perm -> bit_insert pattern)	Richard Biener
	2019-12-05 Richard Biener <rguenther@suse.de> PR middle-end/92818 * tree-ssa-forwprop.c (simplify_vector_constructor): Improve heuristics on what don't care element to choose. * match.pd (VEC_PERM_EXPR -> BIT_INSERT_EXPR): Fix typo. * gcc.target/i386/pr92818.c: New testcase. From-SVN: r278998
2019-12-04	re PR tree-optimization/92734 (Missing match.pd simplification done by fold_binary_loc on generic)	Jakub Jelinek
	PR tree-optimization/92734 * match.pd ((A +- B) - A -> +- B, (A +- B) -+ B -> A, A - (A +- B) -> -+ B, A +- (B -+ A) -> +- B): Handle nop_convert. * gcc.dg/tree-ssa/pr92734-2.c: New test. From-SVN: r278958
2019-12-03	re PR tree-optimization/92734 (Missing match.pd simplification done by fold_binary_loc on generic)	Jakub Jelinek
	PR tree-optimization/92734 * match.pd ((CST1 - A) +- CST2 -> CST3 - A, CST1 - (CST2 - A) -> CST3 + A): Handle nop casts around inner subtraction. * gcc.dg/tree-ssa/pr92734.c: New test. From-SVN: r278925
2019-12-02	re PR tree-optimization/92712 (Performance regression with assumed values)	Jakub Jelinek
	PR tree-optimization/92712 * match.pd ((A * B) +- A -> (B +- 1) * A, A +- (A * B) -> (1 +- B) * A): Allow optimizing signed integers even when we don't know anything about range of A, but do know something about range of B and the simplification won't introduce new UB. * gcc.dg/tree-ssa/pr92712-1.c: New test. * gcc.dg/tree-ssa/pr92712-2.c: New test. * gcc.dg/tree-ssa/pr92712-3.c: New test. * gfortran.dg/loop_versioning_1.f90: Adjust expected number of likely to be innermost dimension messages. * gfortran.dg/loop_versioning_10.f90: Likewise. * gfortran.dg/loop_versioning_6.f90: Likewise. From-SVN: r278894
2019-11-05	re PR target/92280 (gcc.target/i386/pr83008.c FAILs)	Richard Biener
	2019-11-05 Richard Biener <rguenther@suse.de> PR tree-optimization/92280 * match.pd (BIT_FIELD_REF of CTOR): Unless the original CTOR had a single use do not create a new CTOR. * tree-ssa-forwprop.c (simplify_bitfield_ref): Do not re-fold BIT_FIELD_REF of a CTOR via GENERIC. From-SVN: r277832
2019-10-08	re PR tree-optimization/90836 (Missing popcount pattern matching)	Dmitrij Pochepko
	2019-10-08 Dmitrij Pochepko <dmitrij.pochepko@bell-sw.com> PR tree-optimization/90836 * gcc/match.pd (popcount): New pattern. From-SVN: r276721
2019-10-05	re PR tree-optimization/91734 (gcc skip an if statement with "-O1 -ffast-math")	Jakub Jelinek
	PR tree-optimization/91734 * generic-match-head.c: Include fold-const-call.h. * match.pd (sqrt(x) cmp c): Check the boundary value and in case inexact computation of c*c affects comparison of the boundary, turn LT_EXPR into LE_EXPR, GE_EXPR into GT_EXPR, LE_EXPR into LT_EXPR or GT_EXPR into GE_EXPR. Punt for sqrt comparisons against NaN and for -frounding-math. For c2, try the next smaller or larger floating point constant depending on comparison code and if it has the same sqrt as c2, use it instead of c2. * gcc.dg/pr91734.c: New test. From-SVN: r276621
2019-10-04	match.pd (sinh (x) / cosh (x)): New simplification rule.	Rafael Tsuha
	* match.pd (sinh (x) / cosh (x)): New simplification rule. * gcc.dg/sinhovercosh-1.c: New test. From-SVN: r276595
2019-09-24	re PR middle-end/91866 (Sign extend of an int is not recognized)	Jakub Jelinek
	PR middle-end/91866 * match.pd (((T)(A)) + CST -> (T)(A + CST)): Formatting fix. (((T)(A + CST1)) + CST2 -> (T)(A) + (T)CST1 + CST2): New optimization. * gcc.dg/tree-ssa/pr91866.c: New test. From-SVN: r276096
2019-09-16	Rewrite second part of or_comparisons_1 into match.pd.	Martin Liska
	2019-09-16 Martin Liska <mliska@suse.cz> * gimple-fold.c (or_comparisons_1): Remove rules moved to ... * match.pd: ... here. From-SVN: r275752
2019-09-16	Rewrite first part of or_comparisons_1 into match.pd.	Martin Liska
	2019-09-16 Martin Liska <mliska@suse.cz> * gimple-fold.c (or_comparisons_1): Remove rules moved to ... * match.pd: ... here. From-SVN: r275751
2019-09-16	Rewrite part of and_comparisons_1 into match.pd.	Martin Liska
	2019-09-16 Martin Liska <mliska@suse.cz> * genmatch.c (dt_node::append_simplify): Do not print warning when we have duplicate patterns belonging to the same simplify rule. * gimple-fold.c (and_comparisons_1): Remove matching moved to match.pd. (maybe_fold_comparisons_from_match_pd): Handle tcc_comparison as a result. * match.pd: Handle (X == CST1) && (X OP2 CST2) conditions. From-SVN: r275750
2019-09-16	Fix PR88784, middle end is missing some optimizations about unsigned	Li Jia He
	2019-09-16 Li Jia He <helijia@linux.ibm.com> Qi Feng <ffengqi@linux.ibm.com> PR middle-end/88784 * match.pd (x > y && x != XXX_MIN): Optimize into 'x > y'. (x > y && x == XXX_MIN): Optimize into 'false'. (x <= y && x == XXX_MIN): Optimize into 'x == XXX_MIN'. (x < y && x != XXX_MAX): Optimize into 'x < y'. (x < y && x == XXX_MAX): Optimize into 'false'. (x >= y && x == XXX_MAX): Optimize into 'x == XXX_MAX'. (x > y || x != XXX_MIN): Optimize into 'x != XXX_MIN'. (x <= y || x != XXX_MIN): Optimize into 'true'. (x <= y || x == XXX_MIN): Optimize into 'x <= y'. (x < y || x != XXX_MAX): Optimize into 'x != XXX_MAX'. (x >= y || x != XXX_MAX): Optimize into 'true'. (x >= y || x == XXX_MAX): Optimize into 'x >= y'. 2019-09-16 Li Jia He <helijia@linux.ibm.com> Qi Feng <ffengqi@linux.ibm.com> PR middle-end/88784 * gcc.dg/pr88784-1.c: New testcase. * gcc.dg/pr88784-2.c: New testcase. * gcc.dg/pr88784-3.c: New testcase. * gcc.dg/pr88784-4.c: New testcase. * gcc.dg/pr88784-5.c: New testcase. * gcc.dg/pr88784-6.c: New testcase. * gcc.dg/pr88784-7.c: New testcase. * gcc.dg/pr88784-8.c: New testcase. * gcc.dg/pr88784-9.c: New testcase. * gcc.dg/pr88784-10.c: New testcase. * gcc.dg/pr88784-11.c: New testcase. * gcc.dg/pr88784-12.c: New testcase. Co-Authored-By: Qi Feng <ffengqi@linux.ibm.com> From-SVN: r275749
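
Several of these rewrites can be checked exhaustively in C for an 8-bit unsigned range, where XXX_MIN is 0 and XXX_MAX is 255. The sketch below is illustrative; the function name is made up and only a subset of the listed rules is covered.

```c
#include <assert.h>

/* x and y range over 0..255, standing in for an 8-bit unsigned type. */
static void check_minmax_comparisons(void)
{
    for (unsigned x = 0; x <= 255; x++)
        for (unsigned y = 0; y <= 255; y++) {
            assert((x > y && x != 0) == (x > y));        /* -> x > y        */
            assert((x > y && x == 0) == 0);              /* -> false        */
            assert((x <= y && x == 0) == (x == 0));      /* -> x == XXX_MIN */
            assert((x < y && x != 255) == (x < y));      /* -> x < y        */
            assert((x <= y || x != 0) == 1);             /* -> true         */
            assert((x >= y || x == 255) == (x >= y));    /* -> x >= y       */
        }
}
```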
2019-09-11	re PR middle-end/91725 (ICE in get_nonzero_bits starting with r275587)	Jakub Jelinek
	PR middle-end/91725 * match.pd ((A / (1 << B)) -> (A >> B)): Call tree_nonzero_bits instead of get_nonzero_bits, only call it for integral types. * gcc.c-torture/compile/pr91725.c: New test. From-SVN: r275633
2019-09-11	revert: match.pd: Add flag_unsafe_math_optimizations check before deciding on the widest type in...	Richard Biener
	2019-09-11 Richard Biener <rguenther@suse.de> Revert 2019-09-09 Barnaby Wilks <barnaby.wilks@arm.com> * match.pd: Add flag_unsafe_math_optimizations check before deciding on the widest type in a binary math operation. * gcc.dg/fold-binary-math-casts.c: New test. From-SVN: r275632
2019-09-10	re PR middle-end/91680 (Integer promotion quirk prevents efficient power of 2 division)	Jakub Jelinek
	PR middle-end/91680 * match.pd ((A / (1 << B)) -> (A >> B)): Allow widening cast from the shift type to type. * gcc.dg/tree-ssa/pr91680.c: New test. * g++.dg/torture/pr91680.C: New test. From-SVN: r275587
2019-09-09	match.pd: Add flag_unsafe_math_optimizations check before deciding on the widest type in...	Barnaby Wilks
	2019-09-09 Barnaby Wilks <barnaby.wilks@arm.com> * match.pd: Add flag_unsafe_math_optimizations check before deciding on the widest type in a binary math operation. * gcc.dg/fold-binary-math-casts.c: New test. From-SVN: r275518
2019-09-03	re PR tree-optimization/91504 (Inlining misses some logical operation folding)	Kamlesh Kumar
	PR tree-optimization/91504 * match.pd: Add ((~a & b) ^a) --> (a | b). PR tree-optimization/91504 * gcc.dg/tree-ssa/pr91504.c: New test. From-SVN: r275354
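
The bitwise identity can be confirmed with a short C sketch; the function names are illustrative.

```c
#include <assert.h>

/* Original form: (~a & b) ^ a. */
static unsigned or_via_xor(unsigned a, unsigned b)
{
    return (~a & b) ^ a;
}

/* ~a & b keeps only the bits of b that a lacks, so xor with a acts as or.
   Exhaustive over all 8-bit pairs; the identity is bitwise, so wider
   operands behave the same way. */
static void check_or_via_xor(void)
{
    for (unsigned a = 0; a < 256; a++)
        for (unsigned b = 0; b < 256; b++)
            assert(or_via_xor(a, b) == (a | b));
}
```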
2019-09-02	re PR go/91617 (Many go test case failures after r275026)	Jakub Jelinek
	PR go/91617 * fold-const.c (range_check_type): For enumeral and boolean type, pass 1 to type_for_size langhook instead of TYPE_UNSIGNED (etype). Return unsigned_type_for result whenever etype isn't TYPE_UNSIGNED INTEGER_TYPE. (build_range_check): Don't call unsigned_type_for for pointer types. * match.pd (X / C1 op C2): Don't call unsigned_type_for on range_check_type result. From-SVN: r275299
2019-08-26	[PATCH 2/2] Add simplify rule for wrapped addition.	Robin Dapp
	Add the transform (T)(A) + CST -> (T)(A + CST). This enables vrp to simplify sequences like _2 = a_7 - 1; _3 = (long unsigned int) _2; _5 = _3 + 1 that ivopts creates. -- gcc/ChangeLog: 2019-08-26 Robin Dapp <rdapp@linux.ibm.com> * match.pd: Add (T)(A) + CST -> (T)(A + CST). gcc/testsuite/ChangeLog: 2019-08-26 Robin Dapp <rdapp@linux.ibm.com> * gcc.dg/tree-ssa/copy-headers-5.c: Do not run vrp pass. * gcc.dg/tree-ssa/copy-headers-7.c: Do not run vrp pass. * gcc.dg/tree-ssa/loop-15.c: Remove XFAIL. * gcc.dg/tree-ssa/pr23744.c: Change search pattern. * gcc.dg/wrapped-binop-simplify.c: New test. From-SVN: r274925
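
The sequence quoted above can be mimicked in C to see when moving the addition inside the cast is safe. The sketch assumes a 32-bit narrow type and a 64-bit wide type, and the function names are hypothetical. The two forms agree only when the inner a - 1 does not wrap, which is why range information about a matters.

```c
#include <assert.h>
#include <stdint.h>

/* (T)(a - 1) + 1: the add happens after widening. */
static uint64_t widen_then_add(uint32_t a)
{
    return (uint64_t)(a - 1) + 1;
}

/* (T)((a - 1) + 1): the add happens in the narrow type, which folds to (T)a. */
static uint64_t add_then_widen(uint32_t a)
{
    return (uint64_t)((a - 1) + 1);
}

/* The forms agree whenever a >= 1, i.e. a - 1 does not wrap. */
static void check_wrapped_add(void)
{
    for (uint32_t a = 1; a <= 100000u; a++)
        assert(widen_then_add(a) == add_then_widen(a));
    assert(widen_then_add(UINT32_MAX) == add_then_widen(UINT32_MAX));
    /* For a == 0 the inner subtraction wraps and the results differ. */
    assert(widen_then_add(0) != add_then_widen(0));
}
```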
2019-08-15	Add support for conditional shifts	Richard Sandiford
	This patch adds support for IFN_COND shifts left and shifts right. This is mostly mechanical, but since we try to handle conditional operations in the same way as unconditional operations in match.pd, we need to support IFN_COND shifts by scalars as well as vectors. E.g.: IFN_COND_SHL (cond, a, { 1, 1, ... }, fallback) and: IFN_COND_SHL (cond, a, 1, fallback) are the same operation, with: (for shiftrotate (lrotate rrotate lshift rshift) ... /* Prefer vector1 << scalar to vector1 << vector2 if vector2 is uniform. */ (for vec (VECTOR_CST CONSTRUCTOR) (simplify (shiftrotate @0 vec@1) (with { tree tem = uniform_vector_p (@1); } (if (tem) (shiftrotate @0 { tem; })))))) preferring the latter. The patch copes with this by extending create_convert_operand_from to handle scalar-to-vector conversions. 2019-08-15 Richard Sandiford <richard.sandiford@arm.com> Prathamesh Kulkarni <prathamesh.kulkarni@linaro.org> gcc/ * internal-fn.def (IFN_COND_SHL, IFN_COND_SHR): New internal functions. * internal-fn.c (FOR_EACH_CODE_MAPPING): Handle shifts. * match.pd (UNCOND_BINARY, COND_BINARY): Likewise. * optabs.def (cond_ashl_optab, cond_ashr_optab, cond_lshr_optab): New optabs. * optabs.h (create_convert_operand_from): Expand comment. * optabs.c (maybe_legitimize_operand): Allow implicit broadcasts when mapping scalar rtxes to vector operands. * config/aarch64/iterators.md (SVE_INT_BINARY): Add ashift, ashiftrt and lshiftrt. (sve_int_op, sve_int_op_rev, sve_pred_int_rhs2_operand): Handle them. * config/aarch64/aarch64-sve.md (cond_<optab><mode>_2_const) (cond_<optab><mode>_any_const): New patterns. gcc/testsuite/ * gcc.target/aarch64/sve/cond_shift_1.c: New test. * gcc.target/aarch64/sve/cond_shift_1_run.c: Likewise. * gcc.target/aarch64/sve/cond_shift_2.c: Likewise. * gcc.target/aarch64/sve/cond_shift_2_run.c: Likewise. * gcc.target/aarch64/sve/cond_shift_3.c: Likewise. * gcc.target/aarch64/sve/cond_shift_3_run.c: Likewise. * gcc.target/aarch64/sve/cond_shift_4.c: Likewise. * gcc.target/aarch64/sve/cond_shift_4_run.c: Likewise. * gcc.target/aarch64/sve/cond_shift_5.c: Likewise. * gcc.target/aarch64/sve/cond_shift_5_run.c: Likewise. * gcc.target/aarch64/sve/cond_shift_6.c: Likewise. * gcc.target/aarch64/sve/cond_shift_6_run.c: Likewise. * gcc.target/aarch64/sve/cond_shift_7.c: Likewise. * gcc.target/aarch64/sve/cond_shift_7_run.c: Likewise. * gcc.target/aarch64/sve/cond_shift_8.c: Likewise. * gcc.target/aarch64/sve/cond_shift_8_run.c: Likewise. * gcc.target/aarch64/sve/cond_shift_9.c: Likewise. * gcc.target/aarch64/sve/cond_shift_9_run.c: Likewise. Co-Authored-By: Prathamesh Kulkarni <prathamesh.kulkarni@linaro.org> From-SVN: r274505
2019-07-26	Add rules to strip away unneeded type casts in expressions	Tamar Christina
	This patch moves part of the type conversion code from convert.c to match.pd because match.pd is able to apply these transformations in the presence of intermediate temporary variables. Concretely it makes both these cases behave the same float e = (float)a * (float)b; c = (_Float16)e; and c = (_Float16)((float)a * (float)b); gcc/ChangeLog: * convert.c (convert_to_real_1): Move part of conversion code... * match.pd: ...To here. gcc/testsuite/ChangeLog: * gcc.dg/type-convert-var.c: New test. From-SVN: r273826
2019-07-24	re PR middle-end/91166 ([SVE] Unfolded ZIPs of constants)	Prathamesh Kulkarni
	2019-07-24 Prathamesh Kulkarni <prathamesh.kulkarni@linaro.org> PR middle-end/91166 * match.pd (vec_perm_expr(v, v, mask) -> v): New pattern. (define_predicates): Add entry for uniform_vector_p. (vec_same_elem_p): New match pattern. testsuite/ * gcc.target/aarch64/sve/pr91166.c: New test. From-SVN: r273758
2019-07-03	re PR tree-optimization/91069 (Miscompare of 453.povray since r272843)	Richard Biener
	2019-07-03 Richard Biener <rguenther@suse.de> PR middle-end/91069 * match.pd (vec_perm -> bit_insert): Fix element read from first vector. * gcc.dg/pr91069.c: New testcase. From-SVN: r273007
2019-06-11	Allow conversions in X/[ex]4 < Y/[ex]4	Marc Glisse
	2019-06-11 Marc Glisse <marc.glisse@inria.fr> gcc/ * match.pd (X/[ex]4<Y/[ex]4): Handle conversions. gcc/testsuite/ * gcc.dg/tree-ssa/cmpexactdiv-5.c: New file. From-SVN: r272158
2019-06-06	Simplify mult where both arguments are 0 or 1 (PR tree-optimization/87954).	Martin Liska
	2019-06-06 Martin Liska <mliska@suse.cz> PR tree-optimization/87954 * match.pd: Simplify mult where both arguments are 0 or 1. 2019-06-06 Martin Liska <mliska@suse.cz> PR tree-optimization/87954 * gcc.dg/pr87954.c: New test. From-SVN: r271991
2019-05-31	apply unary op to both sides of (vec_cond x cst1 cst2)	Marc Glisse
	2019-05-31 Marc Glisse <marc.glisse@inria.fr> gcc/ * match.pd (~(vec?cst1:cst2)): New transformation. gcc/testsuite/ * g++.dg/tree-ssa/cprop-vcond.C: New file. From-SVN: r271817
2019-05-31	Simplify more EXACT_DIV_EXPR comparisons	Marc Glisse
	2019-05-31 Marc Glisse <marc.glisse@inria.fr> gcc/ * match.pd (X/[ex]D<Y/[ex]D): Handle negative denominator. ((size_t)(A /[ex] B) CMP C): New transformation. gcc/testsuite/ * gcc.dg/tree-ssa/cmpexactdiv-3.c: New file. * gcc.dg/tree-ssa/cmpexactdiv-4.c: New file. * gcc.dg/Walloca-13.c: Xfail. From-SVN: r271816
2019-05-27	re PR tree-optimization/90610 (526.blender_r miscompared on znver1 with -Ofast -march=native since r271463)	Richard Biener
	2019-05-27 Richard Biener <rguenther@suse.de> PR middle-end/90610 * match.pd (vec_perm): Avoid clobbering op0 when not generating a bit-insert. From-SVN: r271652
2019-05-21	re PR tree-optimization/90510 (Unnecessary permutation)	Richard Biener
	2019-05-21 Richard Biener <rguenther@suse.de> PR middle-end/90510 * fold-const.c (fold_read_from_vector): New function. * fold-const.h (fold_read_from_vector): Declare. * match.pd (VEC_PERM_EXPR): Build BIT_INSERT_EXPRs for single-element insert permutations. Canonicalize selector further and fix issue with last commit. * gcc.target/i386/pr90510.c: New testcase. From-SVN: r271463