path: root/gcc/match.pd
Age  Commit message  Author
2020-05-08  match.pd: A ^ ((A ^ B) & -(C cmp D)) -> (C cmp D) ? B : A simplification [PR94786]  Jakub Jelinek
We already have x - ((x - y) & -(z < w)) and x + ((y - x) & -(z < w)) simplifications, this one adds x ^ ((x ^ y) & -(z < w)) (not merged using for because of the :c that can be present on bit_xor and can't on minus). 2020-05-08 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/94786 * match.pd (A ^ ((A ^ B) & -(C cmp D)) -> (C cmp D) ? B : A): New simplification. * gcc.dg/tree-ssa/pr94786.c: New test.
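As a rough illustration of the xor form described above, the following C sketch places the masked expression next to the conditional it is equivalent to. The function names are hypothetical, and unsigned operands are used so that the arithmetic is fully defined.

```c
#include <stdint.h>

/* Branch-free select: yields y when z < w holds, otherwise x. */
uint32_t select_xor(uint32_t x, uint32_t y, int z, int w)
{
    /* -(z < w) is all-ones when the comparison is true, zero otherwise. */
    uint32_t mask = -(uint32_t)(z < w);
    return x ^ ((x ^ y) & mask);
}

/* The conditional form that the simplification produces. */
uint32_t select_cond(uint32_t x, uint32_t y, int z, int w)
{
    return (z < w) ? y : x;
}
```

Both functions should return the same value for every input; the conditional form leaves the choice of a branch or a conditional move to the target.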
2020-05-08  match.pd: Canonicalize (X + (X >> (prec - 1))) ^ (X >> (prec - 1)) to abs (X) [PR94783]  Jakub Jelinek
The following patch canonicalizes M = X >> (prec - 1); (X + M) ^ M for signed integral types into ABS_EXPR (X). For X == min it is already UB because M is -1 and min + -1 is UB, so we can use ABS_EXPR rather than say ABSU_EXPR + cast. The backend might then emit the abs code back using the shift and addition and xor if it is the best sequence for the target, but could do something different that is better. 2020-05-08 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/94783 * match.pd ((X + (X >> (prec - 1))) ^ (X >> (prec - 1)) to abs (X)): New simplification. * gcc.dg/tree-ssa/pr94783.c: New test.
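A minimal sketch of the source-level idiom this canonicalization targets. It assumes that right-shifting a negative int is an arithmetic shift, which GCC provides but ISO C leaves implementation-defined; the function names are illustrative only.

```c
#include <limits.h>

/* Shift-based absolute value: m is 0 for non-negative x and -1 otherwise
   (assuming an arithmetic right shift on negative values). */
int abs_shift(int x)
{
    int m = x >> (sizeof(int) * CHAR_BIT - 1);
    return (x + m) ^ m;
}

/* What the idiom computes, written directly. */
int abs_plain(int x)
{
    return x < 0 ? -x : x;
}
```

As the commit message notes, x == INT_MIN is undefined in both forms, so it is deliberately not exercised here.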
2020-05-08  match.pd: Optimize ffs of known non-zero arg into ctz + 1 [PR94956]  Jakub Jelinek
The ffs expanders on several targets (x86, ia64, aarch64 at least) emit a conditional move or similar code to handle the case when the argument is 0, which makes the code longer. If we know from VRP that the argument will not be zero, we can (if the target also has a ctz expander) just use ctz, which is undefined at zero, so the expander doesn't need to deal with that case. 2020-05-08 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/94956 * match.pd (FFS): Optimize __builtin_ffs* of non-zero argument into __builtin_ctz* + 1 if direct IFN_CTZ is supported. * gcc.target/i386/pr94956.c: New test.
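The relationship between the two builtins can be sketched as follows. The wrapper names are hypothetical; __builtin_ffs and __builtin_ctz are the GCC/Clang builtins, and the second wrapper is only valid for non-zero arguments.

```c
/* General form: defined for every input, returns 0 when x is 0. */
int ffs_generic(unsigned x)
{
    return __builtin_ffs((int)x);
}

/* Form usable when x is known to be non-zero: __builtin_ctz is undefined
   at 0, so no special case for zero is needed. */
int ffs_nonzero(unsigned x)
{
    return __builtin_ctz(x) + 1;
}
```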
2020-05-08  match.pd: Simplify unsigned A - B - 1 >= A to B >= A [PR94913]  Jakub Jelinek
Implemented thusly. The TYPE_OVERFLOW_WRAPS is there just because the pattern above it has it too, if you want, I can throw it away from both. 2020-05-08 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/94913 * match.pd (A - B + -1 >= A to B >= A): New simplification. (A - B > A to A < B): Don't test TYPE_OVERFLOW_WRAPS which is always true for TYPE_UNSIGNED integral types. * gcc.dg/tree-ssa/pr94913.c: New test.
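To see why the unsigned comparison collapses, consider this small sketch (function names are illustrative). When b >= a, the subtraction a - b - 1 wraps around to a value of at least a; when b < a, it stays in the range 0 to a - 1.

```c
/* Original form: relies on unsigned wraparound. */
int cmp_original(unsigned a, unsigned b)
{
    return a - b - 1 >= a;
}

/* Simplified form. */
int cmp_simplified(unsigned a, unsigned b)
{
    return b >= a;
}
```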
2020-05-06  match.pd: Optimize ~(~X +- Y) into (X -+ Y) [PR94921]  Jakub Jelinek
According to my verification proglet, this transformation for signed types with undefined overflow neither introduces nor removes any UB cases, so it should be valid even for signed integral types. Not using a for because of the :c on plus which can't be there on minus. 2020-05-06 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/94921 * match.pd (~(~X - Y) -> X + Y, ~(~X + Y) -> X - Y): New simplifications. * gcc.dg/tree-ssa/pr94921.c: New test.
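The identity follows from ~X == -X - 1 in two's complement. A short sketch, using unsigned operands so that wraparound is defined (the commit itself also covers signed types); the function names are hypothetical.

```c
/* ~(~x + y) == -(-x - 1 + y) - 1 == x - y */
unsigned not_of_not_plus(unsigned x, unsigned y)
{
    return ~(~x + y);
}

/* ~(~x - y) == -(-x - 1 - y) - 1 == x + y */
unsigned not_of_not_minus(unsigned x, unsigned y)
{
    return ~(~x - y);
}
```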
2020-05-05  match.pd: Canonicalize (x + (x << cst)) into (x * cst2) [PR94800]  Jakub Jelinek
The popcount* testcases show yet another creative way to write popcount, but rather than adjusting the popcount matcher to deal with it, I think we should just canonicalize X + (X << C) to X * (1 + (1 << C)) and (X << C1) + (X << C2) to X * ((1 << C1) + (1 << C2)), because for multiplication we already have simplification rules that can handle nested multiplication (X * CST1 * CST2), while for the shifts and adds we have nothing like that. And the user could have written the multiplication anyway, so if we don't emit the fastest or smallest code for the multiplication by constant, we should improve that. At least on the testcases, the emitted code seems reasonable according to cost, except that perhaps we could in some cases try to improve expansion of vector multiplication by uniform constant. 2020-05-05 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/94800 * match.pd (X + (X << C) to X * (1 + (1 << C)), (X << C1) + (X << C2) to X * ((1 << C1) + (1 << C2))): New canonicalizations. * gcc.dg/tree-ssa/pr94800.c: New test. * gcc.dg/tree-ssa/popcount5.c: New test. * gcc.dg/tree-ssa/popcount5l.c: New test. * gcc.dg/tree-ssa/popcount5ll.c: New test.
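For concreteness, here is a sketch of the two shapes with fixed shift counts (the constants 3, 2 and 4 and the function names are chosen only for illustration).

```c
/* x + (x << 3) is x * (1 + (1 << 3)), i.e. x * 9. */
unsigned shift_add(unsigned x)
{
    return x + (x << 3);
}

/* (x << 2) + (x << 4) is x * ((1 << 2) + (1 << 4)), i.e. x * 20. */
unsigned two_shifts(unsigned x)
{
    return (x << 2) + (x << 4);
}
```

Once both are expressed as multiplications by a constant, the existing rules for nested X * CST1 * CST2 can combine them further.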
2020-05-05  match.pd: Optimize (((type)A * B) >> prec) != 0 into __imag__ .MUL_OVERFLOW [PR94914]  Jakub Jelinek
On x86 (the only target with umulv4_optab) one can use mull; seto to check for overflow instead of performing wider multiplication and performing comparison on the high bits. 2020-05-05 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/94914 * match.pd ((((type)A * B) >> prec) != 0 to .MUL_OVERFLOW(A, B) != 0): New simplification. * gcc.target/i386/pr94914.c: New test.
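A sketch of the two ways to test a 32-bit unsigned multiplication for overflow. The function names are hypothetical; __builtin_mul_overflow is the user-visible GCC/Clang builtin, used here on the assumption that it corresponds to the internal .MUL_OVERFLOW function named in the commit.

```c
#include <stdint.h>

/* Widening form: multiply in 64 bits and inspect the high half. */
int overflow_wide(uint32_t a, uint32_t b)
{
    return (((uint64_t)a * b) >> 32) != 0;
}

/* Overflow-flag form: the builtin returns true when the product does not
   fit in the result type. */
int overflow_builtin(uint32_t a, uint32_t b)
{
    uint32_t result;
    return __builtin_mul_overflow(a, b, &result);
}
```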
2020-05-04  match.pd: Optimize (x < 0) != (y < 0) into (x ^ y) < 0 [PR94718]  Jakub Jelinek
The following patch (on top of the two other PR94718 patches) performs the actual optimization requested in the PR. 2020-05-04 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/94718 * match.pd ((X < 0) != (Y < 0) into (X ^ Y) < 0): New simplification. * gcc.dg/tree-ssa/pr94718-4.c: New test. * gcc.dg/tree-ssa/pr94718-5.c: New test.
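The optimization relies on the sign bit of x ^ y being set exactly when the sign bits of x and y differ. A small sketch with illustrative function names:

```c
/* Two comparisons and an inequality. */
int signs_differ_cmp(int x, int y)
{
    return (x < 0) != (y < 0);
}

/* One xor and one comparison. */
int signs_differ_xor(int x, int y)
{
    return (x ^ y) < 0;
}
```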
2020-05-04  match.pd: Decrease number of nop conversions around bitwise ops [PR94718]  Jakub Jelinek
On the following testcase, there are in *.optimized dump 14 nop conversions (from signed to unsigned and back), while this patch decreases that number to just 4; for bitwise ops it really doesn't matter if they are performed in signed or unsigned, so the patch (in GIMPLE only, there are some comments about it being undesirable during GENERIC earlier), if it sees both bitop operands nop converted from the same types performs the bitop in their non-converted type and converts the result (i.e. 2 conversions into 1), similarly, if a bitop has one operand nop converted from something, the other not and the result is converted back to the type of the nop converted operand before conversion, it is possible to replace those 2 conversions with just a single conversion of the other operand. 2020-05-04 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/94718 * match.pd (bitop (convert @0) (convert? @1)): For GIMPLE, if we can, replace two nop conversions on bit_{and,ior,xor} argument and result with just one conversion on the result or another argument. * gcc.dg/tree-ssa/pr94718-3.c: New test.
2020-05-04  match.pd: Move (X & C) eqne (Y & C) -> (X ^ Y) & C eqne 0 opt to match.pd [PR94718]  Jakub Jelinek
This patch moves this optimization from fold-const.c to match.pd, where it is much shorter to express and can optimize even code not seen together in a single expression in the source, as the first step towards fixing the PR. 2020-05-04 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/94718 * fold-const.c (fold_binary_loc): Move (X & C) eqne (Y & C) -> (X ^ Y) & C eqne 0 optimization to ... * match.pd ((X & C) op (Y & C) into (X ^ Y) & C op 0): ... here. * gcc.dg/tree-ssa/pr94718-1.c: New test. * gcc.dg/tree-ssa/pr94718-2.c: New test.
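The moved optimization can be pictured with a fixed mask (0xF0 is an arbitrary choice for this sketch, as are the function names): two values agree on the masked bits exactly when their xor has none of those bits set.

```c
/* Two masking operations and a comparison. */
int masked_equal_two_ands(unsigned x, unsigned y)
{
    return (x & 0xF0u) == (y & 0xF0u);
}

/* One xor, one masking operation and a comparison against zero. */
int masked_equal_one_and(unsigned x, unsigned y)
{
    return ((x ^ y) & 0xF0u) == 0;
}
```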
2020-03-11  fold undefined pointer offsetting  Richard Biener
This avoids breaking the old broken pointer offsetting via (T)(ptr - ((T)0)->x) which should have used offsetof. Breakage was exposed by the introduction of POINTER_DIFF_EXPR and making PTA not consider that as producing a pointer. The mitigation for simple cases is to canonicalize _2 = _1 - 8B; o_9 = (struct obj *) _2; to o_9 = &MEM[_1 + -8B]; eliding one statement and the offending pointer subtraction. 2020-03-11 Richard Biener <rguenther@suse.de> * match.pd ((T *)(ptr - ptr-cst) -> &MEM[ptr + -ptr-cst]): New pattern. * gcc.dg/torture/20200311-1.c: New testcase.
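For comparison, a sketch of the offsetof-based way to recover an enclosing object from a member pointer, which is what the message says the old idiom should have used. The struct layout and function name are hypothetical.

```c
#include <stddef.h>

/* Hypothetical object with a member that callers hold pointers to. */
struct obj {
    int id;
    double payload;
};

/* Step back from the member to the start of the enclosing object using
   offsetof instead of arithmetic on a null pointer. */
struct obj *obj_from_payload(double *payload)
{
    return (struct obj *)((char *)payload - offsetof(struct obj, payload));
}
```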
2020-02-15  match.pd: Disallow side-effects in GENERIC for non-COND_EXPR to COND_EXPR simplifications [PR93744]  Jakub Jelinek
As the following testcases show (the first one reported, last two found by code inspection), we need to disallow side-effects in simplifications that turn some unconditional expression into a conditional one. From my little understanding of genmatch.c, it is able to automatically disallow side effects if the same operand is used multiple times in the match pattern, maybe if it is used multiple times in the replacement pattern, and if it is used in conditional contexts in the match pattern; could it be taught to handle this case too? If yes, perhaps just the first hunk could be usable for 8/9 backports (+ the testcases). 2020-02-15 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/93744 * match.pd (((m1 >/</>=/<= m2) * d -> (m1 >/</>=/<= m2) ? d : 0, A - ((A - B) & -(C cmp D)) -> (C cmp D) ? B : A, A + ((B - A) & -(C cmp D)) -> (C cmp D) ? B : A): For GENERIC, make sure @2 in the first and @1 in the other patterns has no side-effects. * gcc.c-torture/execute/pr93744-1.c: New test. * gcc.c-torture/execute/pr93744-2.c: New test. * gcc.c-torture/execute/pr93744-3.c: New test.
2020-02-04  tree-optimization/93538 - add missing comparison folding case  Richard Biener
This adds back a folding that worked in GCC 4.5 times by amending the pattern that handles other cases of address vs. SSA name comparisons. 2020-02-04 Richard Biener <rguenther@suse.de> PR tree-optimization/93538 * match.pd (addr EQ/NE ptr): Amend to handle &ptr->x EQ/NE ptr. * gcc.dg/tree-ssa/forwprop-38.c: New testcase.
2020-01-10  PR90838: Support ctz idioms  Wilco Dijkstra
Support common idioms for count trailing zeroes using an array lookup. The canonical form is array[((x & -x) * C) >> SHIFT] where C is a magic constant which when multiplied by a power of 2 creates a unique value in the top 5 or 6 bits. This is then indexed into a table which maps it to the number of trailing zeroes. When the table is valid, we emit a sequence using the target defined value for ctz (0):

int ctz1 (unsigned x)
{
  static const char table[32] =
    {
      0, 1, 28, 2, 29, 14, 24, 3, 30, 22, 20, 15, 25, 17, 4, 8,
      31, 27, 13, 23, 21, 19, 16, 7, 26, 12, 18, 6, 11, 5, 10, 9
    };
  return table[((unsigned)((x & -x) * 0x077CB531U)) >> 27];
}

Is optimized to:

  rbit w0, w0
  clz w0, w0
  and w0, w0, 31
  ret

gcc/ PR tree-optimization/90838 * tree-ssa-forwprop.c (check_ctz_array): Add new function. (check_ctz_string): Likewise. (optimize_count_trailing_zeroes): Likewise. (simplify_count_trailing_zeroes): Likewise. (pass_forwprop::execute): Try ctz simplification. * match.pd: Add matching for ctz idioms. testsuite/ PR tree-optimization/90838 * testsuite/gcc.target/aarch64/pr90838.c: New test. From-SVN: r280132
2020-01-07  re PR tree-optimization/93118 (>>32<<32 is not always converted into &~0ffffffffull at the tree level)  Jakub Jelinek
PR tree-optimization/93118 * match.pd ((x >> c) << c -> x & (-1<<c)): Add nop_convert?. Add new simplifier with two intermediate conversions. * gcc.dg/tree-ssa/pr93118.c: New test. From-SVN: r279950
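The pattern named in this ChangeLog entry, (x >> c) << c -> x & (-1 << c), can be sketched for c == 32 on a 64-bit unsigned value. The function names are illustrative; ~0ULL is used in place of -1 to keep the shift well defined in C.

```c
/* Shift the low 32 bits out and back in, which clears them. */
unsigned long long clear_low_by_shifts(unsigned long long x)
{
    return (x >> 32) << 32;
}

/* The same effect with a single mask. */
unsigned long long clear_low_by_mask(unsigned long long x)
{
    return x & (~0ULL << 32);
}
```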
2020-01-01  Update copyright years.  Jakub Jelinek
From-SVN: r279813
2020-01-01  re PR tree-optimization/93098 (ICE with negative shifter)  Jakub Jelinek
PR tree-optimization/93098 * match.pd (popcount): For shift amounts, use integer_onep or wi::to_widest () == cst instead of tree_to_uhwi () == cst tests. Make sure that precision is power of two larger than or equal to 16. Ensure shift is never negative. Use HOST_WIDE_INT_UC macro instead of ULL suffixed constants. Formatting fixes. * gcc.c-torture/compile/pr93098.c: New test. From-SVN: r279809
2019-12-09  re PR tree-optimization/92834 (missed SLP vectorization in LightPixel)  Jakub Jelinek
PR tree-optimization/92834 * match.pd (A - ((A - B) & -(C cmp D)) -> (C cmp D) ? B : A, A + ((B - A) & -(C cmp D)) -> (C cmp D) ? B : A): New simplifications. * gcc.dg/tree-ssa/pr92834.c: New test. From-SVN: r279113
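The two patterns listed in this entry use subtraction and addition with a comparison-derived mask. A sketch with hypothetical function names, using unsigned operands so that a - b and b - a wrap safely:

```c
/* a - ((a - b) & mask): subtracts (a - b) when c < d, giving b; else a. */
unsigned select_minus(unsigned a, unsigned b, int c, int d)
{
    return a - ((a - b) & -(unsigned)(c < d));
}

/* a + ((b - a) & mask): adds (b - a) when c < d, giving b; else a. */
unsigned select_plus(unsigned a, unsigned b, int c, int d)
{
    return a + ((b - a) & -(unsigned)(c < d));
}
```

Both compute (c < d) ? b : a, which is the form the simplification produces.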
2019-12-06  match.pd (nop_convert): Remove empty match.  Richard Biener
2019-12-06 Richard Biener <rguenther@suse.de> * match.pd (nop_convert): Remove empty match. Use nop_convert? everywhere. From-SVN: r279040
2019-12-06  re PR tree-optimization/92819 (Worse code generated on avx2 due to simplify_vector_constructor)  Richard Biener
2019-12-06 Richard Biener <rguenther@suse.de> PR tree-optimization/92819 * match.pd (VEC_PERM_EXPR -> BIT_INSERT_EXPR): Handle inserts into the last lane. For two-element vectors try inserting into the last lane when inserting into the first fails. * gcc.target/i386/pr92819-1.c: New testcase. * gcc.target/i386/pr92803.c: Adjust. From-SVN: r279033
2019-12-05  re PR tree-optimization/92818 (Typo in vec_perm -> bit_insert pattern)  Richard Biener
2019-12-05 Richard Biener <rguenther@suse.de> PR middle-end/92818 * tree-ssa-forwprop.c (simplify_vector_constructor): Improve heuristics on what don't care element to choose. * match.pd (VEC_PERM_EXPR -> BIT_INSERT_EXPR): Fix typo. * gcc.target/i386/pr92818.c: New testcase. From-SVN: r278998
2019-12-04  re PR tree-optimization/92734 (Missing match.pd simplification done by fold_binary_loc on generic)  Jakub Jelinek
PR tree-optimization/92734 * match.pd ((A +- B) - A -> +- B, (A +- B) -+ B -> A, A - (A +- B) -> -+ B, A +- (B -+ A) -> +- B): Handle nop_convert. * gcc.dg/tree-ssa/pr92734-2.c: New test. From-SVN: r278958
2019-12-03  re PR tree-optimization/92734 (Missing match.pd simplification done by fold_binary_loc on generic)  Jakub Jelinek
PR tree-optimization/92734 * match.pd ((CST1 - A) +- CST2 -> CST3 - A, CST1 - (CST2 - A) -> CST3 + A): Handle nop casts around inner subtraction. * gcc.dg/tree-ssa/pr92734.c: New test. From-SVN: r278925
2019-12-02  re PR tree-optimization/92712 (Performance regression with assumed values)  Jakub Jelinek
PR tree-optimization/92712 * match.pd ((A * B) +- A -> (B +- 1) * A, A +- (A * B) -> (1 +- B) * A): Allow optimizing signed integers even when we don't know anything about range of A, but do know something about range of B and the simplification won't introduce new UB. * gcc.dg/tree-ssa/pr92712-1.c: New test. * gcc.dg/tree-ssa/pr92712-2.c: New test. * gcc.dg/tree-ssa/pr92712-3.c: New test. * gfortran.dg/loop_versioning_1.f90: Adjust expected number of likely to be innermost dimension messages. * gfortran.dg/loop_versioning_10.f90: Likewise. * gfortran.dg/loop_versioning_6.f90: Likewise. From-SVN: r278894
2019-11-05  re PR target/92280 (gcc.target/i386/pr83008.c FAILs)  Richard Biener
2019-11-05 Richard Biener <rguenther@suse.de> PR tree-optimization/92280 * match.pd (BIT_FIELD_REF of CTOR): Unless the original CTOR had a single use do not create a new CTOR. * tree-ssa-forwprop.c (simplify_bitfield_ref): Do not re-fold BIT_FIELD_REF of a CTOR via GENERIC. From-SVN: r277832
2019-10-08  re PR tree-optimization/90836 (Missing popcount pattern matching)  Dmitrij Pochepko
2019-10-08 Dmitrij Pochepko <dmitrij.pochepko@bell-sw.com> PR tree-optimization/90836 * gcc/match.pd (popcount): New pattern. From-SVN: r276721
2019-10-05  re PR tree-optimization/91734 (gcc skip an if statement with "-O1 -ffast-math")  Jakub Jelinek
PR tree-optimization/91734 * generic-match-head.c: Include fold-const-call.h. * match.pd (sqrt(x) cmp c): Check the boundary value and in case inexact computation of c*c affects comparison of the boundary, turn LT_EXPR into LE_EXPR, GE_EXPR into GT_EXPR, LE_EXPR into LT_EXPR or GT_EXPR into GE_EXPR. Punt for sqrt comparisons against NaN and for -frounding-math. For c2, try the next smaller or larger floating point constant depending on comparison code and if it has the same sqrt as c2, use it instead of c2. * gcc.dg/pr91734.c: New test. From-SVN: r276621
2019-10-04  match.pd (sinh (x) / cosh (x)): New simplification rule.  Rafael Tsuha
* match.pd (sinh (x) / cosh (x)): New simplification rule. * gcc.dg/sinhovercosh-1.c: New test. From-SVN: r276595
2019-09-24  re PR middle-end/91866 (Sign extend of an int is not recognized)  Jakub Jelinek
PR middle-end/91866 * match.pd (((T)(A)) + CST -> (T)(A + CST)): Formatting fix. (((T)(A + CST1)) + CST2 -> (T)(A) + (T)CST1 + CST2): New optimization. * gcc.dg/tree-ssa/pr91866.c: New test. From-SVN: r276096
2019-09-16  Rewrite second part of or_comparisons_1 into match.pd.  Martin Liska
2019-09-16 Martin Liska <mliska@suse.cz> * gimple-fold.c (or_comparisons_1): Remove rules moved to ... * match.pd: ... here. From-SVN: r275752
2019-09-16  Rewrite first part of or_comparisons_1 into match.pd.  Martin Liska
2019-09-16 Martin Liska <mliska@suse.cz> * gimple-fold.c (or_comparisons_1): Remove rules moved to ... * match.pd: ... here. From-SVN: r275751
2019-09-16  Rewrite part of and_comparisons_1 into match.pd.  Martin Liska
2019-09-16 Martin Liska <mliska@suse.cz> * genmatch.c (dt_node::append_simplify): Do not print warning when we have duplicate patterns belonging to a same simplify rule. * gimple-fold.c (and_comparisons_1): Remove matching moved to match.pd. (maybe_fold_comparisons_from_match_pd): Handle tcc_comparison as a results. * match.pd: Handle (X == CST1) && (X OP2 CST2) conditions. From-SVN: r275750
2019-09-16  Fix PR88784, middle end is missing some optimizations about unsigned  Li Jia He
2019-09-16 Li Jia He <helijia@linux.ibm.com> Qi Feng <ffengqi@linux.ibm.com> PR middle-end/88784 * match.pd (x > y && x != XXX_MIN): Optimize into 'x > y'. (x > y && x == XXX_MIN): Optimize into 'false'. (x <= y && x == XXX_MIN): Optimize into 'x == XXX_MIN'. (x < y && x != XXX_MAX): Optimize into 'x < y'. (x < y && x == XXX_MAX): Optimize into 'false'. (x >= y && x == XXX_MAX): Optimize into 'x == XXX_MAX'. (x > y || x != XXX_MIN): Optimize into 'x != XXX_MIN'. (x <= y || x != XXX_MIN): Optimize into 'true'. (x <= y || x == XXX_MIN): Optimize into 'x <= y'. (x < y || x != XXX_MAX): Optimize into 'x != XXX_MAX'. (x >= y || x != XXX_MAX): Optimize into 'true'. (x >= y || x == XXX_MAX): Optimize into 'x >= y'. 2019-09-16 Li Jia He <helijia@linux.ibm.com> Qi Feng <ffengqi@linux.ibm.com> PR middle-end/88784 * gcc.dg/pr88784-1.c: New testcase. * gcc.dg/pr88784-2.c: New testcase. * gcc.dg/pr88784-3.c: New testcase. * gcc.dg/pr88784-4.c: New testcase. * gcc.dg/pr88784-5.c: New testcase. * gcc.dg/pr88784-6.c: New testcase. * gcc.dg/pr88784-7.c: New testcase. * gcc.dg/pr88784-8.c: New testcase. * gcc.dg/pr88784-9.c: New testcase. * gcc.dg/pr88784-10.c: New testcase. * gcc.dg/pr88784-11.c: New testcase. * gcc.dg/pr88784-12.c: New testcase. Co-Authored-By: Qi Feng <ffengqi@linux.ibm.com> From-SVN: r275749
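Two of the listed rules, sketched for unsigned int where XXX_MIN is 0 and XXX_MAX is UINT_MAX (function names are illustrative). In the first, x > y already implies x != 0; in the second, x < y already implies x != UINT_MAX.

```c
#include <limits.h>

/* x > y && x != XXX_MIN reduces to x > y. */
int and_original(unsigned x, unsigned y)
{
    return x > y && x != 0;
}

int and_simplified(unsigned x, unsigned y)
{
    return x > y;
}

/* x < y || x != XXX_MAX reduces to x != XXX_MAX. */
int or_original(unsigned x, unsigned y)
{
    return x < y || x != UINT_MAX;
}

int or_simplified(unsigned x)
{
    return x != UINT_MAX;
}
```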
2019-09-11  re PR middle-end/91725 (ICE in get_nonzero_bits starting with r275587)  Jakub Jelinek
PR middle-end/91725 * match.pd ((A / (1 << B)) -> (A >> B)): Call tree_nonzero_bits instead of get_nonzero_bits, only call it for integral types. * gcc.c-torture/compile/pr91725.c: New test. From-SVN: r275633
2019-09-11  revert: match.pd: Add flag_unsafe_math_optimizations check before deciding on the widest type in...  Richard Biener
2019-09-11 Richard Biener <rguenther@suse.de> Revert 2019-09-09 Barnaby Wilks <barnaby.wilks@arm.com> * match.pd: Add flag_unsafe_math_optimizations check before deciding on the widest type in a binary math operation. * gcc.dg/fold-binary-math-casts.c: New test. From-SVN: r275632
2019-09-10  re PR middle-end/91680 (Integer promotion quirk prevents efficient power of 2 division)  Jakub Jelinek
PR middle-end/91680 * match.pd ((A / (1 << B)) -> (A >> B)): Allow widening cast from the shift type to type. * gcc.dg/tree-ssa/pr91680.c: New test. * g++.dg/torture/pr91680.C: New test. From-SVN: r275587
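A sketch of the widening case, under the assumption that the shift count stays in the range 0 to 31: the divisor 1u << b is computed as unsigned int and then widened to unsigned long long for the division. Function names are hypothetical.

```c
/* Division by a power of two whose value is computed in a narrower type. */
unsigned long long div_by_pow2(unsigned long long a, unsigned b)
{
    return a / (1u << b);   /* requires b < 32 */
}

/* The equivalent right shift. */
unsigned long long shift_right(unsigned long long a, unsigned b)
{
    return a >> b;
}
```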
2019-09-09  match.pd: Add flag_unsafe_math_optimizations check before deciding on the widest type in...  Barnaby Wilks
2019-09-09 Barnaby Wilks <barnaby.wilks@arm.com> * match.pd: Add flag_unsafe_math_optimizations check before deciding on the widest type in a binary math operation. * gcc.dg/fold-binary-math-casts.c: New test. From-SVN: r275518
2019-09-03  re PR tree-optimization/91504 (Inlining misses some logical operation folding)  Kamlesh Kumar
PR tree-optimization/91504 * match.pd: Add ((~a & b) ^a) --> (a | b). PR tree-optimization/91504 * gcc.dg/tree-ssa/pr91504.c: New test. From-SVN: r275354
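The identity can be checked bit by bit: where a has a 1, the left side is 0 ^ 1 = 1; where a has a 0, it is b ^ 0 = b. A small sketch with illustrative names:

```c
/* (~a & b) ^ a */
unsigned folded_input(unsigned a, unsigned b)
{
    return (~a & b) ^ a;
}

/* a | b */
unsigned folded_output(unsigned a, unsigned b)
{
    return a | b;
}
```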
2019-09-02  re PR go/91617 (Many go test case failures after r275026)  Jakub Jelinek
PR go/91617 * fold-const.c (range_check_type): For enumeral and boolean type, pass 1 to type_for_size langhook instead of TYPE_UNSIGNED (etype). Return unsigned_type_for result whenever etype isn't TYPE_UNSIGNED INTEGER_TYPE. (build_range_check): Don't call unsigned_type_for for pointer types. * match.pd (X / C1 op C2): Don't call unsigned_type_for on range_check_type result. From-SVN: r275299
2019-08-26  [PATCH 2/2] Add simplify rule for wrapped addition.  Robin Dapp
Add the transform (T)(A) + CST -> (T)(A + CST). This enables vrp to simplify sequences like _2 = a_7 - 1; _3 = (long unsigned int) _2; _5 = _3 + 1 that ivopts creates. -- gcc/ChangeLog: 2019-08-26 Robin Dapp <rdapp@linux.ibm.com> * match.pd: Add (T)(A) + CST -> (T)(A + CST). gcc/testsuite/ChangeLog: 2019-08-26 Robin Dapp <rdapp@linux.ibm.com> * gcc.dg/tree-ssa/copy-headers-5.c: Do not run vrp pass. * gcc.dg/tree-ssa/copy-headers-7.c: Do not run vrp pass. * gcc.dg/tree-ssa/loop-15.c: Remove XFAIL. * gcc.dg/tree-ssa/pr23744.c: Change search pattern. * gcc.dg/wrapped-binop-simplify.c: New test. From-SVN: r274925
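The sequence quoted above corresponds roughly to the following C, shown here as a sketch with hypothetical names. The rewrite to a plain widening is only valid when a - 1 cannot wrap, that is when a is known to be at least 1; on a target where unsigned long is wider than unsigned int, a == 0 would give a different result.

```c
/* _2 = a - 1; _3 = (long unsigned int) _2; _5 = _3 + 1 */
unsigned long widen_then_add(unsigned a)
{
    return (unsigned long)(a - 1) + 1;
}

/* What remains after moving the addition inside the conversion and
   folding (a - 1) + 1, assuming a >= 1. */
unsigned long widen_only(unsigned a)
{
    return (unsigned long)a;
}
```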
2019-08-15  Add support for conditional shifts  Richard Sandiford
This patch adds support for IFN_COND shifts left and shifts right. This is mostly mechanical, but since we try to handle conditional operations in the same way as unconditional operations in match.pd, we need to support IFN_COND shifts by scalars as well as vectors. E.g.: IFN_COND_SHL (cond, a, { 1, 1, ... }, fallback) and: IFN_COND_SHL (cond, a, 1, fallback) are the same operation, with: (for shiftrotate (lrotate rrotate lshift rshift) ... /* Prefer vector1 << scalar to vector1 << vector2 if vector2 is uniform. */ (for vec (VECTOR_CST CONSTRUCTOR) (simplify (shiftrotate @0 vec@1) (with { tree tem = uniform_vector_p (@1); } (if (tem) (shiftrotate @0 { tem; })))))) preferring the latter. The patch copes with this by extending create_convert_operand_from to handle scalar-to-vector conversions. 2019-08-15 Richard Sandiford <richard.sandiford@arm.com> Prathamesh Kulkarni <prathamesh.kulkarni@linaro.org> gcc/ * internal-fn.def (IFN_COND_SHL, IFN_COND_SHR): New internal functions. * internal-fn.c (FOR_EACH_CODE_MAPPING): Handle shifts. * match.pd (UNCOND_BINARY, COND_BINARY): Likewise. * optabs.def (cond_ashl_optab, cond_ashr_optab, cond_lshr_optab): New optabs. * optabs.h (create_convert_operand_from): Expand comment. * optabs.c (maybe_legitimize_operand): Allow implicit broadcasts when mapping scalar rtxes to vector operands. * config/aarch64/iterators.md (SVE_INT_BINARY): Add ashift, ashiftrt and lshiftrt. (sve_int_op, sve_int_op_rev, sve_pred_int_rhs2_operand): Handle them. * config/aarch64/aarch64-sve.md (*cond_<optab><mode>_2_const) (*cond_<optab><mode>_any_const): New patterns. gcc/testsuite/ * gcc.target/aarch64/sve/cond_shift_1.c: New test. * gcc.target/aarch64/sve/cond_shift_1_run.c: Likewise. * gcc.target/aarch64/sve/cond_shift_2.c: Likewise. * gcc.target/aarch64/sve/cond_shift_2_run.c: Likewise. * gcc.target/aarch64/sve/cond_shift_3.c: Likewise. * gcc.target/aarch64/sve/cond_shift_3_run.c: Likewise. * gcc.target/aarch64/sve/cond_shift_4.c: Likewise. 
* gcc.target/aarch64/sve/cond_shift_4_run.c: Likewise. * gcc.target/aarch64/sve/cond_shift_5.c: Likewise. * gcc.target/aarch64/sve/cond_shift_5_run.c: Likewise. * gcc.target/aarch64/sve/cond_shift_6.c: Likewise. * gcc.target/aarch64/sve/cond_shift_6_run.c: Likewise. * gcc.target/aarch64/sve/cond_shift_7.c: Likewise. * gcc.target/aarch64/sve/cond_shift_7_run.c: Likewise. * gcc.target/aarch64/sve/cond_shift_8.c: Likewise. * gcc.target/aarch64/sve/cond_shift_8_run.c: Likewise. * gcc.target/aarch64/sve/cond_shift_9.c: Likewise. * gcc.target/aarch64/sve/cond_shift_9_run.c: Likewise. Co-Authored-By: Prathamesh Kulkarni <prathamesh.kulkarni@linaro.org> From-SVN: r274505
2019-07-26  Add rules to strip away unneeded type casts in expressions  Tamar Christina
This patch moves part of the type conversion code from convert.c to match.pd because match.pd is able to apply these transformations in the presence of intermediate temporary variables. Concretely it makes both these cases behave the same float e = (float)a * (float)b; *c = (_Float16)e; and *c = (_Float16)((float)a * (float)b); gcc/ChangeLog: * convert.c (convert_to_real_1): Move part of conversion code... * match.pd: ...To here. gcc/testsuite/ChangeLog: * gcc.dg/type-convert-var.c: New test. From-SVN: r273826
2019-07-24  re PR middle-end/91166 ([SVE] Unfolded ZIPs of constants)  Prathamesh Kulkarni
2019-07-24 Prathamesh Kulkarni <prathamesh.kulkarni@linaro.org> PR middle-end/91166 * match.pd (vec_perm_expr(v, v, mask) -> v): New pattern. (define_predicates): Add entry for uniform_vector_p. (vec_same_elem_p): New match pattern. testsuite/ * gcc.target/aarch64/sve/pr91166.c: New test. From-SVN: r273758
2019-07-03  re PR tree-optimization/91069 (Miscompare of 453.povray since r272843)  Richard Biener
2019-07-03 Richard Biener <rguenther@suse.de> PR middle-end/91069 * match.pd (vec_perm -> bit_insert): Fix element read from first vector. * gcc.dg/pr91069.c: New testcase. From-SVN: r273007
2019-06-11  Allow conversions in X/[ex]4 < Y/[ex]4  Marc Glisse
2019-06-11 Marc Glisse <marc.glisse@inria.fr> gcc/ * match.pd (X/[ex]4<Y/[ex]4): Handle conversions. gcc/testsuite/ * gcc.dg/tree-ssa/cmpexactdiv-5.c: New file. From-SVN: r272158
2019-06-06  Simplify mult where both arguments are 0 or 1 (PR tree-optimization/87954).  Martin Liska
2019-06-06 Martin Liska <mliska@suse.cz> PR tree-optimization/87954 * match.pd: Simplify mult where both arguments are 0 or 1. 2019-06-06 Martin Liska <mliska@suse.cz> PR tree-optimization/87954 * gcc.dg/pr87954.c: New test. From-SVN: r271991
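A sketch of the situation this entry describes, on the assumption that the product of two values known to be 0 or 1 is rewritten as a bitwise AND (the entry itself does not spell out the replacement). The function names are hypothetical.

```c
/* Both operands of the multiplication are comparison results, so each is
   either 0 or 1. */
int both_nonzero_mul(int a, int b)
{
    return (a != 0) * (b != 0);
}

/* For 0/1 operands, multiplication and bitwise AND agree. */
int both_nonzero_and(int a, int b)
{
    return (a != 0) & (b != 0);
}
```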
2019-05-31  apply unary op to both sides of (vec_cond x cst1 cst2)  Marc Glisse
2019-05-31 Marc Glisse <marc.glisse@inria.fr> gcc/ * match.pd (~(vec?cst1:cst2)): New transformation. gcc/testsuite/ * g++.dg/tree-ssa/cprop-vcond.C: New file. From-SVN: r271817
2019-05-31  Simplify more EXACT_DIV_EXPR comparisons  Marc Glisse
2019-05-31 Marc Glisse <marc.glisse@inria.fr> gcc/ * match.pd (X/[ex]D<Y/[ex]D): Handle negative denominator. ((size_t)(A /[ex] B) CMP C): New transformation. gcc/testsuite/ * gcc.dg/tree-ssa/cmpexactdiv-3.c: New file. * gcc.dg/tree-ssa/cmpexactdiv-4.c: New file. * gcc.dg/Walloca-13.c: Xfail. From-SVN: r271816
2019-05-27  re PR tree-optimization/90610 (526.blender_r miscompared on znver1 with -Ofast -march=native since r271463)  Richard Biener
2019-05-27 Richard Biener <rguenther@suse.de> PR middle-end/90610 * match.pd (vec_perm): Avoid clobbering op0 when not generating a bit-insert. From-SVN: r271652
2019-05-21  re PR tree-optimization/90510 (Unnecessary permutation)  Richard Biener
2019-05-21 Richard Biener <rguenther@suse.de> PR middle-end/90510 * fold-const.c (fold_read_from_vector): New function. * fold-const.h (fold_read_from_vector): Declare. * match.pd (VEC_PERM_EXPR): Build BIT_INSERT_EXPRs for single-element insert permutations. Canonicalize selector further and fix issue with last commit. * gcc.target/i386/pr90510.c: New testcase. From-SVN: r271463