path: root/test/Transforms
Age  Commit message  Author
2017-12-22  [InlineCost] Find more free binary operations  Haicheng Wu
Currently, inline cost model considers a binary operator as free only if both its operands are constants. Some simple cases are missing such as a + 0, a - a, etc. This patch modifies visitBinaryOperator() to call SimplifyBinOp() without going through simplifyInstruction() to get rid of the constant restriction. Thus, visitAnd() and visitOr() are not needed. Differential Revision: https://reviews.llvm.org/D41494 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321366 91177308-0d34-0410-b5e6-96231b3b80d8
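The idea of treating a binary operator as free whenever it simplifies, not only when both operands are constants, can be sketched as follows. This is a minimal illustration, not LLVM's implementation: `Operand`, `Op`, and `isFreeBinOp` are hypothetical names, and only the a + 0, 0 + a, a - 0, and a - a cases are modelled.

```cpp
#include <cassert>

// Hypothetical stand-in for an IR operand: either a known constant or an
// opaque value identified by `id`. This is not LLVM's API.
struct Operand {
  bool isConstant;
  long value; // meaningful only when isConstant is true
  int id;     // identifies opaque values; meaningful only otherwise
};

enum class Op { Add, Sub };

// Returns true if the operation would fold away and therefore cost nothing
// after inlining.
bool isFreeBinOp(Op op, const Operand &lhs, const Operand &rhs) {
  // The old rule: both operands constant, so the result constant-folds.
  if (lhs.isConstant && rhs.isConstant)
    return true;
  // a + 0 and a - 0 simplify to a.
  if (rhs.isConstant && rhs.value == 0)
    return true;
  // 0 + a simplifies to a (0 - a does not; it is a negation).
  if (op == Op::Add && lhs.isConstant && lhs.value == 0)
    return true;
  // a - a simplifies to 0 even though neither operand is constant.
  if (op == Op::Sub && !lhs.isConstant && !rhs.isConstant && lhs.id == rhs.id)
    return true;
  return false;
}
```

With two distinct opaque values, `a + b` is still charged, while `a + 0` and `a - a` are not.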
2017-12-22  inline-fp.ll was moved in r321332; delete it properly.  Eli Friedman
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321333 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-22  [Inliner] Restrict soft-float inlining penalty.  Eli Friedman
The penalty is currently getting applied in a bunch of places where it doesn't make sense, like bitcasts (which are free) and calls (which were getting the call penalty applied twice). Instead, just apply the penalty to binary operators and floating-point casts. While I'm here, also fix getFPOpCost() to do the right thing in more cases, so we don't have to dig into function attributes. Differential Revision: https://reviews.llvm.org/D41522 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321332 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-20  [ICP] Expose unconditional call promotion interface  Matthew Simpson
This patch modifies the indirect call promotion utilities by exposing and using an unconditional call promotion interface. The unconditional promotion interface (i.e., call promotion without creating an if-then-else) can be used if it's known that an indirect call has only one possible callee. The existing conditional promotion interface uses this unconditional interface to promote an indirect call after it has been versioned and placed within the "then" block. A consequence of unconditional promotion is that the fix-up operations for phi nodes in the normal destination of invoke instructions are changed. This is necessary because the existing implementation assumed that an invoke had been versioned, creating a "merge" block where a return value bitcast could be placed. In the new implementation, the edge between a promoted invoke's parent block and its normal destination is split if needed to add a bitcast for the return value. If the invoke is also versioned, the phi node merging the return value of the promoted and original invoke instructions is placed in the "merge" block. Differential Revision: https://reviews.llvm.org/D40751 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321210 91177308-0d34-0410-b5e6-96231b3b80d8
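At the source level, the difference between conditional and unconditional promotion can be pictured roughly like this. It is only an analogy for the IR transformation: `knownTarget`, `otherTarget`, and the two wrapper functions are hypothetical names.

```cpp
#include <cassert>

using Callee = int (*)(int);

int knownTarget(int x) { return x + 1; } // hypothetical likely callee
int otherTarget(int x) { return x * 2; } // hypothetical alternative callee

// Conditional promotion: the call is versioned. The "then" path holds the
// promoted direct call, the fallback path keeps the original indirect call.
int callConditionallyPromoted(Callee fp, int x) {
  if (fp == &knownTarget)
    return knownTarget(x); // direct call, now visible to later passes
  return fp(x);            // original indirect call
}

// Unconditional promotion: valid only when knownTarget is known to be the
// sole possible callee, so no comparison or fallback is created.
int callUnconditionallyPromoted(int x) { return knownTarget(x); }
```

In this picture the conditional form is just the unconditional form applied inside the "then" branch, which matches how the commit message describes the existing interface reusing the new one.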
2017-12-20  [PGO] Function section hotness prefix should look at all blocks  Teresa Johnson
Summary: The function section prefix for PGO based layout (e.g. hot/unlikely) should look at the hotness of all blocks not just the entry BB. A function with a cold entry but a very hot loop should be placed in the hot section, for example, so that it is located close to other hot functions it may call. For SamplePGO it was already looking at the branch weights on calls, and I made that code conditional on whether this is SamplePGO since it was essentially a noop for instrumentation PGO anyway. Reviewers: davidxl Subscribers: eraman, llvm-commits Differential Revision: https://reviews.llvm.org/D41395 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321197 91177308-0d34-0410-b5e6-96231b3b80d8
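A minimal sketch of choosing the prefix from every block rather than from the entry block alone. The thresholds `kHotCount` and `kColdCount` and the function `sectionPrefix` are assumptions made for illustration; the real pass uses profile summary information rather than fixed constants.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical thresholds, chosen only for this example.
constexpr std::uint64_t kHotCount = 1000;
constexpr std::uint64_t kColdCount = 10;

// blockCounts[0] is the entry block's execution count.
std::string sectionPrefix(const std::vector<std::uint64_t> &blockCounts) {
  if (blockCounts.empty())
    return "";
  bool anyHot = false;
  bool allCold = true;
  for (std::uint64_t count : blockCounts) {
    if (count >= kHotCount)
      anyHot = true;
    if (count > kColdCount)
      allCold = false;
  }
  if (anyHot)
    return "hot"; // one hot block is enough, even with a cold entry
  if (allCold)
    return "unlikely";
  return "";
}
```

A function with a cold entry and a hot loop, such as counts `{5, 50000}`, lands in the hot section under this rule.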
2017-12-20  [InstCombine] Add debug location to new caller.  Florian Hahn
Reviewers: rnk, aprantl, majnemer Reviewed By: aprantl Differential Revision: https://reviews.llvm.org/D414 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321191 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-20  Revert r320548: [SLP] Vectorize jumbled memory loads  Mohammad Shahid
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321181 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-20  [LV] Remove unnecessary DoExtraAnalysis guard (silent bug)  Florian Hahn
canVectorize is only checking if the loop has a normalized pre-header if DoExtraAnalysis is true. This doesn't make sense to me because reporting analysis information shouldn't alter legality checks. This is probably the result of a last minute minor change before committing (?). Patch by Diego Caballero. Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D40973 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321172 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-20  [memcpyopt] Teach memcpyopt to optimize across basic blocks  Dan Gohman
This teaches memcpyopt to make a non-local memdep query when a local query indicates that the dependency is non-local. This notably allows it to eliminate many more llvm.memcpy calls in common Rust code, often by 20-30%. This is r319482 and r319483, along with fixes for PR35519: fix the optimization that merges stores into memsets to preserve cached memdep info, and fix memdep's non-local caching strategy to not assume that larger queries are always more conservative than smaller ones. Fixes PR28958 and PR35519. Differential Revision: https://reviews.llvm.org/D40802 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321138 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-19  [InlineCost] Skip volatile loads when looking for repeated loads  Haicheng Wu
This is a follow-up fix of r320814. A test case is also added. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321075 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-19  [JumpThreading] Restrict PRE across instructions that don't pass control to successors  Max Kazantsev
PRE in JumpThreading should not be able to hoist copies of non-speculable loads across instructions that don't always transfer execution to their successors; otherwise it may introduce an unsafe load that would not otherwise be executed. The same problem for GVN was fixed as rL316975. Differential Revision: https://reviews.llvm.org/D40347 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321063 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-18  [Analysis] Generate more precise TBAA tags when one access encloses the other  Ivan A. Kosarev
There are cases when two tags with different base types denote accesses to the same direct or indirect member of a structure type. Currently, merging of such tags results in a tag that represents an access to an object that has the type of that member. This patch changes this so that if one of the accesses encloses the other, then the generic tag is the one of the enclosed access. Differential Revision: https://reviews.llvm.org/D39557 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321019 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-18  [PGO] Fix handling of cold entry count for instrumented PGO  Teresa Johnson
Summary: In r277849, getEntryCount was changed to return None when the entry count was 0, specifically for SamplePGO where it means no samples were recorded. However, for instrumentation PGO a 0 entry count should be returned directly, since it does mean that the function was completely cold. Otherwise we end up treating these functions conservatively in isFunctionEntryCold() and isColdBB(). Instead, for SamplePGO use -1 when there are no samples, and change getEntryCount to return None when the value is -1. Reviewers: danielcdh, davidxl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D41307 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321018 91177308-0d34-0410-b5e6-96231b3b80d8
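The sentinel scheme described above can be sketched as follows. `kNoSamples`, `entryCountSketch`, and `isEntryColdSketch` are hypothetical names, and the coldness test is simplified to "exactly zero" rather than a real cold-count threshold.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical sentinel for "SamplePGO recorded no samples", mirroring the
// -1 value mentioned in the commit message.
constexpr std::int64_t kNoSamples = -1;

// A stored value of -1 means "unknown"; 0 is a real count and is returned
// directly, so a never-executed instrumented function is seen as cold.
std::optional<std::uint64_t> entryCountSketch(std::int64_t stored) {
  if (stored == kNoSamples)
    return std::nullopt;
  return static_cast<std::uint64_t>(stored);
}

// Simplified coldness check: unknown counts are treated conservatively
// (not cold); a known count of zero is cold.
bool isEntryColdSketch(std::int64_t stored) {
  std::optional<std::uint64_t> count = entryCountSketch(stored);
  return count.has_value() && *count == 0;
}
```

The point of the design is that "no information" and "measured zero" no longer share a representation.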
2017-12-18  [PGO] add MST min edge selection heuristic to ensure non-zero entry count  Xinliang David Li
Differential Revision: http://reviews.llvm.org/D41059 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320998 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-18  [TargetLibraryInfo] Discard library functions with incorrectly sized integers  Igor Laevsky
Differential Revision: https://reviews.llvm.org/D41184 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320964 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-18  [SROA] Disable non-whole-alloca splits by default  Hiroshi Inoue
This patch introduces a switch to control splitting of non-whole-alloca slices, off by default. The switch will be turned on by default again once the issue reported in PR35657 is fixed. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320958 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-18  [CGP] Fix the handling of select inst in complex addressing mode  Serguei Katkov
When we put the value in the select placeholder, we must pass it through the simplification tracker, because the value might already have been simplified and erased. This is a fix for PR35658. Reviewers: john.brawn, uabelho Reviewed By: john.brawn Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D41251 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320956 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-16  [InstCombine] Regenerate FMUL/FMA combine tests with update_test_checks.py  Simon Pilgrim
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320922 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-16  [InstCombine] canonicalize shifty abs(): ashr+add+xor --> cmp+neg+sel  Sanjay Patel
We want to do this for 2 reasons:
1. Value tracking does not recognize the ashr variant, so it would fail to match for cases like D39766.
2. DAGCombiner does better at producing optimal codegen when we have the cmp+sel pattern.

More detail about what happens in the backend:
1. DAGCombiner has a generic transform for all targets to convert the scalar cmp+sel variant of abs into the shift variant. That is the opposite of this IR canonicalization.
2. DAGCombiner has a generic transform for all targets to convert the vector cmp+sel variant of abs into either an ABS node or the shift variant. That is again the opposite of this IR canonicalization.
3. DAGCombiner has a generic transform for all targets to convert the exact shift variants produced by #1 or #2 into an ISD::ABS node. Note: It would be an efficiency improvement if we had #1 go directly to an ABS node when that's legal/custom.
4. The pattern matching above is incomplete, so it is possible to escape the intended/optimal codegen in a variety of ways.
   a. For #2, the vector path is missing the case for setlt with a '1' constant.
   b. For #3, we are missing a match for commuted versions of the shift variants.
5. Therefore, this IR canonicalization can only help get us to the optimal codegen. The version of cmp+sel produced by this patch will be recognized in the DAG and converted to an ABS node when possible or the shift sequence when not.
6. In the following examples with this patch applied, we may get conditional moves rather than the shift produced by the generic DAGCombiner transforms. The conditional move is created using a target-specific decision for any given target. Whether it is optimal or not for a particular subtarget may be up for debate.

define i32 @abs_shifty(i32 %x) {
  %signbit = ashr i32 %x, 31
  %add = add i32 %signbit, %x
  %abs = xor i32 %signbit, %add
  ret i32 %abs
}

define i32 @abs_cmpsubsel(i32 %x) {
  %cmp = icmp slt i32 %x, zeroinitializer
  %sub = sub i32 zeroinitializer, %x
  %abs = select i1 %cmp, i32 %sub, i32 %x
  ret i32 %abs
}

define <4 x i32> @abs_shifty_vec(<4 x i32> %x) {
  %signbit = ashr <4 x i32> %x, <i32 31, i32 31, i32 31, i32 31>
  %add = add <4 x i32> %signbit, %x
  %abs = xor <4 x i32> %signbit, %add
  ret <4 x i32> %abs
}

define <4 x i32> @abs_cmpsubsel_vec(<4 x i32> %x) {
  %cmp = icmp slt <4 x i32> %x, zeroinitializer
  %sub = sub <4 x i32> zeroinitializer, %x
  %abs = select <4 x i1> %cmp, <4 x i32> %sub, <4 x i32> %x
  ret <4 x i32> %abs
}

> $ ./opt -instcombine shiftyabs.ll -S | ./llc -o - -mtriple=x86_64 -mattr=avx
> abs_shifty:
>   movl %edi, %eax
>   negl %eax
>   cmovll %edi, %eax
>   retq
>
> abs_cmpsubsel:
>   movl %edi, %eax
>   negl %eax
>   cmovll %edi, %eax
>   retq
>
> abs_shifty_vec:
>   vpabsd %xmm0, %xmm0
>   retq
>
> abs_cmpsubsel_vec:
>   vpabsd %xmm0, %xmm0
>   retq
>
> $ ./opt -instcombine shiftyabs.ll -S | ./llc -o - -mtriple=aarch64
> abs_shifty:
>   cmp w0, #0 // =0
>   cneg w0, w0, mi
>   ret
>
> abs_cmpsubsel:
>   cmp w0, #0 // =0
>   cneg w0, w0, mi
>   ret
>
> abs_shifty_vec:
>   abs v0.4s, v0.4s
>   ret
>
> abs_cmpsubsel_vec:
>   abs v0.4s, v0.4s
>   ret
>
> $ ./opt -instcombine shiftyabs.ll -S | ./llc -o - -mtriple=powerpc64le
> abs_shifty:
>   srawi 4, 3, 31
>   add 3, 3, 4
>   xor 3, 3, 4
>   blr
>
> abs_cmpsubsel:
>   srawi 4, 3, 31
>   add 3, 3, 4
>   xor 3, 3, 4
>   blr
>
> abs_shifty_vec:
>   vspltisw 3, -16
>   vspltisw 4, 15
>   vsubuwm 3, 4, 3
>   vsraw 3, 2, 3
>   vadduwm 2, 2, 3
>   xxlxor 34, 34, 35
>   blr
>
> abs_cmpsubsel_vec:
>   vspltisw 3, -16
>   vspltisw 4, 15
>   vsubuwm 3, 4, 3
>   vsraw 3, 2, 3
>   vadduwm 2, 2, 3
>   xxlxor 34, 34, 35
>   blr
>
Differential Revision: https://reviews.llvm.org/D40984 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320921 91177308-0d34-0410-b5e6-96231b3b80d8
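That the two scalar forms compute the same value, including for the minimum signed integer where the result wraps, can be checked with a small C++ model. This is a sketch of the arithmetic only: `absShifty` and `absCmpSel` are hypothetical names, and the work is done on the unsigned bit pattern to avoid signed-overflow undefined behavior in C++.

```cpp
#include <cassert>
#include <cstdint>

// Shift form: ashr + add + xor, computed on the 32-bit pattern so that
// INT32_MIN wraps the way the IR add does.
std::uint32_t absShifty(std::int32_t x) {
  std::uint32_t ux = static_cast<std::uint32_t>(x);
  // Models `ashr i32 %x, 31`: all ones for negative x, zero otherwise.
  std::uint32_t signbit = x < 0 ? 0xFFFFFFFFu : 0u;
  return (ux + signbit) ^ signbit;
}

// Compare form: icmp slt + sub + select.
std::uint32_t absCmpSel(std::int32_t x) {
  std::uint32_t ux = static_cast<std::uint32_t>(x);
  return x < 0 ? 0u - ux : ux;
}
```

Both functions return the same bit pattern for every input, which is why the canonicalization is purely a choice of form.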
2017-12-16  Move Transforms/LoopVectorize/consecutive-ptr-cg-bug.ll into the X86 subdirectory  Hal Finkel
This test depends on X86's TTI; move into the X86 subdirectory. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320914 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-16  [LV] Extend InstWidening with CM_Widen_Reverse  Hal Finkel
Changes to the original scalar loop during LV code gen cause the return value of Legal->isConsecutivePtr() to be inconsistent with the return value during legal/cost phases (further analysis and information of the bug is in D39346). This patch is an alternative fix to PR34965 following the CM_Widen approach proposed by Ayal and Gil in D39346. It extends InstWidening enum with CM_Widen_Reverse to properly record the widening decision for consecutive reverse memory accesses and, consequently, get rid of the Legal->isConsecutivePtr() call in LV code gen. I think this is a simpler/cleaner solution to PR34965 than the one in D39346. Fixes PR34965. Patch by Diego Caballero, thanks! Differential Revision: https://reviews.llvm.org/D40742 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320913 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-16  [LTO] Make processing of combined module more consistent  Vitaly Buka
Summary: 1. Use stream 0 only for the combined module. Previously, if the combined module was not processed, ThinLTO used the stream for its own output. However, small changes in the input could trigger processing of the combined module and shuffle the outputs, making life harder for llvm::LTO users. 2. Always process the combined module and write its output to stream 0. Processing an empty combined module is cheap and allows llvm::LTO users to avoid implementing processing which is already done in llvm::LTO. Subscribers: mehdi_amini, inglorion, eraman, hiraditya Differential Revision: https://reviews.llvm.org/D41267 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320905 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-16  [SimplifyLibCalls] Inline calls to cabs when it's safe to do so  Hal Finkel
When unsafe algebra is allowed, calls to cabs(r) can be replaced by: sqrt(creal(r)*creal(r) + cimag(r)*cimag(r)). Patch by Paul Walker, thanks! Differential Revision: https://reviews.llvm.org/D40069 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320901 91177308-0d34-0410-b5e6-96231b3b80d8
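The replacement formula is easy to express directly. The sketch below uses a hypothetical helper `cabsInlined` and only illustrates the arithmetic; it is not the SimplifyLibCalls code.

```cpp
#include <cassert>
#include <cmath>
#include <complex>

// Inlined magnitude: sqrt(re*re + im*im). Unlike a careful cabs/hypot
// implementation, the intermediate squares can overflow or underflow for
// very large or very small components, which is why the rewrite is only
// done when unsafe floating-point math is allowed.
double cabsInlined(std::complex<double> r) {
  return std::sqrt(r.real() * r.real() + r.imag() * r.imag());
}
```

For ordinary magnitudes the result agrees with `std::abs` on the same complex value up to rounding; for example, the input 3 + 4i gives 5.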
2017-12-16  [ThinLTO] Enable importing of aliases as copy of aliasee  Teresa Johnson
Summary: This implements a missing feature to allow importing of aliases, which was previously disabled because alias cannot be available_externally. We instead import an alias as a copy of its aliasee. Some additional work was required in the IndexBitcodeWriter for the distributed build case, to ensure that the aliasee has a value id in the distributed index file (i.e. even when it is not being imported directly). This is a performance win in codes that have many aliases, e.g. C++ applications that have many constructor and destructor aliases. Reviewers: pcc Subscribers: mehdi_amini, inglorion, eraman, llvm-commits Differential Revision: https://reviews.llvm.org/D40747 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320895 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-15  Re-commit : [LICM] Allow sinking when foldable in loop  Jun Bum Lim
This recommits r320823, which was reverted due to a test failure in sink-foldable.ll and an unused variable. Added "REQUIRES: aarch64-registered-target" in the test and removed the unused variable. Original commit message: Continue trying to sink an instruction if its users in the loop are foldable. This will allow the instruction to be folded in the loop by decoupling it from the user outside of the loop. Reviewers: hfinkel, majnemer, davidxl, efriedma, danielcdh, bmakam, mcrosier Reviewed By: hfinkel Subscribers: javed.absar, bmakam, mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D37076 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320858 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-15  Revert "Re-commit : [LICM] Allow sinking when foldable in loop"  Jun Bum Lim
This reverts commit r320833. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320836 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-15  Re-commit : [LICM] Allow sinking when foldable in loop  Jun Bum Lim
This recommits r320823 after fixing a test failure. Original commit message: Continue trying to sink an instruction if its users in the loop are foldable. This will allow the instruction to be folded in the loop by decoupling it from the user outside of the loop. Reviewers: hfinkel, majnemer, davidxl, efriedma, danielcdh, bmakam, mcrosier Reviewed By: hfinkel Subscribers: javed.absar, bmakam, mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D37076 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320833 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-15  Revert "[LICM] Allow sinking when foldable in loop"  Jun Bum Lim
This reverts commit r320823. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320828 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-15  [LICM] Allow sinking when foldable in loop  Jun Bum Lim
Summary: Continue trying to sink an instruction if its users in the loop are foldable. This will allow the instruction to be folded in the loop by decoupling it from the user outside of the loop. Reviewers: hfinkel, majnemer, davidxl, efriedma, danielcdh, bmakam, mcrosier Reviewed By: hfinkel Subscribers: javed.absar, bmakam, mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D37076 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320823 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-15  [InlineCost] Find repeated loads in the callee  Haicheng Wu
SROA analysis of InlineCost can figure out that some stores can be removed after inlining, and then the repeated loads clobbered by these stores are also free. This patch finds these clobbered loads and adjusts the inline cost accordingly. Differential Revision: https://reviews.llvm.org/D33946 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320814 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-15  [PM] port Rewrite Statepoints For GC to the new pass manager.  Fedor Sergeev
Summary: The port is nearly straightforward. The only complication is related to the analyses handling, since one of the analyses used in this module pass is domtree, which is a function analysis. That requires asking for the results of each function and disallows a single interface for run-on-module pass action. Decided to copy-paste the main body of this pass. Most of its code is requesting analyses anyway, so not that much of a copy-paste. The rest of the code movement is to transform all the implementation helper functions like stripNonValidData into non-member statics. Extended all the related LLVM tests with new-pass-manager use. No failures. Reviewers: sanjoy, anna, reames Reviewed By: anna Subscribers: skatkov, llvm-commits Differential Revision: https://reviews.llvm.org/D41162 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320796 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-15  [SCEV] Fix the movement of insertion point in expander. PR35406.  Serguei Katkov
We cannot move the insertion point to the header if the SCEV contains div/rem operations, because they may then bypass the check for a zero denominator. Reviewers: sanjoy, mkazantsev, sebpop Reviewed By: sebpop Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D41229 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320789 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-14  [SimplifyCFG] don't sink common insts too soon (PR34603)  Sanjay Patel
This should solve: https://bugs.llvm.org/show_bug.cgi?id=34603 ...by preventing SimplifyCFG from altering redundant instructions before early-cse has a chance to run. It changes the default (canonical-forming) behavior of SimplifyCFG, so we're only doing the sinking transform later in the optimization pipeline. Differential Revision: https://reviews.llvm.org/D38566 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320749 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-14  [SLPVectorizer] Don't ignore scalar extraction instructions of aggregate value  Guozhi Wei
In SLPVectorizer, the vector build instructions (insertvalue for aggregate types) are passed to BoUpSLP.buildTree and treated as the UserIgnoreList, so the cost of these instructions is not counted later in cost estimation. For aggregate values, later uses are more likely to happen in scalar registers, either as individual scalars or as a whole for a function call or return value. Ignoring the scalar extraction instructions may cause overly aggressive vectorization of aggregate values and slow down performance. So for vectorization of aggregate values, the scalar extraction instructions must be included in cost estimation. Differential Revision: https://reviews.llvm.org/D41139 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320736 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-14  [ScalarEvolution] Fix base condition in isNormalAddRecPHI.  Bjorn Pettersson
Summary: The function is meant to recurse until it comes upon the phi it's looking for. However, with the current condition, it will recurse until it finds anything _but_ the phi. The function will even fail for simple cases like: %i = phi i32 [ %inc, %loop ], ... ... %inc = add i32 %i, 1 because the base condition will not happen when the phi is recursed to, and the recursion will end with a 'false' result since the previous instruction is a phi. Reviewers: sanjoy, atrick Reviewed By: sanjoy Subscribers: Ka-Ka, bjope, llvm-commits Committing on behalf of: Bevin Hansson (bevinh) Differential Revision: https://reviews.llvm.org/D40946 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320700 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-14  [InlineCost] Tracking Values through PHI Nodes  Haicheng Wu
This patch fixes this FIXME in visitPHI(): FIXME: We should potentially be tracking values through phi nodes, especially when they collapse to a single value due to deleted CFG edges during inlining. Differential Revision: https://reviews.llvm.org/D38594 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320699 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-14  Inserting several lit tests to reflect current behaviour  Omer Paparo Bivas
Change-Id: I1b8188dc3c6c7c0f455715364ece7d35ef485f2f git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320692 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-14  [PM][InstCombine] fixing omission of AliasAnalysis in new-pass-manager's version of InstCombine  Fedor Sergeev
Summary: Passing AliasAnalysis results instead of nullptr appears to work just fine. A couple of new-pass-manager tests were updated to align with the new order of analyses. Reviewers: chandlerc, spatel, craig.topper Reviewed By: chandlerc Subscribers: mehdi_amini, eraman, llvm-commits Differential Revision: https://reviews.llvm.org/D41203 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320687 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-14  [LV] Support efficient vectorization of an induction with redundant casts  Dorit Nuzman
D30041 extended SCEVPredicateRewriter to improve handling of Phi nodes whose update chain involves casts; PSCEV can now build an AddRecurrence for some forms of such phi nodes, under the proper runtime overflow test. This means that we can identify such phi nodes as an induction, and the loop-vectorizer can now vectorize such inductions, however inefficiently. The vectorizer doesn't know that it can ignore the casts, and so it vectorizes them. This patch records the casts in the InductionDescriptor, so that they could be marked to be ignored for cost calculation (we use VecValuesToIgnore for that) and ignored for vectorization/widening/scalarization (i.e. treated as TriviallyDead). In addition to marking all these casts to be ignored, we also need to make sure that each cast is mapped to the right vector value in the vector loop body (be it a widened, vectorized, or scalarized induction). So whenever an induction phi is mapped to a vector value (during vectorization/widening/ scalarization), we also map the respective cast instruction (if exists) to that vector value. (If the phi-update sequence of an induction involves more than one cast, then the above mapping to vector value is relevant only for the last cast of the sequence as we allow only the "last cast" to be used outside the induction update chain itself). This is the last step in addressing PR30654. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320672 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-13  [EarlyCSE] recognize swapped variants of abs/nabs as equivalent  Sanjay Patel
Extends https://reviews.llvm.org/rL320640 Differential Revision: https://reviews.llvm.org/D41136 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320653 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-13  [EarlyCSE] add tests for swapped abs/nabs; NFC  Sanjay Patel
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320647 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-13  Reverting [JumpThreading] Preservation of DT and LVI across the pass  Brian M. Rzycki
Stage 2 bootstrap failed: http://lab.llvm.org:8011/builders/clang-x86_64-linux-selfhost-modules-2/builds/14434 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320641 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-13  [EarlyCSE] recognize commuted and swapped variants of min/max as equivalent (PR35642)  Sanjay Patel
As shown in: https://bugs.llvm.org/show_bug.cgi?id=35642 ...we can have different forms of min/max, so we should recognize those here in EarlyCSE similar to how we already handle binops and compares that can commute. Differential Revision: https://reviews.llvm.org/D41136 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320640 91177308-0d34-0410-b5e6-96231b3b80d8
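As a rough source-level illustration of what "different forms of min" can look like, the three functions below all compute the signed minimum. The names are hypothetical, and the reading of "commuted" (compare operands exchanged with the predicate flipped) and "swapped" (predicate inverted with the select arms exchanged) is an assumption about the commit's terminology.

```cpp
#include <cassert>

// Canonical form: select (a < b), a, b.
int minCanonical(int a, int b) { return a < b ? a : b; }

// Commuted compare: operands exchanged and predicate flipped.
int minCommuted(int a, int b) { return b > a ? a : b; }

// Swapped select: predicate inverted and select arms exchanged.
int minSwapped(int a, int b) { return a >= b ? b : a; }
```

A CSE pass that hashes and compares these as one min operation can reuse the first result instead of recomputing the others.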
2017-12-13  [JumpThreading] Preservation of DT and LVI across the pass  Brian M. Rzycki
Summary: See D37528 for a previous (non-deferred) version of this patch and its description. Preserves dominance in a deferred manner using a new class DeferredDominance. This reduces the performance impact of updating the DominatorTree at every edge insertion and deletion. A user may call DDT->flush() within JumpThreading for an up-to-date DT. This patch currently has one flush() at the end of runImpl() to ensure DT is preserved across the pass. LVI is also preserved to help subsequent passes such as CorrelatedValuePropagation. LVI is simpler to maintain and is done immediately (not deferred). The code to perform the preservation was minimally altered and was simply marked as preserved for the PassManager to be informed. This extends the analysis available to JumpThreading for future enhancements. One example is loop boundary threading. Reviewers: dberlin, kuhar, sebpop Reviewed By: kuhar, sebpop Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D40146 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320612 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-13  [GVNHoist] Fix: PR35222 gvn-hoist incorrectly erases load  Aditya Kumar
w.r.t. the paper "A Practical Improvement to the Partial Redundancy Elimination in SSA Form" (https://sites.google.com/site/jongsoopark/home/ssapre.pdf) Proper dominance check was missing here, so having a loopinfo should not be required. Committing this diff as this fixes the bug, if there are further concerns, I'll be happy to work on them. Differential Revision: https://reviews.llvm.org/D39781 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320607 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-13  Reintroduce r320049, r320014 and r319894.  Igor Laevsky
OpenGL issues should be fixed by now. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320568 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-13  [SLP] Vectorize jumbled memory loads.  Mohammad Shahid
Summary: This patch tries to vectorize loads of consecutive memory accesses, accessed in non-consecutive or jumbled way. An earlier attempt was made with patch D26905 which was reverted back due to some basic issue with representing the 'use mask' of jumbled accesses. This patch fixes the mask representation by recording the 'use mask' in the usertree entry. Change-Id: I9fe7f5045f065d84c126fa307ef6ebe0787296df Reviewers: mkuper, loladiro, Ayal, zvi, danielcdh Reviewed By: Ayal Subscribers: mgrang, dcaballe, hans, mzolotukhin Differential Revision: https://reviews.llvm.org/D36130 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320548 91177308-0d34-0410-b5e6-96231b3b80d8
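The effect of vectorizing jumbled loads can be modelled as one wide load of the consecutive elements followed by a shuffle that applies the recorded mask. The sketch below is a scalar model with a hypothetical function `loadJumbled`; it does not reflect the SLP vectorizer's data structures.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// base points at four consecutive elements; mask[i] says which of them the
// i-th original scalar load read.
std::array<int, 4> loadJumbled(const int *base,
                               const std::array<std::size_t, 4> &mask) {
  std::array<int, 4> wide{};
  for (std::size_t i = 0; i < 4; ++i)
    wide[i] = base[i]; // stands in for a single vector load
  std::array<int, 4> out{};
  for (std::size_t i = 0; i < 4; ++i)
    out[i] = wide[mask[i]]; // stands in for a shuffle with the use mask
  return out;
}
```

Four scalar loads in the order a[1], a[3], a[0], a[2] thus become one load plus one shuffle with mask {1, 3, 0, 2}.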
2017-12-13  [CallSiteSplitting] Refactor creating callsites.  Florian Hahn
Summary: This change makes the call site creation more general if any of the arguments is predicated on a condition in the call site's predecessors. If we find a call site that can potentially be split, we collect the set of conditions for the call site's predecessors (currently only 2 predecessors are allowed). To do that, we traverse each predecessor's predecessors as long as it only has single predecessors and record the condition, if it is relevant to the call site. For each condition, we also check if the condition is taken or not. In case it is not taken, we record the inverse predicate. We use the recorded conditions to create the new call sites and split the basic block. This has 2 benefits: (1) it is slightly easier to see what is going on (IMO) and (2) we can easily extend it to handle more complex control flow. Reviewers: davidxl, junbuml Reviewed By: junbuml Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D40728 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320547 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-12  [EarlyCSE] add tests for commuted min/max; NFC  Sanjay Patel
See PR35642: https://bugs.llvm.org/show_bug.cgi?id=35642 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320530 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-12  [InstCombine] Fix PR35618: Instcombine hangs on single minmax load bitcast.  Alexey Bataev
Summary: If we have pattern `store (load(bitcast(select (cmp(V1, V2), &V1, &V2)))), bitcast)`, but the load is used in other instructions, it leads to looping in InstCombiner. Patch adds additional check that all users of the load instructions are stores and then replaces all uses of load instruction by the new one with new type. Reviewers: RKSimon, spatel, majnemer Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D41072 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320525 91177308-0d34-0410-b5e6-96231b3b80d8