ampere-computing/llvm.git - LLVM including Ampere Computing toolchain specific patches

Age	Commit message	Author
2018-04-11	Merging r326521:	Tom Stellard
	------------------------------------------------------------------------ r326521 \| indutny \| 2018-03-01 16:59:27 -0800 (Thu, 01 Mar 2018) \| 13 lines [ArgumentPromotion] don't break musttail invariant PR36543 Summary: Do not break musttail invariant by promoting arguments of musttail callee or caller. Reviewers: sanjoy, dberlin, hfinkel, george.burgess.iv, fhahn, rnk Reviewed By: rnk Subscribers: rnk, llvm-commits Differential Revision: https://reviews.llvm.org/D43926 ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_60@329858 91177308-0d34-0410-b5e6-96231b3b80d8
2018-04-11	Add missing test file from r329855	Tom Stellard
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_60@329857 91177308-0d34-0410-b5e6-96231b3b80d8
2018-04-11	Backport of rL326666 and rL326668 for PR36607 and PR36608.	Florian Hahn
	[CallSiteSplitting] properly split musttail calls. The original author was Fedor Indutny <fedor@indutny.com>. `musttail` calls can't be naively split. The split blocks must include not only the call instruction itself, but also (optional) `bitcast` and `return` instructions that follow it. Clone `bitcast` and `ret`, place them into the split blocks, and remove the tail block when done. Reviewers: junbuml, mcrosier, davidxl, davide, fhahn Reviewed By: fhahn Subscribers: JDevlieghere, llvm-commits Differential Revision: https://reviews.llvm.org/D43729 git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_60@329793 91177308-0d34-0410-b5e6-96231b3b80d8
2018-04-09	Merging r326376:	Tom Stellard
	------------------------------------------------------------------------ r326376 \| jdevlieghere \| 2018-02-28 14:28:44 -0800 (Wed, 28 Feb 2018) \| 12 lines [GlobalOpt] don't change CC of musttail calle(e\|r) When the function has musttail call - its cc is fixed to be equal to the cc of the musttail callee. In such case (and in the case of the musttail callee), GlobalOpt should not change the cc to fastcc as it will break the invariant. This fixes PR36546 Patch by: Fedor Indutny (indutny) Differential revision: https://reviews.llvm.org/D43859 ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_60@329634 91177308-0d34-0410-b5e6-96231b3b80d8
2018-02-22	Merging r325687:	Hans Wennborg
	------------------------------------------------------------------------ r325687 \| sbaranga \| 2018-02-21 16:20:32 +0100 (Wed, 21 Feb 2018) \| 8 lines [SCEV] Temporarily disable loop versioning for the purpose of turning SCEVUnknowns of PHIs into AddRecExprs. This feature is now hidden behind the -scev-version-unknown flag. Fixes PR36032 and PR35432. ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_60@325773 91177308-0d34-0410-b5e6-96231b3b80d8
2018-02-19	Merging r324195:	Hans Wennborg
	------------------------------------------------------------------------ r324195 \| mcrosier \| 2018-02-04 16:42:24 +0100 (Sun, 04 Feb 2018) \| 12 lines [LV] Use Demanded Bits and ValueTracking for reduction type-shrinking The type-shrinking logic in reduction detection, although narrow in scope, is also rather ad-hoc, which has led to bugs (e.g., PR35734). This patch modifies the approach to rely on the demanded bits and value tracking analyses, if available. We currently perform type-shrinking separately for reductions and other instructions in the loop. Long-term, we should probably think about computing minimal bit widths in a more complete way for the loops we want to vectorize. PR35734 Differential Revision: https://reviews.llvm.org/D42309 ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_60@325508 91177308-0d34-0410-b5e6-96231b3b80d8
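The type-shrinking idea in r324195 can be illustrated with a plain C++ analogue. This is a minimal sketch, not the vectorizer's code: the function names are hypothetical, and it only shows why a reduction whose result is consumed in 8 bits may be computed in 8 bits when only the low bits are demanded.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Wide reduction: accumulates in 32 bits, but only the low 8 bits are used.
std::uint8_t sum_wide(const std::uint8_t* data, std::size_t n) {
    std::uint32_t acc = 0;
    for (std::size_t i = 0; i < n; ++i) {
        acc += data[i];
    }
    return static_cast<std::uint8_t>(acc & 0xFFu);
}

// Shrunk reduction: accumulates directly in 8 bits (wraps modulo 256).
std::uint8_t sum_narrow(const std::uint8_t* data, std::size_t n) {
    std::uint8_t acc = 0;
    for (std::size_t i = 0; i < n; ++i) {
        acc = static_cast<std::uint8_t>(acc + data[i]);
    }
    return acc;
}

// Both forms agree because, for addition, high bits never influence low bits.
void check_reduction_shrinking() {
    const std::uint8_t data[] = {200, 100, 50, 255};  // true sum is 605
    assert(sum_wide(data, 4) == 93);                  // 605 mod 256
    assert(sum_narrow(data, 4) == 93);
}
```

The equivalence holds for addition; operations such as right shifts or division do let high bits reach the low bits, which is presumably why an analysis is needed rather than a fixed rule.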
2018-02-19	Merging r324916:	Hans Wennborg
	------------------------------------------------------------------------ r324916 \| junbuml \| 2018-02-12 18:56:55 +0100 (Mon, 12 Feb 2018) \| 7 lines [LICM] update BlockColors after splitting predecessors Update BlockColors after splitting predecessors. Do not allow splitting EHPad for sinking when the BlockColors is not empty, so we can simply assign predecessor's color to the new block. Fixes PR36184 ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_60@325507 91177308-0d34-0410-b5e6-96231b3b80d8
2018-02-19	Merging r325148:	Hans Wennborg
	------------------------------------------------------------------------ r325148 \| ctopper \| 2018-02-14 19:08:33 +0100 (Wed, 14 Feb 2018) \| 7 lines [InstCombine] Don't fold select(C, Z, binop(select(C, X, Y), W)) -> select(C, Z, binop(Y, W)) if the binop is rem or div. The select may have been preventing a division by zero or INT_MIN/-1 so removing it might not be safe. Fixes PR36362. Differential Revision: https://reviews.llvm.org/D43276 ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_60@325501 91177308-0d34-0410-b5e6-96231b3b80d8
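Why the fold in r325148 is unsafe for division can be sketched in ordinary C++. This is a simplified model under stated assumptions: the select feeds the divisor, `eager_sdiv` is a hypothetical helper that reports where a real signed division would be undefined, and the division is evaluated unconditionally, as it would be in IR.

```cpp
#include <cassert>
#include <climits>

struct DivResult {
    bool trapped;  // true where a real signed division would be undefined
    int value;
};

// Models an unconditionally executed signed division.
DivResult eager_sdiv(int w, int d) {
    if (d == 0 || (w == INT_MIN && d == -1)) return {true, 0};
    return {false, w / d};
}

// Original shape: the inner select keeps the divisor non-zero.
DivResult original_form(int w, int y, int z) {
    const bool c = (y == 0);
    const int divisor = c ? 1 : y;               // inner select(C, X, Y)
    const DivResult q = eager_sdiv(w, divisor);  // binop runs unconditionally
    if (q.trapped) return q;
    return {false, c ? z : q.value};             // outer select(C, Z, binop)
}

// Folded shape: the inner select has been replaced by Y.
DivResult folded_form(int w, int y, int z) {
    const bool c = (y == 0);
    const DivResult q = eager_sdiv(w, y);        // divides by zero when C holds
    if (q.trapped) return q;
    return {false, c ? z : q.value};
}

void check_select_div_fold() {
    assert(!original_form(10, 0, 7).trapped);
    assert(original_form(10, 0, 7).value == 7);
    assert(folded_form(10, 0, 7).trapped);       // the fold introduced a trap
    assert(folded_form(10, 2, 7).value == 5);    // both agree when C is false
}
```

For binops that cannot trap, such as add or xor, the unused result is harmless, so the fold remains valid there.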
2018-02-02	Merging r323908:	Hans Wennborg
	------------------------------------------------------------------------ r323908 \| mareko \| 2018-01-31 21:18:04 +0100 (Wed, 31 Jan 2018) \| 7 lines AMDGPU: Add intrinsics llvm.amdgcn.cvt.{pknorm.i16, pknorm.u16, pk.i16, pk.u16} Reviewers: arsenm, nhaehnle Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye Differential Revision: https://reviews.llvm.org/D41663 ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_60@324103 91177308-0d34-0410-b5e6-96231b3b80d8
2018-02-02	Merging r323907 and r323913:	Hans Wennborg
	------------------------------------------------------------------------ r323907 \| mareko \| 2018-01-31 21:17:52 +0100 (Wed, 31 Jan 2018) \| 11 lines [SeparateConstOffsetFromGEP] Preserve metadata when splitting GEPs Summary: !amdgpu.uniform needs to be preserved for AMDGPU, otherwise bad things happen. Reviewers: arsenm, nhaehnle, jingyue, broune, majnemer, bjarke.roune, dblaikie Subscribers: wdng, tpr, llvm-commits Differential Revision: https://reviews.llvm.org/D42744 ------------------------------------------------------------------------ ------------------------------------------------------------------------ r323913 \| mareko \| 2018-01-31 21:49:19 +0100 (Wed, 31 Jan 2018) \| 1 line [SeparateConstOffsetFromGEP] Fix up addrspace in the AMDGPU test ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_60@324088 91177308-0d34-0410-b5e6-96231b3b80d8
2018-02-02	Merging r323759:	Hans Wennborg
	------------------------------------------------------------------------ r323759 \| spatel \| 2018-01-30 14:53:59 +0100 (Tue, 30 Jan 2018) \| 10 lines [DSE] make sure memory is not modified before partial store merging (PR36129) We missed a critical check in D30703. We must make sure that no intermediate store is sitting between the stores that we want to merge. This should fix: https://bugs.llvm.org/show_bug.cgi?id=36129 Differential Revision: https://reviews.llvm.org/D42663 ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_60@324086 91177308-0d34-0410-b5e6-96231b3b80d8
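The missing check described in r323759 can be shown with a toy memory model. This is a sketch under assumptions, not the DSE implementation: memory is a 4-byte array written in little-endian order, and the helper names are made up for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using Memory = std::array<std::uint8_t, 4>;

// Writes a 32-bit value byte by byte, least significant byte first.
void store32(Memory& mem, std::uint32_t value) {
    for (std::size_t i = 0; i < 4; ++i) {
        mem[i] = static_cast<std::uint8_t>((value >> (8 * i)) & 0xFFu);
    }
}

void store8(Memory& mem, std::size_t offset, std::uint8_t value) {
    mem[offset] = value;
}

// Program order as written: the last narrow store wins for byte 1.
Memory original_stores() {
    Memory mem{};
    store32(mem, 0x00000000u);  // wide store
    store8(mem, 1, 0xAA);       // intermediate store to the same byte
    store8(mem, 1, 0xBB);       // later narrow store (merge candidate)
    return mem;
}

// Merging the later store into the wide one without checking for the
// intermediate store lets the intermediate store run last.
Memory merged_without_check() {
    Memory mem{};
    store32(mem, 0x0000BB00u);  // narrow store folded into the wide constant
    store8(mem, 1, 0xAA);       // intermediate store now overwrites byte 1
    return mem;
}

void check_partial_store_merging() {
    assert(original_stores()[1] == 0xBB);
    assert(merged_without_check()[1] == 0xAA);  // observable difference
}
```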
2018-02-02	Merging r323155:	Hans Wennborg
	------------------------------------------------------------------------ r323155 \| chandlerc \| 2018-01-22 23:05:25 +0100 (Mon, 22 Jan 2018) \| 133 lines Introduce the "retpoline" x86 mitigation technique for variant #2 of the speculative execution vulnerabilities disclosed today, specifically identified by CVE-2017-5715, "Branch Target Injection", and is one of the two halves to Spectre. Summary: First, we need to explain the core of the vulnerability. Note that this is a very incomplete description, please see the Project Zero blog post for details: https://googleprojectzero.blogspot.com/2018/01/reading-privileged-memory-with-side.html The basis for branch target injection is to direct speculative execution of the processor to some "gadget" of executable code by poisoning the prediction of indirect branches with the address of that gadget. The gadget in turn contains an operation that provides a side channel for reading data. Most commonly, this will look like a load of secret data followed by a branch on the loaded value and then a load of some predictable cache line. The attacker then uses timing of the processor's cache to determine which direction the branch took in the speculative execution, and in turn what one bit of the loaded value was. Due to the nature of these timing side channels and the branch predictor on Intel processors, this allows an attacker to leak data only accessible to a privileged domain (like the kernel) back into an unprivileged domain. The goal is simple: avoid generating code which contains an indirect branch that could have its prediction poisoned by an attacker. In many cases, the compiler can simply use directed conditional branches and a small search tree. LLVM already has support for lowering switches in this way and the first step of this patch is to disable jump-table lowering of switches and introduce a pass to rewrite explicit indirectbr sequences into a switch over integers. 
However, there is no fully general alternative to indirect calls. We introduce a new construct we call a "retpoline" to implement indirect calls in a non-speculatable way. It can be thought of loosely as a trampoline for indirect calls which uses the RET instruction on x86. Further, we arrange for a specific call->ret sequence which ensures the processor predicts the return to go to a controlled, known location. The retpoline then "smashes" the return address pushed onto the stack by the call with the desired target of the original indirect call. The result is a predicted return to the next instruction after a call (which can be used to trap speculative execution within an infinite loop) and an actual indirect branch to an arbitrary address. On 64-bit x86 ABIs, this is especially easily done in the compiler by using a guaranteed scratch register to pass the target into this device. For 32-bit ABIs there isn't a guaranteed scratch register and so several different retpoline variants are introduced to use a scratch register if one is available in the calling convention and to otherwise use direct stack push/pop sequences to pass the target address. This "retpoline" mitigation is fully described in the following blog post: https://support.google.com/faqs/answer/7625886 We also support a target feature that disables emission of the retpoline thunk by the compiler to allow for custom thunks if users want them. These are particularly useful in environments like kernels that routinely do hot-patching on boot and want to hot-patch their thunk to different code sequences. They can write this custom thunk and use `-mretpoline-external-thunk` in addition to `-mretpoline`. 
In this case, on x86-64 the thunk names must be: ``` __llvm_external_retpoline_r11 ``` or on 32-bit: ``` __llvm_external_retpoline_eax __llvm_external_retpoline_ecx __llvm_external_retpoline_edx __llvm_external_retpoline_push ``` And the target of the retpoline is passed in the named register, or in the case of the `push` suffix on the top of the stack via a `pushl` instruction. There is one other important source of indirect branches in x86 ELF binaries: the PLT. These patches also include support for LLD to generate PLT entries that perform a retpoline-style indirection. The only other indirect branches remaining that we are aware of are from precompiled runtimes (such as crt0.o and similar). The ones we have found are not really attackable, and so we have not focused on them here, but eventually these runtimes should also be replicated for retpoline-ed configurations for completeness. For kernels or other freestanding or fully static executables, the compiler switch `-mretpoline` is sufficient to fully mitigate this particular attack. For dynamic executables, you must compile all libraries with `-mretpoline` and additionally link the dynamic executable and all shared libraries with LLD and pass `-z retpolineplt` (or use similar functionality from some other linker). We strongly recommend also using `-z now` as non-lazy binding allows the retpoline-mitigated PLT to be substantially smaller. When manually applying similar transformations to `-mretpoline` to the Linux kernel we observed very small performance hits to applications running typical workloads, and relatively minor hits (approximately 2%) even for extremely syscall-heavy applications. This is largely due to the small number of indirect branches that occur in performance sensitive paths of the kernel. When using these patches on statically linked applications, especially C++ applications, you should expect to see a much more dramatic performance hit. 
For microbenchmarks that are switch, indirect-, or virtual-call heavy we have seen overheads ranging from 10% to 50%. However, real-world workloads exhibit substantially lower performance impact. Notably, techniques such as PGO and ThinLTO dramatically reduce the impact of hot indirect calls (by speculatively promoting them to direct calls) and allow optimized search trees to be used to lower switches. If you need to deploy these techniques in C++ applications, we strongly recommend that you ensure all hot call targets are statically linked (avoiding PLT indirection) and use both PGO and ThinLTO. Well tuned servers using all of these techniques saw 5% - 10% overhead from the use of retpoline. We will add detailed documentation covering these components in subsequent patches, but wanted to make the core functionality available as soon as possible. Happy for more code review, but we'd really like to get these patches landed and backported ASAP for obvious reasons. We're planning to backport this to both 6.0 and 5.0 release streams and get a 5.0 release with just this cherry picked ASAP for distros and vendors. This patch is the work of a number of people over the past month: Eric, Reid, Rui, and myself. I'm mailing it out as a single commit due to the time sensitive nature of landing this and the need to backport it. Huge thanks to everyone who helped out here, and everyone at Intel who helped out in discussions about how to craft this. Also, credit goes to Paul Turner (at Google, but not an LLVM contributor) for much of the underlying retpoline design. Reviewers: echristo, rnk, ruiu, craig.topper, DavidKreitzer Subscribers: sanjoy, emaste, mcrosier, mgorny, mehdi_amini, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D41723 ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_60@324067 91177308-0d34-0410-b5e6-96231b3b80d8
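The remark in r323155 about PGO "speculatively promoting" hot indirect calls to direct calls can be pictured with a hand-written analogue. This is only a sketch of the shape of the transformation: the handler names are hypothetical, and a real compiler would pick the hot target from profile data rather than have it hard-coded.

```cpp
#include <cassert>

using Handler = int (*)(int);

int hot_handler(int x) { return x + 1; }   // assumed to dominate the profile
int cold_handler(int x) { return x * 2; }

// Baseline: every call goes through an indirect branch, which a retpoline
// build would route through the thunk.
int call_indirect(Handler h, int x) {
    return h(x);
}

// Promoted form: compare against the hot target and call it directly;
// only the uncommon case still pays for the indirect call.
int call_promoted(Handler h, int x) {
    if (h == &hot_handler) {
        return hot_handler(x);  // direct call, no indirect branch needed
    }
    return h(x);                // fallback indirect call
}

void check_call_promotion() {
    assert(call_promoted(&hot_handler, 1) == call_indirect(&hot_handler, 1));
    assert(call_promoted(&cold_handler, 3) == call_indirect(&cold_handler, 3));
}
```

The promoted form trades one well-predicted conditional branch for the retpoline sequence on the hot path, which is consistent with the lower overheads the message reports for builds using PGO and ThinLTO.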
2018-01-30	Merging r323515:	Hans Wennborg
	------------------------------------------------------------------------ r323515 \| fhahn \| 2018-01-26 11:36:50 +0100 (Fri, 26 Jan 2018) \| 7 lines [CallSiteSplitting] Fix infinite loop when recording conditions. Fix infinite loop when recording conditions by correctly marking basic blocks as visited. Fixes https://bugs.llvm.org/show_bug.cgi?id=36105 ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_60@323771 91177308-0d34-0410-b5e6-96231b3b80d8
2018-01-30	Merging r323355:	Hans Wennborg
	------------------------------------------------------------------------ r323355 \| nha \| 2018-01-24 19:02:05 +0100 (Wed, 24 Jan 2018) \| 9 lines Revert r321751, "StructurizeCFG: Fix broken backedge detection" It causes regressions in various OpenGL test suites. Keep the test cases introduced by r321751 as XFAIL, and add a test case for the regression. Change-Id: I90b4cc354f68cebe5fcef1f2422dc8fe1c6d3514 Bugzilla: https://bugs.llvm.org/show_bug.cgi?id=36015 ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_60@323749 91177308-0d34-0410-b5e6-96231b3b80d8
2018-01-30	Merging r322016:	Hans Wennborg
	------------------------------------------------------------------------ r322016 \| spatel \| 2018-01-08 19:31:13 +0100 (Mon, 08 Jan 2018) \| 8 lines [ValueTracking] remove overzealous assert The test is derived from a failing fuzz test: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=5008 Credit to @rksimon for pointing out the problem. ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_60@323740 91177308-0d34-0410-b5e6-96231b3b80d8
2018-01-30	Revert r323738; that was not the one I wanted to merge	Hans Wennborg
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_60@323739 91177308-0d34-0410-b5e6-96231b3b80d8
2018-01-30	Merging r322006:	Hans Wennborg
	------------------------------------------------------------------------ r322006 \| davide \| 2018-01-08 17:34:06 +0100 (Mon, 08 Jan 2018) \| 19 lines [CVP] Replace incoming values from unreachable blocks with undef. This is an attempt at fixing PR35807. Due to the non-standard definition of dominance in LLVM, where uses in unreachable blocks are dominated by anything, you can have, in an unreachable block: %patatino = OP1 %patatino, CONSTANT When `SimplifyInstruction` receives a PHI where an incoming value is of the aforementioned form, it loops indefinitely in some cases. What I propose here instead is keeping track of the incoming values from unreachable blocks, and replacing them with undef. It fixes this case, and it seems to be good regardless (even if we can't prove that the value is constant, as it's coming from an unreachable block, we can ignore it). Differential Revision: https://reviews.llvm.org/D41812 ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_60@323738 91177308-0d34-0410-b5e6-96231b3b80d8
2018-01-22	Merging r322993:	Hans Wennborg
	------------------------------------------------------------------------ r322993 \| kuhar \| 2018-01-19 22:27:24 +0100 (Fri, 19 Jan 2018) \| 16 lines [Dominators] Visit affected node candidates found at different root levels Summary: This patch attempts to fix the DomTree incremental insertion bug found here [[ https://bugs.llvm.org/show_bug.cgi?id=35969 \| PR35969 ]] . When performing an insertion into a piece of unreachable CFG, we may find the same node at different levels. When this happens, the node can turn out to be affected when we find it starting from a node with a lower level in the tree. The level at which we start visitation affects if we consider a node affected or not. This patch tracks the lowest level at which each node was visited during insertion and allows it to be visited multiple times, if it can cause it to be considered affected. Reviewers: brzycki, davide, dberlin, grosser Reviewed By: brzycki Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D42231 ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_60@323110 91177308-0d34-0410-b5e6-96231b3b80d8
2018-01-17	Merging r321751, r321806, and r321878:	Hans Wennborg
	------------------------------------------------------------------------ r321751 \| arsenm \| 2018-01-03 10:45:37 -0800 (Wed, 03 Jan 2018) \| 25 lines StructurizeCFG: Fix broken backedge detection The work order was changed in r228186 from SCC order to RPO with an arbitrary sorting function. The sorting function attempted to move inner loop nodes earlier. This was apparently relying on an assumption that every block in a given loop / the same loop depth would be seen before visiting another loop. In the broken testcase, a block outside of the loop was encountered before moving onto another block in the same loop. The testcase would then structurize such that one block's unconditional successor could never be reached. Revert to plain RPO for the analysis phase. This fixes detecting edges as backedges that aren't really. The processing phase does use another visited set, and I'm unclear on whether the order there is as important. An arbitrary order doesn't work, and triggers some infinite loops. The reversed RPO list seems to work and is closer to the order that was used before, minus the arbitrary custom sorting. A few of the changed tests now produce smaller code, and a few are slightly worse looking. ------------------------------------------------------------------------ ------------------------------------------------------------------------ r321806 \| arsenm \| 2018-01-04 09:23:24 -0800 (Thu, 04 Jan 2018) \| 4 lines StructurizeCFG: xfail one of the testcases from r321751 It fails with -verify-region-info. This seems to be an issue with RegionInfo itself which existed before. ------------------------------------------------------------------------ ------------------------------------------------------------------------ r321878 \| arsenm \| 2018-01-05 09:51:36 -0800 (Fri, 05 Jan 2018) \| 4 lines RegionInfo: Use report_fatal_error instead of llvm_unreachable Otherwise when using -verify-region-info in a release build the error won't be emitted. 
------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_60@322686 91177308-0d34-0410-b5e6-96231b3b80d8
2018-01-17	Merging r322106:	Hans Wennborg
	------------------------------------------------------------------------ r322106 \| abataev \| 2018-01-09 11:08:22 -0800 (Tue, 09 Jan 2018) \| 11 lines [COST]Fix PR35865: Fix cost model evaluation for shuffle on X86. Summary: If the vector type is transformed to a non-vector single type, the compiler may crash trying to get vector information about a non-vector type. Reviewers: RKSimon, spatel, mkuper, hfinkel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D41862 ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_60@322680 91177308-0d34-0410-b5e6-96231b3b80d8
2018-01-17	Merging r321870, r321872, and r321994:	Hans Wennborg
	------------------------------------------------------------------------ r321870 \| abataev \| 2018-01-05 07:20:40 -0800 (Fri, 05 Jan 2018) \| 1 line [SLP] Update test checks, NFC. ------------------------------------------------------------------------ ------------------------------------------------------------------------ r321872 \| abataev \| 2018-01-05 08:15:17 -0800 (Fri, 05 Jan 2018) \| 1 line [SLP] Update more test checks, NFC. ------------------------------------------------------------------------ ------------------------------------------------------------------------ r321994 \| abataev \| 2018-01-08 06:43:06 -0800 (Mon, 08 Jan 2018) \| 13 lines [SLP] Fix PR35777: Incorrect handling of aggregate values. Summary: Fixes the bug with incorrect handling of InsertValue\|InsertElement instructions in SLP vectorizer. Currently, we may use incorrect ExtractElement instructions as the operands of the original InsertValue\|InsertElement instructions. Reviewers: mkuper, hfinkel, RKSimon, spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D41767 ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_60@322675 91177308-0d34-0410-b5e6-96231b3b80d8
2018-01-17	Merging r322473:	Hans Wennborg
	------------------------------------------------------------------------ r322473 \| a.elovikov \| 2018-01-15 02:56:07 -0800 (Mon, 15 Jan 2018) \| 23 lines [LV] Don't call recordVectorLoopValueForInductionCast for newly-created IV from a trunc. Summary: This method is supposed to be called for IVs that have casts in their use-def chains that are completely ignored after vectorization under PSE. However, for truncates of such IVs the same InductionDescriptor is used during creation/widening of both original IV based on PHINode and new IV based on TruncInst. This leads to unintended second call to recordVectorLoopValueForInductionCast with a VectorLoopVal set to the newly created IV for a trunc and causes an assert due to attempt to store new information for already existing entry in the map. This is wrong and should not be done. Fixes PR35773. Reviewers: dorit, Ayal, mssimpso Reviewed By: dorit Subscribers: RKSimon, dim, dcaballe, hsaito, llvm-commits, hiraditya Differential Revision: https://reviews.llvm.org/D41913 ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_60@322673 91177308-0d34-0410-b5e6-96231b3b80d8
2018-01-17	Merging r321993:	Hans Wennborg
	------------------------------------------------------------------------ r321993 \| abataev \| 2018-01-08 06:33:11 -0800 (Mon, 08 Jan 2018) \| 11 lines [SLP] Fix PR35628: Count external uses on extra reduction arguments. Summary: If the vectorized value is marked as extra reduction argument, its users are not considered as external users. Patch fixes this. Reviewers: mkuper, hfinkel, RKSimon, spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D41786 ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_60@322669 91177308-0d34-0410-b5e6-96231b3b80d8
2018-01-17	Merging r322056:	Hans Wennborg
	------------------------------------------------------------------------ r322056 \| skatkov \| 2018-01-08 20:37:06 -0800 (Mon, 08 Jan 2018) \| 13 lines [CGP] Fix Complex addressing mode for offset If the offset differs between two addressing modes, we can continue only if ScaleReg is not set, because we will use it to merge the different offsets. It should fix PR35799 and PR35805. Reviewers: john.brawn, reames Reviewed By: reames Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D41227 ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_60@322645 91177308-0d34-0410-b5e6-96231b3b80d8
2018-01-16	Merging r321789:	Hans Wennborg
	------------------------------------------------------------------------ r321789 \| hiraditya \| 2018-01-03 23:47:24 -0800 (Wed, 03 Jan 2018) \| 8 lines [GVNHoist] Fix: PR35222 gvn-hoist incorrectly erases load in case of a loop Reviewers: dberlin sebpop eli.friedman Differential Revision: https://reviews.llvm.org/D41453 ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_60@322558 91177308-0d34-0410-b5e6-96231b3b80d8
2018-01-03	[InstSimplify] Missed optimization in math expression: squashing exp(log), log(exp)	Dmitry Venikov
	Summary: This patch enables folding following expressions under -ffast-math flag: exp(log(x)) -> x, exp2(log2(x)) -> x, log(exp(x)) -> x, log2(exp2(x)) -> x Reviewers: spatel, hfinkel, davide Reviewed By: spatel, hfinkel, davide Subscribers: scanon, llvm-commits Differential Revision: https://reviews.llvm.org/D41381 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321710 91177308-0d34-0410-b5e6-96231b3b80d8
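A short C++ check suggests why the exp(log(x)) fold is restricted to -ffast-math. This is a sketch with hypothetical function names, assuming IEEE 754 doubles and a build without fast-math: for negative inputs the unfolded expression yields NaN, so replacing it with x changes the result.

```cpp
#include <cassert>
#include <cmath>

// The expression as written in source.
double unfolded_exp_log(double x) { return std::exp(std::log(x)); }

// What the fold would produce.
double folded_exp_log(double x) { return x; }

void check_exp_log_fold() {
    // For positive inputs the two agree up to rounding error.
    assert(std::fabs(unfolded_exp_log(2.0) - 2.0) < 1e-12);
    // log(-1) is NaN, and exp(NaN) stays NaN ...
    assert(std::isnan(unfolded_exp_log(-1.0)));
    // ... whereas the folded form simply returns the input.
    assert(folded_exp_log(-1.0) == -1.0);
}
```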
2018-01-03	[InstCombine] Add test to remove VarArg casts (NFC)	Florian Hahn
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321706 91177308-0d34-0410-b5e6-96231b3b80d8
2018-01-02	[BasicBlockUtils] Check for unreachable preds before updating LI in UpdateAnalysisInformation	Anna Thomas
	Summary: We are incorrectly updating the LI when loop-simplify generates dedicated exit blocks for a loop. The issue is that there's an implicit assumption that the Preds passed into UpdateAnalysisInformation are reachable. However, this is not true and breaks LI by incorrectly updating the header of a loop. One such case is when we generate dedicated exits when the exit block is a landing pad (through SplitLandingPadPredecessors). There may be other cases as well, since we do not guarantee that Preds passed in are reachable basic blocks. The added test case shows how loop-simplify breaks LI for the outer loop (and DT in turn) after we try to generate the LoopSimplifyForm. Reviewers: davide, chandlerc, sanjoy Reviewed By: davide Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D41519 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321653 91177308-0d34-0410-b5e6-96231b3b80d8
2018-01-02	[InstCombine] Missed optimization in math expression: squashing sqrt functions	Dmitry Venikov
	Summary: This patch enables folding under -ffast-math flag sqrt(a) * sqrt(b) -> sqrt(a*b) Reviewers: hfinkel, spatel, davide Reviewed By: spatel, davide Subscribers: davide, llvm-commits Differential Revision: https://reviews.llvm.org/D41322 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321637 91177308-0d34-0410-b5e6-96231b3b80d8
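The sqrt fold is likewise only valid under relaxed floating-point rules. The following sketch (hypothetical names, assuming IEEE 754 double arithmetic on a target without extended-precision intermediates, and a build without fast-math) shows two inputs where sqrt(a) * sqrt(b) and sqrt(a*b) differ.

```cpp
#include <cassert>
#include <cmath>

// The expression as written in source.
double unfolded_sqrt_mul(double a, double b) {
    return std::sqrt(a) * std::sqrt(b);
}

// What the fold would produce.
double folded_sqrt_mul(double a, double b) {
    return std::sqrt(a * b);
}

void check_sqrt_fold() {
    // a*b overflows to infinity even though each square root is finite.
    assert(std::isfinite(unfolded_sqrt_mul(1e200, 1e200)));
    assert(std::isinf(folded_sqrt_mul(1e200, 1e200)));
    // Two negative inputs: NaN before the fold, 1.0 after it.
    assert(std::isnan(unfolded_sqrt_mul(-1.0, -1.0)));
    assert(folded_sqrt_mul(-1.0, -1.0) == 1.0);
}
```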
2018-01-01	[ValueTracking] Don't assume shift values are in range	Simon Pilgrim
	Reduced (as best I could...) from oss-fuzz #4857 test case git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321634 91177308-0d34-0410-b5e6-96231b3b80d8
2018-01-01	[InstCombine] Regenerate udiv tests.	Simon Pilgrim
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321633 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-31	[SimplifyCFG] Stop hoisting musttail calls incorrectly.	Davide Italiano
	PR35774. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321603 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-30	[instsimplify] consistently handle undef and out of bound indices for insertelement and extractelement	Philip Reames
	In one case, we were handling out of bounds, but not undef indices. In the other, we were handling undef (with the comment making the analogy to out of bounds), but not out of bounds. Be consistent and treat both undef and constant out of bounds indices as producing undefined results. As a side effect, this also protects instcombine from having to handle large constant indices as we always simplify first. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321575 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-30	Add another test case for r321489	Philip Reames
	Went to reduce another fuzzer failure to find it's already been fixed, but the test case is slightly different so it's worth adding anyways. Reduced from oss-fuzz #4768 test case git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321573 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-30	Move tests associated with transforms moved in r321467	Philip Reames
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321572 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-29	[PM] pass -debug-pass-manager flag into FunctionToLoopPassAdaptor's canonicalization PM	Fedor Sergeev
	Summary: New pass manager driver passes DebugPM (-debug-pass-manager) flag into individual PassManager constructors in order to enable debug logging. FunctionToLoopPassAdaptor has its own internal LoopCanonicalizationPM which never gets its debug logging enabled and that means canonicalization passes like LoopSimplify are never present in -debug-pass-manager output. Extending FunctionToLoopPassAdaptor's constructor and createFunctionToLoopPassAdaptor wrapper with an optional boolean DebugLogging argument. Passing debug-logging flags there as appropriate. Reviewers: chandlerc, davide Reviewed By: davide Subscribers: mehdi_amini, eraman, llvm-commits, JDevlieghere Differential Revision: https://reviews.llvm.org/D41586 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321548 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-28	Revert r321377, it causes regression to https://reviews.llvm.org/P8055.	Guozhi Wei
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321528 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-28	[RewriteStatepoints] Fix incorrect assertion	Max Kazantsev
	`RewriteStatepointsForGC` iterates over function blocks and their predecessors in order of declaration. One of the outcomes of this is that call sites are placed in an arbitrary order which has nothing to do with traversal order. On the other hand, function `recomputeLiveInValues` asserts that bases are added to `Info.PointerToBase` before their derived pointers are updated. But if call sites are processed in an order different from RPOT, this is not necessarily true. We cannot guarantee that the base was placed there before every pointer derived from it. All we can guarantee is that this base was marked as a known base by this point. This patch replaces the assertion that the base was added to the map with an assertion that the base was marked as a known base. Differential Revision: https://reviews.llvm.org/D41593 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321517 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-28	[InstCombine] Check for isa<Instruction> before using cast<>	Simon Pilgrim
	Protects against casts from constexpr etc. Reduced from oss-fuzz #4788 test case git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321515 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-28	Revert "[memcpyopt] Teach memcpyopt to optimize across basic blocks"	Reid Kleckner
	This reverts r321138. It seems there are still underlying issues with memdep. PR35519 seems to still be present if debug info is enabled. We end up losing a memcpy. Somehow during store to memset merging, we insert the memset after the memcpy or fail to update the memdep analysis to account for the newly inserted memset of a pair.
	Reduced test case:
	#include <assert.h>
	#include <stdio.h>
	#include <string>
	#include <utility>
	#include <vector>
	void do_push_back(
	    std::vector<std::pair<std::string, std::vector<std::string>>>* crls) {
	  crls->push_back(std::make_pair(std::string(), std::vector<std::string>()));
	}
	int __attribute__((optnone)) main() {
	  // Put some data in the vector and then remove it so we take the push_back
	  // fast path.
	  std::vector<std::pair<std::string, std::vector<std::string>>> crl_set;
	  crl_set.push_back({"asdf", {}});
	  crl_set.pop_back();
	  printf("first word in vector storage: %p\n", (void*)crl_set.data());
	  // Do the push_back which may fail to initialize the data.
	  do_push_back(&crl_set);
	  auto first = &crl_set.back().first;
	  printf("first word in vector storage (should be zero): %p\n",
	         (void*)crl_set.data());
	  assert(first->empty());
	  puts("ok");
	}
	Compile with libc++, enable optimizations, and enable debug info:
	$ clang++ -stdlib=libc++ -g -O2 t.cpp -o t.exe -Wl,-rpath=llvm/build/lib
	This program will assert with this change.
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321510 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-27	[InstCombine] add tests for min/max folds (PR35717); NFC	Sanjay Patel
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321500 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-27	[InstCombine] Gracefully handle out of range extractelement indices	Simon Pilgrim
	InstSimplify is responsible for handling these, but we shouldn't just assert here. Reduced from oss-fuzz #4808 test case git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321489 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-27	[instcombine] add powi(x, 2) -> x * x	Philip Reames
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321468 91177308-0d34-0410-b5e6-96231b3b80d8
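	The powi(x, 2) -> x * x fold can be illustrated outside of LLVM. The following is a minimal C++ sketch, not the InstCombine code: `powi_ref` is a hypothetical reference for an integer power computed by repeated squaring, and `powi_two_folded` is a hypothetical name for the folded form. For an exponent of 2 the reference reduces to the same single multiplication.

```cpp
#include <cassert>

// Hypothetical reference: x raised to an integer power by repeated squaring.
// This only models the idea of an integer-power operation; it is not LLVM's
// implementation of the powi intrinsic.
double powi_ref(double x, int n) {
    const bool negative = n < 0;
    // Work on the unsigned magnitude so that the most negative int is safe.
    unsigned int e = negative ? 0u - static_cast<unsigned int>(n)
                              : static_cast<unsigned int>(n);
    double result = 1.0;
    double base = x;
    while (e != 0) {
        if (e & 1u) result *= base;  // multiply in the current power of x
        base *= base;                // square for the next exponent bit
        e >>= 1;
    }
    return negative ? 1.0 / result : result;
}

// Hypothetical name for the folded form: a single multiplication.
double powi_two_folded(double x) { return x * x; }
```

	With an exponent of 2, the loop in `powi_ref` multiplies `1.0` by `x * x`, so both functions return the same value for the same input.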
2017-12-26	[Unroll][DebugInfo] Propagate loop body's debug location to epilog preheader	Zhaoshi Zheng
	NewExit and epilog PreHeader should have the same debug loc as the original loop body, instead of that of the original loop exit. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321465 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-26	[InstCombine] fix miscompile of frem with 0.0 operand (PR34870)	Sanjay Patel
	We might want to select NAN here or do this transform with fast-math, but this should at least fix the miscompile. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321461 91177308-0d34-0410-b5e6-96231b3b80d8
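	The commit message does not spell out the transform that was wrong. As background only, and assuming the problem involved a 0.0 divisor being treated like an integer division by zero, the C++ sketch below shows that a floating-point remainder by zero is well defined on IEEE 754 implementations and yields NaN. `frem_select_divisor` is a hypothetical model of a divisor that is chosen by a select; it is not code from the patch.

```cpp
#include <cassert>
#include <cmath>

// frem corresponds to C's fmod. A zero divisor is not undefined behavior for
// floating point; on IEEE 754 implementations the result is NaN.
bool remainder_by_zero_is_nan(double x) {
    volatile double zero = 0.0;  // volatile discourages compile-time folding
    return std::isnan(std::fmod(x, zero));
}

// Hypothetical model of a divisor of the form "select cond, 0.0, y".
// The 0.0 arm cannot be dropped as unreachable, because it yields NaN
// rather than undefined behavior.
double frem_select_divisor(double x, bool cond, double y) {
    const double divisor = cond ? 0.0 : y;
    return std::fmod(x, divisor);
}
```

	Under that assumption, replacing the selected divisor with `y` alone would change a NaN result into an ordinary number, which is a miscompile rather than an optimization.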
2017-12-26	[InstCombine] add test for frem with 0.0 (PR34870); NFC	Sanjay Patel
	git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321460 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-26	[ValueTracking] ignore FP signed-zero when detecting a casted-to-integer ↵	Sanjay Patel
	fmin/fmax pattern This is a preliminary step for the patch discussed in D41136 (and denoted here with the FIXME comment). When we match an FP min/max that is cast to integer, any intermediate difference between +0.0 and -0.0 should be muted in the result by the conversion (either fptosi or fptoui) of the result. Thus, we can enable 'nsz' for the purpose of matching fmin/fmax. Note that there's probably room to generalize this more, possibly by changing the current calls to the weak version of isKnownNonZero() in matchSelectPattern() to the more powerful recursive version. Differential Revision: https://reviews.llvm.org/D41333 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321456 91177308-0d34-0410-b5e6-96231b3b80d8
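	The reasoning about signed zeros can be checked with a small C++ sketch. It is an illustration only, not the ValueTracking code, and the helper names (`to_int`, `zeros_differ_in_sign`, `min_then_convert`) are hypothetical. It shows that +0.0 and -0.0 are distinguishable as floating-point values but convert to the same integer, so a select-based min gives the same integer result whichever zero it picks.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Models a signed floating-point to integer conversion.
std::int32_t to_int(double v) { return static_cast<std::int32_t>(v); }

// +0.0 and -0.0 compare equal but carry different sign bits.
bool zeros_differ_in_sign() {
    return std::signbit(-0.0) != std::signbit(0.0);
}

// A select-based min followed by the conversion. Which zero the select
// returns depends on operand order, but the converted result does not.
std::int32_t min_then_convert(double a, double b) {
    return to_int(a < b ? a : b);
}
```

	Calling `min_then_convert(0.0, -0.0)` selects -0.0 and `min_then_convert(-0.0, 0.0)` selects +0.0, yet both return the integer 0.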
2017-12-26	[InstSimplify] Check for in range extraction index before calling ↵	Simon Pilgrim
	APInt::getZExtValue() Reduced from oss-fuzz #4768 test case git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321454 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-23	[CallSiteSplitting] Remove isOrHeader restriction.	Florian Hahn
	By following the single predecessors of the predecessors of the call site, we do not need to restrict the control flow. Reviewed By: junbuml, davide Differential Revision: https://reviews.llvm.org/D40729 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321413 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-22	[SimplifyCFG] Don't do if-conversion if there is a long dependence chain	Guozhi Wei
	If, after if-conversion, most of the instructions in the new BB form a long and slow dependence chain, the result may be slower than cmp/branch, even if the branch has a high miss rate. If-conversion transforms the control dependence into a data dependence; a control dependence can be speculated, so the second part can execute in parallel with the first part on a modern out-of-order processor. This patch checks for a long dependence chain and gives up on if-conversion if it finds one. Differential Revision: https://reviews.llvm.org/D39352 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321377 91177308-0d34-0410-b5e6-96231b3b80d8