Age | Commit message (Collapse) | Author |
|
|
|
------------------------------------------------------------------------
r339260 | syzaara | 2018-08-08 08:20:43 -0700 (Wed, 08 Aug 2018) | 13 lines
[PowerPC] Improve codegen for vector loads using scalar_to_vector
This patch aims to improve the codegen for vector loads involving the
scalar_to_vector (load X) sequence. Initially, ld->mv instructions were used
for scalar_to_vector (load X), so this patch allows scalar_to_vector (load X)
to utilize:
LXSD and LXSDX for i64 and f64
LXSIWAX for i32 (sign extension to i64)
LXSIWZX for i32 and f64
Committing on behalf of Amy Kwan.
Differential Revision: https://reviews.llvm.org/D48950
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_70@347957 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
------------------------------------------------------------------------
r344591 | abeserminji | 2018-10-16 01:27:28 -0700 (Tue, 16 Oct 2018) | 11 lines
[mips][micromips] Fix how values in .gcc_except_table are calculated
When a landing pad is calculated in a program that is compiled
for micromips, it will point to an even address. Such an error will
cause a segmentation fault, as the instructions in micromips are
aligned on odd addresses. This patch sets the last bit of the offset
where a landing pad is, to 1, which will effectively be
an odd address and point to the instruction exactly.
Differential Revision: https://reviews.llvm.org/D52985
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_70@347028 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
------------------------------------------------------------------------
r344516 | abeserminji | 2018-10-15 07:39:12 -0700 (Mon, 15 Oct 2018) | 12 lines
[mips][micromips] Fix overlaping FDEs error
When compiling static executable for micromips, CFI symbols
are incorrectly labeled as MICROMIPS, which cause
".eh_frame_hdr refers to overlapping FDEs." error.
This patch does not label CFI symbols as MICROMIPS, and FDEs do not
overlap anymore. This patch also exposes another bug, which is fixed
here: https://reviews.llvm.org/D52985
Differential Revision: https://reviews.llvm.org/D52987
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_70@347023 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
------------------------------------------------------------------------
r342946 | smaksimovic | 2018-09-24 23:27:49 -0700 (Mon, 24 Sep 2018) | 6 lines
[mips] Correct MUL pattern for mips64
Guard existing pattern with a predicate, introduce a new one for revision 6.
Differential Revision: https://reviews.llvm.org/D51684
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_70@346742 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
------------------------------------------------------------------------
r342884 | petarj | 2018-09-24 07:14:19 -0700 (Mon, 24 Sep 2018) | 12 lines
[Mips][FastISel] Fix selectBranch on icmp i1
The r337288 tried to fix result of icmp i1 when its input is not sanitized
by falling back to DagISel. While it now produces the correct result for
bit 0, the other bits can still hold arbitrary value which is not supported
by MipsFastISel branch lowering. This patch fixes the issue by falling back
to DagISel in this case.
Patch by Dragan Mladjenovic.
Differential Revision: https://reviews.llvm.org/D52045
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_70@346741 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
------------------------------------------------------------------------
r341919 | atanasyan | 2018-09-11 02:57:25 -0700 (Tue, 11 Sep 2018) | 18 lines
[mips] Add a pattern for 64-bit GPR variant of the `rdhwr` instruction
MIPS ISAs start to support third operand for the `rdhwr` instruction
starting from Revision 6. But LLVM generates assembler code with
three-operands version of this instruction on any MIPS64 ISA. The third
operand is always zero, so in case of direct code generation we get
correct code.
This patch fixes the bug by adding an instruction alias. The same alias
already exists for 32-bit ISA.
Ideally, we also need to reject three-operands version of the `rdhwr`
instruction in an assembler code if ISA revision is less than 6. That is
a task for a separate patch.
This fixes PR38861 (https://bugs.llvm.org/show_bug.cgi?id=38861)
Differential revision: https://reviews.llvm.org/D51773
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_70@346739 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
------------------------------------------------------------------------
r341221 | atanasyan | 2018-08-31 08:57:17 -0700 (Fri, 31 Aug 2018) | 12 lines
[mips] Fix `mtc1` and `mfc1` definitions for microMIPS R6
The `mtc1` and `mfc1` definitions in the MipsInstrFPU.td have MMRel,
but do not have StdMMR6Rel tags. When these instructions are emitted
for microMIPS R6 targets, `Mips::MipsR62MicroMipsR6` nor
`Mips::Std2MicroMipsR6` cannot find correct op-codes and as a result the
backend uses mips32 variant of the instructions encoding.
The patch fixes this problem by adding the StdMMR6Rel tag and check
instructions encoding in the test case.
Differential revision: https://reviews.llvm.org/D51482
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_70@346737 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
------------------------------------------------------------------------
r340932 | atanasyan | 2018-08-29 07:54:01 -0700 (Wed, 29 Aug 2018) | 11 lines
[mips] Fix microMIPS unconditional branch offset handling
MipsSEInstrInfo class defines for internal purpose unconditional
branches as Mips::B nad Mips:J even in case of microMIPS code
generation. Under some conditions that leads to the bug - for rather long
branch which fits to Mips jump instruction offset size, but does not fit
to microMIPS jump offset size, we generate 'short' branch and later show
an error 'out of range PC16 fixup' after check in the isBranchOffsetInRange
routine.
Differential revision: https://reviews.llvm.org/D50615
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_70@346736 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
------------------------------------------------------------------------
r340931 | atanasyan | 2018-08-29 07:53:55 -0700 (Wed, 29 Aug 2018) | 6 lines
[mips] Involves microMIPS's jump in the analyzable branch set
Involves microMIPS's jump in the analyzable branch set to reduce some
code patterns.
Differential revision: https://reviews.llvm.org/D50613
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_70@346735 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
------------------------------------------------------------------------
r340927 | vstefanovic | 2018-08-29 07:07:14 -0700 (Wed, 29 Aug 2018) | 14 lines
[mips] Prevent shrink-wrap for BuildPairF64, ExtractElementF64 when they use $sp
For a certain combination of options, BuildPairF64_{64}, ExtractElementF64{_64}
may be expanded into instructions using stack.
Add implicit operand $sp for such cases so that ShrinkWrapping doesn't move
prologue setup below them.
Fixes MultiSource/Benchmarks/MallocBench/cfrac for
'--target=mips-img-linux-gnu -mcpu=mips32r6 -mfpxx -mnan=2008'
and
'--target=mips-img-linux-gnu -mcpu=mips32r6 -mfp64 -mnan=2008 -mno-odd-spreg'.
Differential Revision: https://reviews.llvm.org/D50986
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_70@346734 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
------------------------------------------------------------------------
r343373 | rksimon | 2018-09-29 06:25:22 -0700 (Sat, 29 Sep 2018) | 3 lines
[X86][SSE] Fixed issue with v2i64 variable shifts on 32-bit targets
The shift amount might have peeked through a extract_subvector, altering the number of vector elements in the 'Amt' variable - so we were incorrectly calculating the ratio when peeking through bitcasts, resulting in incorrectly detecting splats.
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_70@344810 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
------------------------------------------------------------------------
r343428 | ctopper | 2018-09-30 16:43:30 -0700 (Sun, 30 Sep 2018) | 3 lines
[X86] Change an llvm_unreachable to a report_fatal_error so the optimizer will stop making us reach the other report_fatal_error in this function.
There's a conditional report_fatal_error just above this llvm_unreachable. The optimizer when seeing the unreachable removes the conditional and just makes any other error trigger the existing report_fatal_error.
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_70@344805 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
------------------------------------------------------------------------
r343443 | ctopper | 2018-10-01 00:08:41 -0700 (Mon, 01 Oct 2018) | 9 lines
[X86] Stop X86DomainReassignment from creating copies between GR8/GR16 physical registers and k-registers.
We can only copy between a k-register and a GR32/GR64 register.
This patch detects that the copy will be illegal and prevents the domain reassignment from happening for that closure.
This probably isn't the best fix, and we should probably figure out how to handle this correctly.
Fixes PR38803.
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_70@344804 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
------------------------------------------------------------------------
r341642 | tnorthover | 2018-09-07 11:21:25 +0200 (Fri, 07 Sep 2018) | 8 lines
ARM: fix Thumb2 CodeGen for ldrex with folded frame-index.
Because t2LDREX (& t2STREX) were marked as AddrModeNone, but did allow a
FrameIndex operand, rewriteT2FrameIndex asserted. This gives them a
proper addressing-mode and tells the rewriter about it so that encodable
offsets are exploited and others are rejected.
Should fix PR38828.
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_70@341783 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
------------------------------------------------------------------------
r341512 | ctopper | 2018-09-06 04:03:14 +0200 (Thu, 06 Sep 2018) | 7 lines
[X86][Assembler] Allow %eip as a register in 32-bit mode for .cfi directives.
This basically reverts a change made in r336217, but improves the text of the error message for not allowing IP-relative addressing in 32-bit mode.
Fixes PR38826.
Patch by Iain Sandoe.
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_70@341530 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
------------------------------------------------------------------------
r340959 | mareko | 2018-08-29 22:03:00 +0200 (Wed, 29 Aug 2018) | 9 lines
AMDGPU: Handle 32-bit address wraparounds for SMRD opcodes
Summary: This fixes GPU hangs with OpenGL bindless handle arithmetic.
Reviewers: arsenm, nhaehnle
Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits
Differential Revision: https://reviews.llvm.org/D51203
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_70@341351 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
------------------------------------------------------------------------
r340417 | hakzsam | 2018-08-22 18:08:48 +0200 (Wed, 22 Aug 2018) | 14 lines
AMDGPU: bump AS.MAX_COMMON_ADDRESS to 6 since 32-bit addr space
32-bit constant address space is declared as 6, so the
maximum number of address spaces is 6, not 5.
Fixes "LLVM ERROR: Pointer address space out of range".
v5: rename MAX_COMMON_ADDRESS to MAX_AMDGPU_ADDRESS
v4: - fix compilation issues
- fix out of bounds access
v3: use static_assert()
v2: add a very simple test for 32-bit addr space
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106630
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_70@341041 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
------------------------------------------------------------------------
r340416 | hakzsam | 2018-08-22 18:08:43 +0200 (Wed, 22 Aug 2018) | 8 lines
AMDGPU: fix existing alias rules for constant and global
Constant and global may alias, also one rules table wasn't
ordered correctly.
Pinpointed by Matt.
v2: add a test with swapped parameters
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_70@341040 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
------------------------------------------------------------------------
r340455 | yhs | 2018-08-22 23:21:03 +0200 (Wed, 22 Aug 2018) | 38 lines
bpf: fix an assertion in BPFAsmBackend applyFixup()
Fix bug https://bugs.llvm.org/show_bug.cgi?id=38643
In BPFAsmBackend applyFixup(), there is an assertion for FixedValue to be 0.
This may not be true, esp. for optimiation level 0.
For example, in the above bug, for the following two
static variables:
@bpf_map_lookup_elem = internal global i8* (i8*, i8*)*
inttoptr (i64 1 to i8* (i8*, i8*)*), align 8
@bpf_map_update_elem = internal global i32 (i8*, i8*, i8*, i64)*
inttoptr (i64 2 to i32 (i8*, i8*, i8*, i64)*), align 8
The static variable @bpf_map_update_elem will have a symbol
offset of 8 and a FK_SecRel_8 with FixupValue 8 will cause
the assertion if llvm is built with -DLLVM_ENABLE_ASSERTIONS=ON.
The above relocations will not exist if the program is compiled
with optimization level -O1 and above as the compiler optimizes
those static variables away. In the below error message, -O2
is suggested as this is the common practice.
Note that FixedValue = 0 in applyFixup() does exist and is valid,
e.g., for the global variable my_map in the above bug. The bpf
loader will process them properly for map_id's before loading
the program into the kernel.
The static variables, which are not optimized away by compiler,
may have FK_SecRel_8 relocation with non-zero FixedValue.
The patch removed the offending assertion and will issue
a hard error as below if the FixedValue in applyFixup()
is not 0.
$ llc -march=bpf -filetype=obj fixup.ll
LLVM ERROR: Unsupported relocation: try to compile with -O2 or above,
or check your static variable usage
Signed-off-by: Yonghong Song <yhs@fb.com>
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_70@341038 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
------------------------------------------------------------------------
r340158 | s.desmalen | 2018-08-20 11:16:59 +0200 (Mon, 20 Aug 2018) | 16 lines
[AArch64][SVE] Asm: Add SVE System registers
This patch adds system registers for controlling aspects of SVE:
- ZCR_EL1 (r/w) visible at EL1 and EL0.
- ZCR_EL2 (r/w) visible at EL2 and Non-secure EL1 and EL0.
- ZCR_EL3 (r/w) visible at all exception levels.
and a system register identifying SVE:
- ID_AA64ZFR0_EL1 (r) SVE Feature identifier.
Reviewers: SjoerdMeijer, samparker, pbarrio, fhahn, javed.absar
Reviewed By: SjoerdMeijer
Differential Revision: https://reviews.llvm.org/D50885
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_70@340355 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
------------------------------------------------------------------------
r339895 | niravd | 2018-08-16 18:31:14 +0200 (Thu, 16 Aug 2018) | 13 lines
[MC][X86] Enhance X86 Register expression handling to more closely match GCC.
Allow the comparison of x86 registers in the evaluation of assembler
directives. This generalizes and simplifies the extension from r334022
to catch another case found in the Linux kernel.
Reviewers: rnk, void
Reviewed By: rnk
Subscribers: hiraditya, nickdesaulniers, llvm-commits
Differential Revision: https://reviews.llvm.org/D50795
------------------------------------------------------------------------
------------------------------------------------------------------------
r339896 | d0k | 2018-08-16 18:50:23 +0200 (Thu, 16 Aug 2018) | 1 line
[MC] Remove unused variable
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_70@340329 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
------------------------------------------------------------------------
r339945 | ctopper | 2018-08-16 23:54:02 +0200 (Thu, 16 Aug 2018) | 9 lines
[X86] In EFLAGS copy pass, don't emit EXTRACT_SUBREG instructions since we're after peephole
Normally the peephole pass converts EXTRACT_SUBREG to COPY instructions. But we're after peephole so we can't rely on it to clean these up.
To fix this, the eflags pass now emits a COPY with a subreg input.
I also noticed that in 32-bit mode we need to constrain the input to the copy to ensure the subreg is valid. Otherwise we'll fail verify-machineinstrs
Differential Revision: https://reviews.llvm.org/D50656
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_70@339999 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
------------------------------------------------------------------------
r339769 | nemanjai | 2018-08-15 14:58:13 +0200 (Wed, 15 Aug 2018) | 12 lines
[PowerPC] Don't run BV DAG Combine before legalization if it assumes legal types
When trying to combine a DAG that builds a vector out of sign-extensions of
vector extracts, the code assumes legal input types. Due to that, we have to
disable this combine prior to legalization.
In some cases, the DAG will look slightly different after legalization so
account for that in the matching code.
This is a fix for https://bugs.llvm.org/show_bug.cgi?id=38087
Differential Revision: https://reviews.llvm.org/D49080
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_70@339859 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
------------------------------------------------------------------------
r339316 | hahnfeld | 2018-08-09 09:45:49 +0200 (Thu, 09 Aug 2018) | 16 lines
[NVPTX] Select atomic loads and stores
According to PTX ISA .volatile has the same memory synchronization
semantics as .relaxed.sys, so it can be used to implement monotonic
atomic loads and stores. This is important for OpenMP's atomic
construct where
- 'read's and 'write's are lowered to atomic loads and stores, and
- an update of float or double types are lowered into a cmpxchg loop.
(Note that PTX could do better because it has atom.add.f{32,64} but
LLVM's atomicrmw instruction only allows integer types.)
Higher levels of atomicity (like acquire and release) need additional
synchronization properties which were added with PTX ISA 6.0 / sm_70.
So using these instructions still results in an error.
Differential Revision: https://reviews.llvm.org/D50391
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_70@339338 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
------------------------------------------------------------------------
r339190 | jvesely | 2018-08-07 23:54:37 +0200 (Tue, 07 Aug 2018) | 12 lines
AMDGPU: Remove broken i16 ternary patterns
Fixup test to check for GCN prefix
These patterns always zero extend the result even though it might need sign extension.
This has been broken since the addition of i16 support.
It has popped up in mad_sat(char) test since min(max()) combination is turned into v_med3, resulting in the following (incorrect) sequence:
v_mad_i16 v2, v10, v9, v11
v_med3_i32 v2, v2, v8, v7
Fixes mad_sat(char) piglit on VI.
Differential Revision: https://reviews.llvm.org/D49836
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_70@339235 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
------------------------------------------------------------------------
r338610 | jvesely | 2018-08-01 20:36:07 +0200 (Wed, 01 Aug 2018) | 3 lines
AMDGPU/R600: Convert kernel param loads to use PARAM_I_ADDRESS
Non ext aligned i32 loads are still optimized to use CONSTANT_BUFFER (AS 8)
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_70@339105 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
------------------------------------------------------------------------
r338569 | jvesely | 2018-08-01 17:04:36 +0200 (Wed, 01 Aug 2018) | 5 lines
AMDGPU: Allow fp32-denormals feature for r600 targets
This was accidentally removed in r335942.
Differential Revision: https://reviews.llvm.org/D49934
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_70@339103 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
------------------------------------------------------------------------
r338599 | vlad.tsyrklevich | 2018-08-01 19:44:37 +0200 (Wed, 01 Aug 2018) | 16 lines
[X86] FastISel fall back on !absolute_symbol GVs
Summary:
D25878, which added support for !absolute_symbol for normal X86 ISel,
did not add support for materializing references to absolute symbols for
X86 FastISel. This causes build failures because FastISel generates
PC-relative relocations for absolute symbols. Fall back to normal ISel
for references to !absolute_symbol GVs. Fix for PR38200.
Reviewers: pcc, craig.topper
Reviewed By: pcc
Subscribers: hiraditya, llvm-commits, kcc
Differential Revision: https://reviews.llvm.org/D50116
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_70@338847 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
------------------------------------------------------------------------
r338554 | bryanpkc | 2018-08-01 15:50:29 +0200 (Wed, 01 Aug 2018) | 11 lines
[AArch64] Fix FCCMP with FP16 operands
Summary: This patch adds support for FCCMP instruction with FP16 operands, avoiding an assertion during instruction selection.
Reviewers: olista01, SjoerdMeijer, t.p.northover, javed.absar
Reviewed By: SjoerdMeijer
Subscribers: kristof.beyls, llvm-commits
Differential Revision: https://reviews.llvm.org/D50115
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_70@338692 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
------------------------------------------------------------------------
r338658 | nemanjai | 2018-08-02 02:03:22 +0200 (Thu, 02 Aug 2018) | 13 lines
[PowerPC] Do not round values prior to converting to integer
Adding the FP_ROUND nodes when combining FP_TO_[SU]INT of elements
feeding a BUILD_VECTOR into an FP_TO_[SU]INT of the built vector
loses precision. This patch removes the code that adds these nodes
to true f64 operands. It also adds patterns required to ensure
the code is still vectorized rather than converting individual
elements and inserting into a vector.
Fixes https://bugs.llvm.org/show_bug.cgi?id=38342
Differential Revision: https://reviews.llvm.org/D50121
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_70@338678 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338530 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
Summary:
Add _L to _LZ image intrinsic table mapping to table gen.
In ISelLowering check if image intrinsic has lod and if it's equal
to zero, if so remove lod and change opcode to equivalent mapped _LZ.
Change-Id: Ie24cd7e788e2195d846c7bd256151178cbb9ec71
Subscribers: arsenm, mehdi_amini, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, steven_wu, dexonsmith, llvm-commits
Differential Revision: https://reviews.llvm.org/D49483
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338523 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
The DAG combiner logic to simplify AND masks in shift counts is invalid.
While it is true that the SystemZ shift instructions ignore all but the
low 6 bits of the shift count, it is still invalid to simplify the AND
masks while the DAG still uses the standard shift operators (which are
*not* defined to match the SystemZ instruction behavior).
Instead, this patch performs equivalent operations during instruction
selection. For completely removing the AND, this now happens via
additional DAG match patterns implemented by a multi-alternative
PatFrags. For simplifying a 32-bit AND to a 16-bit AND, the existing DAG
patterns were already mostly OK, they just needed an output XForm to
actually truncate the immediate value.
Unfortunately, the latter change also exposed a bug in TableGen: it
seems XForms are currently only handled correctly for direct operands of
the outermost operation node. This patch also fixes that bug by simply
recurring through the whole pattern. This should be NFC for all other
targets.
Differential Revision: https://reviews.llvm.org/D50096
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338521 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338516 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
Differential Revision: https://reviews.llvm.org/D49243
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338507 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
Select G_GLOBAL_VALUE for position dependent code.
Patch by Petar Avramovic.
Differential Revision: https://reviews.llvm.org/D49803
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338499 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338496 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
Summary:
This patch improves Inliner to provide causes/reasons for negative inline decisions.
1. It adds one new message field to InlineCost to report causes for Always and Never instances. All Never and Always instantiations must provide a simple message.
2. Several functions that used to return the inlining results as boolean are changed to return InlineResult which carries the cause for negative decision.
3. Changed remark priniting and debug output messages to provide the additional messages and related inline cost.
4. Adjusted tests for changed printing.
Patch by: yrouban (Yevgeny Rouban)
Reviewers: craig.topper, sammccall, sgraenitz, NutshellySima, shchenz, chandlerc, apilipenko, javed.absar, tejohnson, dblaikie, sanjoy, eraman, xbolva00
Reviewed By: tejohnson, xbolva00
Subscribers: xbolva00, llvm-commits, arsenm, mehdi_amini, eraman, haicheng, steven_wu, dexonsmith
Differential Revision: https://reviews.llvm.org/D49412
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338494 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
Also add a test for it being unsupported for linux.
Differential Revision: https://reviews.llvm.org/D49929
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338493 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
(CTTZ X), -1, (X != 0)), C), make sure we really have a compare with 0.
It's not strictly required by the transform of the cmov and the add, but it makes sure we restrict it to the cases we know we want to match.
While there canonicalize the operand order of the cmov to simplify the matching and emitting code.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338492 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
EFLAGS copy lowering.
If you have a branch of LLVM, you may want to cherrypick this. It is
extremely unlikely to hit this case empirically, but it will likely
manifest as an "impossible" branch being taken somewhere, and will be
... very hard to debug.
Hitting this requires complex conditions living across complex control
flow combined with some interesting memory (non-stack) initialized with
the results of a comparison. Also, because you have to arrange for an
EFLAGS copy to be in *just* the right place, almost anything you do to
the code will hide the bug. I was unable to reduce anything remotely
resembling a "good" test case from the place where I hit it, and so
instead I have constructed synthetic MIR testing that directly exercises
the bug in question (as well as the good behavior for completeness).
The issue is that we would mistakenly assume any SETcc with a valid
condition and an initial operand that was a register and a virtual
register at that to be a register *defining* SETcc...
It isn't though....
This would in turn cause us to test some other bizarre register,
typically the base pointer of some memory. Now, testing this register
and using that to branch on doesn't make any sense. It even fails the
machine verifier (if you are running it) due to the wrong register
class. But it will make it through LLVM, assemble, and it *looks*
fine... But wow do you get a very unsual and surprising branch taken in
your actual code.
The fix is to actually check what kind of SETcc instruction we're
dealing with. Because there are a bunch of them, I just test the
may-store bit in the instruction. I've also added an assert for sanity
that ensure we are, in fact, *defining* the register operand. =D
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338481 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
Differential Revision: https://reviews.llvm.org/D49874
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338470 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
Disable ARMCodeGenPrepare by default again. It is causing verifier
failues in V8 that look like:
Duplicate integer as switch case
switch i32 %trunc, label %if.end13 [
i32 0, label %cleanup36
i32 0, label %if.then8
], !dbg !4981
i32 0
fatal error: error in backend: Broken function found, compilation aborted!
I will continue reducing the test case and send it along.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338452 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
Use '&&' before the string instead of '||'
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338429 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338421 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
This improves code for the same reasons as scalarizing 32-bit
element vectors.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338418 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
When lowering calling conventions, prefer to decompose vectors
into the constitute register types. This avoids artifical constraints
to satisfy a wide super-register.
This improves code quality because now optimizations don't need to
deal with the super-register constraint. For example the immediate
folding code doesn't deal with 4 component reg_sequences, so by
breaking the register down earlier the existing immediate folding
code is able to work.
This also avoids the need for the shader input processing code
to manually split vector types.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338416 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
Don't declare them as X86SchedWritePair when the folded class will never be used.
Note: MOVBE (load/store endian conversion) instructions tend to have a very different behaviour to BSWAP.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338412 91177308-0d34-0410-b5e6-96231b3b80d8
|
|
As was done for vector rotations, we can efficiently use ISD::MULHU for vXi8/vXi16 ISD::SRL lowering.
Shift-by-zero cases are still problematic (mainly on v32i8 due to extra AND/ANDN/OR or VPBLENDVB blend masks but v8i16/v16i16 aren't great either if PBLENDW fails) so I've limited this first patch to known non-zero cases if we can't easily use PBLENDW.
Differential Revision: https://reviews.llvm.org/D49562
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338407 91177308-0d34-0410-b5e6-96231b3b80d8
|