Age | Commit message (Collapse) | Author |
|
This pass allows pointer non-aliasing to be declared at loop level
within a given function.
If such non-aliasing is declared, the loop is outlined into
its own function. The arguments of the new function are all
variables used in the loop, with pointers declared as
restrict-qualified (non-aliasing) types.
This allows the loop to be optimized more aggressively.
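As an illustration only, the effect of the outlining can be sketched by
hand in C. The function names below are hypothetical and do not come from
the patch; the sketch assumes a simple loop over two float arrays and
shows the loop body moved into a helper whose pointer parameters are
restrict-qualified:

```c
#include <stddef.h>

/* Original form: the compiler must assume dst and src may overlap. */
void scale_add(float *dst, const float *src, float k, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] += k * src[i];
}

/* Hypothetical outlined loop: every variable used by the loop becomes
 * a parameter, and the pointers are declared restrict (non-aliasing). */
void scale_add_loop(float *restrict dst, const float *restrict src,
                    float k, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] += k * src[i];
}

/* The original function now only forwards to the outlined loop.
 * This is valid only if the caller guarantees dst and src do not alias. */
void scale_add_outlined(float *dst, const float *src, float k, size_t n)
{
    scale_add_loop(dst, src, k, n);
}
```

With restrict in place, the compiler no longer has to assume that a store
to dst[i] may change a later src[i], which is what permits more
aggressive optimization of the loop.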
Signed-off-by: Christoph Muellner <christoph.muellner@theobroma-systems.com>
|
|
Signed-off-by: Christoph Muellner <christoph.muellner@theobroma-systems.com>
|
|
This patch introduces the flag -fnoalias to bypass alias-checks
in the SSA alias analyser.
Signed-off-by: Christoph Muellner <christoph.muellner@theobroma-systems.com>
|
|
This flag triggers more aggressive loop unrolling in tree-predcom.
Signed-off-by: Christoph Muellner <christoph.muellner@theobroma-systems.com>
|
|
If early optimisation passes move or copy SSA names (and the
corresponding definition statements) into other functions, we might end
up with an SSA name without a definition statement. However,
get_default_value() is not prepared for such a case and crashes
(dereferencing a NULL pointer).
This patch addresses the issue by treating a missing definition
statement in get_default_value() as a variable that is used before
being initialized.
Signed-off-by: Christoph Muellner <christoph.muellner@theobroma-systems.com>
|
|
Signed-off-by: Christoph Muellner <christoph.muellner@theobroma-systems.com>
|
|
Assuming a common subexpression, GCC will always keep it separate
instead of fusing it into two instructions (even when this would have
the lowest overall cost).
Assume the following code example (a multiply that could be synthesized
as (b + (b << 2)) << 2 or 4 * (b + 4b)):
signed long mult_20 ( signed int b )
{
return (signed long)b*20;
}
Due to a common subexpression (i.e. the sign-extended argument b) this
will result in:
mult_20:
sxtw x0, w0
add x0, x0, x0, lsl 2
lsl x0, x0, 2
ret
At 4 cycles latency, this is in stark contrast with the optimal solution
(which takes just 2 cycles and fuses the sign-extension into two
dependent instructions):
mult_20:
sbfiz x1, x0, 4, 32
add x0, x1, x0, sxtw 2
ret
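Both sequences compute 20 * b. A minimal C sketch that mirrors each
sequence step by step is shown here; the function names are hypothetical,
and multiplications stand in for the shifts to avoid left-shifting
negative values in C:

```c
#include <stdint.h>

/* Mirrors the first sequence: the sign extension is a separate step. */
int64_t mult_20_shared_extend(int32_t b)
{
    int64_t x = (int64_t)b;   /* sxtw x0, w0          */
    x = x + x * 4;            /* add  x0, x0, x0, lsl 2 */
    return x * 4;             /* lsl  x0, x0, 2       */
}

/* Mirrors the second sequence: the extension is fused into both steps. */
int64_t mult_20_fused_extend(int32_t b)
{
    int64_t x1 = (int64_t)b * 16;    /* sbfiz x1, x0, 4, 32     */
    return x1 + (int64_t)b * 4;      /* add   x0, x1, x0, sxtw 2 */
}
```

The first variant is a chain of three dependent operations, while the
second needs only two, which is the difference the latency figures above
refer to.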
To resolve this, a separate un-combine pass has been proposed
(originally by Philipp Tomsich with additional improvements by Chris Nelson).
The algorithm for this pass is:
* for RTX expression A, identify all dependent RTX expressions B[0..n]
* for each combination A -> B[i], check whether the combined RTX
  expression is a valid instruction with the same cost as B[i], and
  perform the following RTX changes:
  * replace B[i] with a new RTX expression B'[i], which is the
    combination A -> B[i]
  * link each B'[i] to depend on the dependencies of A
  * replace B[i] in the dependency list of its dependent operations
    with a dependency on B'[i] (i.e. unlink B[i])
  * remove B[i] (as this is dead code now), or leave it dangling for
    the DCE pass to clean up behind us
* if all combinations are valid, remove A (as this is dead code now),
  or leave it dangling for the DCE pass to clean up behind us
Signed-off-by: Christoph Muellner <christoph.muellner@theobroma-systems.com>
|
|
This pass can be activated with -flist-find-pipeline.
This patch optimizes a common linked list idiom (example from
Coremark's 'core_list_find') of the following form:
while (list && (list->info->idx != info->idx))
list=list->next;
return list;
This idiom introduces a number of dependent-loads across the code path.
However, the dereference of list and the assignment of
list_{i+1} = list_{i}->next (i ... iteration)
only depends on the first condition (i.e. “list != NULL”) and can be
moved earlier.
The [list-find pass] is an experimental pass (to be generalised in a
next step) that provides a targeted implementation of a software pipeliner
for loops iterating over linked lists. It hoists the list = list->next
dereference (for the next iteration) above the comparison of the index field.
In SSA form, the loop should thus become (all conditions need to be
inverted):
    if (!list_{i})
        return NULL; // forward-propagate from the
                     // if-condition
    list_{i+1} = list_{i}->next;
    if (list_{i}->info->idx == info->idx)
        return list_{i};
which should be unrolled at least once to allow using two distinct registers
for list_{i} and list_{i+1} and to avoid any additional register moves.
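The transformation can be sketched at source level as follows. This is a
hand-written illustration, not output of the pass; the struct layouts and
function names are assumptions made for the example:

```c
#include <stddef.h>

/* Hypothetical list layout, loosely modelled on the idiom above. */
struct list_info { int idx; };
struct list_node { struct list_node *next; struct list_info *info; };

/* The original idiom: the next-pointer load waits for the comparison. */
struct list_node *list_find_plain(struct list_node *list,
                                  const struct list_info *info)
{
    while (list && (list->info->idx != info->idx))
        list = list->next;
    return list;
}

/* Pipelined form: the load of list->next only depends on list != NULL,
 * so it is issued before the index comparison of the current node. */
struct list_node *list_find_pipelined(struct list_node *list,
                                      const struct list_info *info)
{
    while (list) {
        struct list_node *next = list->next;  /* hoisted load */
        if (list->info->idx == info->idx)
            return list;
        list = next;
    }
    return NULL;
}
```

Both functions return the same node for any well-formed list; the second
merely starts the load for the next iteration earlier.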
Signed-off-by: Christoph Muellner <christoph.muellner@theobroma-systems.com>
|
|
Signed-off-by: Christoph Muellner <christoph.muellner@theobroma-systems.com>
|
|
post_modify also has a cost of 1.
Signed-off-by: Christoph Muellner <christoph.muellner@theobroma-systems.com>
|
|
Signed-off-by: Christoph Muellner <christoph.muellner@theobroma-systems.com>
|
|
There were a couple of off-by-one costs.
Signed-off-by: Christoph Muellner <christoph.muellner@theobroma-systems.com>
|
|
Bypass the table-based cost model with a procedural one to more closely
match the eMAG/Xgene microarchitecture.
Signed-off-by: Christoph Muellner <christoph.muellner@theobroma-systems.com>
|
|
This patch adds support for Ampere Computing's eMAG processor.
Tested on aarch64 (no regressions seen).
Signed-off-by: Christoph Muellner <christoph.muellner@theobroma-systems.com>
|
|
The aarch64 ISA specification allows a left shift amount to be applied
after extension in the range of 0 to 4 (encoded in the imm3 field).
This is true for at least the following instructions:
* ADD (extended register)
* ADDS (extended register)
* SUB (extended register)
The result of this patch can be seen when compiling the following code:
uint64_t myadd(uint64_t a, uint64_t b)
{
return a+(((uint8_t)b)<<4);
}
Without the patch the following sequence will be generated:
0000000000000000 <myadd>:
0: d37c1c21 ubfiz x1, x1, #4, #8
4: 8b000020 add x0, x1, x0
8: d65f03c0 ret
With the patch the ubfiz will be merged into the add instruction:
0000000000000000 <myadd>:
0: 8b211000 add x0, x0, w1, uxtb #4
4: d65f03c0 ret
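For reference, the extend-then-shift semantics can be written out in C.
The helper names below are hypothetical; the sketch assumes the shift
amount stays within the 0 to 4 range mentioned above and covers only two
of the zero-extending forms (UXTB and UXTH):

```c
#include <stdint.h>

/* a + (zero-extended low byte of b, shifted left by 0..4) */
uint64_t add_uxtb(uint64_t a, uint64_t b, unsigned shift)
{
    return a + ((uint64_t)(uint8_t)b << shift);
}

/* a + (zero-extended low halfword of b, shifted left by 0..4) */
uint64_t add_uxth(uint64_t a, uint64_t b, unsigned shift)
{
    return a + ((uint64_t)(uint16_t)b << shift);
}
```

With shift set to 4, add_uxtb computes the same value as myadd above.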
Signed-off-by: Christoph Muellner <christoph.muellner@theobroma-systems.com>
|
|
This reverts commit 20496f0e660347d8d2d885dff2702182853602bd.
The commit is reverted because it causes a significant
performance slowdown on some benchmarks.
After reverting the commit, the testcase (pr82697.c) still passes.
Signed-off-by: Christoph Muellner <christoph.muellner@theobroma-systems.com>
|
|
The compiler option -mindirect-branch=<value> converts indirect
branch-and-link-register and branch-register instructions according to <value>.
The default is ``keep``, which keeps indirect branch-and-link-register and
branch-register instructions unmodified.
``thunk`` converts indirect branch-and-link-register/branch-register
instructions to a branch-and-link/branch to a function containing a retpoline
(to stop speculative execution) followed by a branch-register to the target.
``thunk-inline`` is similar to ``thunk``, but inlines the retpoline
before the branch-and-link-register/branch-register instruction.
``thunk-extern`` is also similar to ``thunk``, but does not insert the
functions containing the retpoline. When using this option, these functions
need to be provided in a separate object file. The retpoline functions exist
for each register and are named ``__aarch64_indirect_thunk_xN`` (N being the
register number).
It is also possible to override the indirect-branch setting for
individual functions using the function attribute ``indirect_branch``.
The actual retpoline instruction sequence, which prevents speculative
indirect branches, looks like this::
str x30, [sp, #-16]!
bl 101f
100: //speculation trap
wfe
b 100b
101: //do ROP
adr x30, 102f
ret
102: //non-spec code
ldr x30, [sp], #16
This patch has been tested with the included testcases and various other
source bases (benchmarks, retpoline-patched arm64 kernel, etc.).
Signed-off-by: Christoph Muellner <christoph.muellner@theobroma-systems.com>
|
|
git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gcc-8-branch@262992 138bc75d-0d04-0410-961f-82ee72b054a4
|
|
git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gcc-8-branch@262987 138bc75d-0d04-0410-961f-82ee72b054a4
|
|
git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gcc-8-branch@262983 138bc75d-0d04-0410-961f-82ee72b054a4
|
|
git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gcc-8-branch@262968 138bc75d-0d04-0410-961f-82ee72b054a4
|
|
git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gcc-8-branch@262956 138bc75d-0d04-0410-961f-82ee72b054a4
|
|
git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gcc-8-branch@262939 138bc75d-0d04-0410-961f-82ee72b054a4
|
|
git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gcc-8-branch@262926 138bc75d-0d04-0410-961f-82ee72b054a4
|
|
git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gcc-8-branch@262920 138bc75d-0d04-0410-961f-82ee72b054a4
|
|
git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gcc-8-branch@262916 138bc75d-0d04-0410-961f-82ee72b054a4
|
|
git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gcc-8-branch@262896 138bc75d-0d04-0410-961f-82ee72b054a4
|
|
* simple-object-elf.c (ENOTSUP): If not defined by errno.h, redirect
to ENOSYS.
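A fallback of this kind is typically written as a preprocessor guard. The
following is a minimal sketch of the pattern, not the actual contents of
simple-object-elf.c; the helper function is hypothetical and exists only
to show the macro in use:

```c
#include <errno.h>

/* If errno.h does not provide ENOTSUP, fall back to ENOSYS. */
#ifndef ENOTSUP
#define ENOTSUP ENOSYS
#endif

/* Hypothetical helper: reports the error code for an unsupported operation. */
int unsupported_error(void)
{
    return ENOTSUP;
}
```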
git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gcc-8-branch@262873 138bc75d-0d04-0410-961f-82ee72b054a4
|
|
git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gcc-8-branch@262869 138bc75d-0d04-0410-961f-82ee72b054a4
|
|
2018-07-18 Carl Love <cel@us.ibm.com>
Backport from mainline
2018-07-16 Carl Love <cel@us.ibm.com>
PR target/86414
* gcc.target/powerpc/divkc3-2.c: Add dg-require-effective-target
longdouble128.
* gcc.target/powerpc/divkc3-3.c: Ditto.
* gcc.target/powerpc/mulkc3-2.c: Ditto.
* gcc.target/powerpc/mulkc3-3.c: Ditto.
* gcc.target/powerpc/fold-vec-mergehl-double.c: Update counts.
* gcc.target/powerpc/pr85456.c: Make check Linux and AIX specific.
git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gcc-8-branch@262865 138bc75d-0d04-0410-961f-82ee72b054a4
|
|
PR middle-end/85602 - -Wsizeof-pointer-memaccess for strncat with size of source
gcc/c-family/ChangeLog:
PR middle-end/85602
* c-warn.c (sizeof_pointer_memaccess_warning): Check for attribute
nonstring.
gcc/ChangeLog:
PR middle-end/85602
* calls.c (maybe_warn_nonstring_arg): Handle strncat.
* tree-ssa-strlen.c (is_strlen_related_p): Make extern.
Handle integer subtraction.
(maybe_diag_stxncpy_trunc): Handle nonstring source arguments.
* tree-ssa-strlen.h (is_strlen_related_p): Declare.
* doc/invoke.texi (-Wstringop-truncation): Update.
gcc/testsuite/ChangeLog:
PR middle-end/85602
* gcc.dg/attr-nonstring-2.c: Adjust text of expected warning.
* c-c++-common/attr-nonstring-8.c: New test.
git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gcc-8-branch@262859 138bc75d-0d04-0410-961f-82ee72b054a4
|
|
* pt.c (find_parameter_packs_r) [IF_STMT]: Don't walk into
IF_STMT_EXTRA_ARGS.
git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gcc-8-branch@262858 138bc75d-0d04-0410-961f-82ee72b054a4
|
|
These tests fail when run with -D_GLIBCXX_USE_CXX11_ABI=0
Backport from mainline
2018-07-05 Jonathan Wakely <jwakely@redhat.com>
* testsuite/21_strings/basic_string/cons/char/deduction.cc: XFAIL for
COW strings.
* testsuite/21_strings/basic_string/cons/wchar_t/deduction.cc:
Likewise.
git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gcc-8-branch@262857 138bc75d-0d04-0410-961f-82ee72b054a4
|
|
The additions to <experimental/random> were added in 2015 but the new
algorithms in <experimental/algorithm> were not. This adds them.
Also define "random_device" effective target to fix testsuite failures
on bare metal targets without std::random_device. The effective target
currently only matches targets where _GLIBCXX_USE_RANDOM_TR1 is defined,
which means /dev/random and /dev/urandom are usable.
Backport from mainline
2018-07-04 Jonathan Wakely <jwakely@redhat.com>
* testsuite/25_algorithms/make_heap/complexity.cc: Require effective
target for std::random_device.
* testsuite/26_numerics/random/random_device/cons/default.cc:
Likewise.
* testsuite/experimental/algorithm/sample-2.cc: Likewise.
* testsuite/experimental/algorithm/shuffle.cc: Likewise.
* testsuite/experimental/random/randint.cc: Likewise.
* testsuite/lib/libstdc++.exp
(check_effective_target_random_device): New proc.
Backport from mainline
2018-06-26 David Edelsohn <dje.gcc@gmail.com>
* testsuite/experimental/algorithm/sample-2.cc: Add TLS DejaGNU
directives.
* testsuite/experimental/algorithm/shuffle.cc: Likewise.
Backport from mainline
2018-06-25 Jonathan Wakely <jwakely@redhat.com>
* include/experimental/algorithm (sample, shuffle): Add new overloads
using per-thread random number engine.
* testsuite/experimental/algorithm/sample.cc: Simplify and reduce
dependencies by using __gnu_test::test_container.
* testsuite/experimental/algorithm/sample-2.cc: New.
* testsuite/experimental/algorithm/shuffle.cc: New.
git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gcc-8-branch@262856 138bc75d-0d04-0410-961f-82ee72b054a4
|
|
git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gcc-8-branch@262845 138bc75d-0d04-0410-961f-82ee72b054a4
|
|
an operand of Character type. Factor out range generation to the end.
Check that the bounds are literals and convert them to the type of the
operand before building the ranges.
* gcc-interface/utils.c (make_dummy_type): Minor tweak.
(make_packable_type): Propagate TYPE_DEBUG_TYPE.
(maybe_pad_type): Likewise.
git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gcc-8-branch@262813 138bc75d-0d04-0410-961f-82ee72b054a4
|
|
more rvalues in the expression of a renaming.
git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gcc-8-branch@262808 138bc75d-0d04-0410-961f-82ee72b054a4
|
|
git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gcc-8-branch@262763 138bc75d-0d04-0410-961f-82ee72b054a4
|
|
gcc/testsuite/ChangeLog:
PR fortran/83184
* gfortran.dg/dec_structure_23.f90: Oops, "un-fix" error messages.
git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gcc-8-branch@262748 138bc75d-0d04-0410-961f-82ee72b054a4
|
|
gcc/testsuite/ChangeLog:
PR fortran/83184
Backport from trunk.
* gfortran.dg/assumed_rank_14.f90: New testcase.
* gfortran.dg/assumed_rank_15.f90: New testcase.
* gfortran.dg/dec_structure_8.f90: Update error messages.
* gfortran.dg/dec_structure_23.f90: Update error messages.
gcc/fortran/ChangeLog:
PR fortran/83184
Backport from trunk.
* decl.c (match_old_style_init): Initialize locus of variable expr when
creating a data variable.
(match_clist_expr): Verify array is explicit shape/size before
attempting to allocate constant array constructor.
git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gcc-8-branch@262747 138bc75d-0d04-0410-961f-82ee72b054a4
|
|
Backport r262442 and r262743.
gcc/fortran/ChangeLog:
Backport from trunk:
PR fortran/86417
* module.c (mio_component): Set component->loc when loading from module.
PR fortran/83183
PR fortran/86325
* expr.c (class_allocatable, class_pointer, comp_allocatable,
comp_pointer): New helpers.
(component_initializer): Generate EXPR_NULL for allocatable or pointer
components. Do not generate initializers for components within BT_CLASS.
Do not assign to comp->initializer.
(gfc_generate_initializer): Use new helpers; move code to generate
EXPR_NULL for class allocatable components into component_initializer().
gcc/testsuite/ChangeLog:
Backport from trunk:
PR fortran/83183
PR fortran/86325
* gfortran.dg/init_flag_18.f90: New testcase.
* gfortran.dg/init_flag_19.f03: New testcase.
git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gcc-8-branch@262746 138bc75d-0d04-0410-961f-82ee72b054a4
|
|
2018-06-16 Claudiu Zissulescu <claziss@synopsys.com>
Backport from mainline
2018-06-12 Claudiu Zissulescu <claziss@synopsys.com>
* config/arc/arc-protos.h (arc_pad_return): Remove.
* config/arc/arc.c (machine_function): Remove force_short_suffix
and size_reason.
(arc_print_operand): Adjust printing of '&'.
(arc_verify_short): Remove conditional printing of short suffix.
(arc_final_prescan_insn): Remove reference to size_reason.
(pad_return): New function.
(arc_reorg): Call pad_return.
(arc_pad_return): Remove.
(arc_init_machine_status): Remove reference to force_short_suffix.
* config/arc/arc.md (vunspec): Add VUNSPEC_ARC_BLOCKAGE.
(attr length): When attribute iscompact is true force to 2
regardless; in the case of maybe check if we want to force the
instruction to have 4 bytes length.
(nopv): Change it to generate 4 byte long nop as well.
(blockage): New pattern.
(simple_return): Remove call to arc_pad_return.
(p_return_i): Likewise.
git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gcc-8-branch@262738 138bc75d-0d04-0410-961f-82ee72b054a4
|
|
Backport from mainline
2018-07-13 Richard Biener <rguenther@suse.de>
PR debug/86452
* dwarf2out.c (gen_type_die_with_usage): Use scope_die_for
instead of get_context_die.
2018-07-12 Richard Biener <rguenther@suse.de>
PR c/86453
* c-attribs.c (handle_packed_attribute): Do not build a variant
type with TYPE_PACKED, instead ignore the attribute if we may
not apply to the original type.
* g++.dg/warn/pr86453.C: New testcase.
2018-07-11 Richard Biener <rguenther@suse.de>
PR debug/86457
* dwarf2out.c (init_sections_and_labels): Use
output_asm_line_debug_info consistently.
(dwarf2out_early_finish): Likewise.
(dwarf2out_finish): Remove DW_AT_stmt_list from early generated
type units.
git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gcc-8-branch@262691 138bc75d-0d04-0410-961f-82ee72b054a4
|
|
Backport from mainline
2018-07-13 Richard Biener <rguenther@suse.de>
PR middle-end/85974
* match.pd (addr1 - addr2): Allow either of the operands to
have a conversion.
* gcc.c-torture/compile/930326-1.c: Adjust to cover widening.
2018-06-15 Richard Biener <rguenther@suse.de>
PR middle-end/86076
* tree-cfg.c (move_stmt_op): unshare invariant addresses
before adjusting their block.
* gcc.dg/pr86076.c: New testcase.
2018-06-06 Richard Biener <rguenther@suse.de>
PR tree-optimization/85935
* graphite-scop-detection.c (find_params_in_bb): Analyze
condition operands with respect to the correct loop. Assert
the analysis doesn't fail.
* gcc.dg/graphite/pr85935.c: New testcase.
git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gcc-8-branch@262690 138bc75d-0d04-0410-961f-82ee72b054a4
|
|
* tree-ssa-reassoc.c (init_range_entry) <CASE_CONVERT>: Return for a
conversion to a boolean type from a type with greater precision.
git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gcc-8-branch@262685 138bc75d-0d04-0410-961f-82ee72b054a4
|
|
libstdc++-v3/ChangeLog:
2018-07-16 Andreas Krebbel <krebbel@linux.ibm.com>
* config/abi/post/s390-linux-gnu/baseline_symbols.txt: Update.
* config/abi/post/s390x-linux-gnu/32/baseline_symbols.txt: Update.
* config/abi/post/s390x-linux-gnu/baseline_symbols.txt: Update.
git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gcc-8-branch@262682 138bc75d-0d04-0410-961f-82ee72b054a4
|
|
PR c++/86208
* cp-gimplify.c (cp_genericize_r): When using extern_decl_map, or
in TREE_USED flag from stmt to h->to.
* g++.dg/opt/pr3698.C: New test.
git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gcc-8-branch@262679 138bc75d-0d04-0410-961f-82ee72b054a4
|
|
git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gcc-8-branch@262674 138bc75d-0d04-0410-961f-82ee72b054a4
|
|
2018-07-15 Bill Schmidt <wschmidt@linux.ibm.com>
Backport from mainline
2018-07-13 Bill Schmidt <wschmidt@linux.ibm.com>
Steve Munroe <munroesj52@gmail.com>
* config/rs6000/emmintrin.h (_mm_and_si128): New function.
(_mm_andnot_si128): Likewise.
(_mm_or_si128): Likewise.
(_mm_xor_si128): Likewise.
[gcc/testsuite]
2018-07-15 Bill Schmidt <wschmidt@linux.ibm.com>
Backport from mainline
2018-07-13 Bill Schmidt <wschmidt@linux.ibm.com>
Steve Munroe <munroesj52@gmail.com>
* gcc.target/powerpc/sse2-pand-1.c: New file.
* gcc.target/powerpc/sse2-pandn-1.c: Likewise.
* gcc.target/powerpc/sse2-por-1.c: Likewise.
* gcc.target/powerpc/sse2-pxor-1.c: Likewise.
git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gcc-8-branch@262669 138bc75d-0d04-0410-961f-82ee72b054a4
|
|
git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gcc-8-branch@262664 138bc75d-0d04-0410-961f-82ee72b054a4
|