commit 441e81343971e3a79b56d913309d2042799dac4b
author wilco <wilco@138bc75d-0d04-0410-961f-82ee72b054a4>
Fri, 5 May 2017 09:40:01 +0000
committer wilco <wilco@138bc75d-0d04-0410-961f-82ee72b054a4>
Fri, 5 May 2017 09:40:01 +0000
tree 8e1a45a11552c8773d1770494a16e028b354522b
parent add0a8db807a56095cd6f2a2d919f2ce607e2529

Code scheduling for Cortex-A53 isn't as good as it could be.  It turns out
code runs faster overall if we place loads and stores with a dependency
closer together.  To achieve this effect, this patch adds a bypass between
cortex_a53_load1 and cortex_a53_load*/cortex_a53_store* if the result of an
earlier load is used in an address calculation.  This significantly improved
benchmark scores in a proprietary benchmark suite.
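
For illustration only, the bypasses described above might look roughly like
the sketch below in the machine description.  The reservation names and guard
functions are the ones named in this message; the latency value of 3 is a
placeholder assumption, since the message does not state the value the patch
uses.  The guard function receives the producer and consumer insns and
returns nonzero when the bypass should apply.

```
;; Sketch only: the latency (3) is an assumed, illustrative value.
;; Intended case: the result of an earlier load feeds the address of a
;; later load or store, e.g.
;;     ldr  x1, [x0]
;;     ldr  x2, [x1]      ; address depends on the first load

(define_bypass 3 "cortex_a53_load1"
		 "cortex_a53_load*"
		 "arm_early_load_addr_dep_ptr")

(define_bypass 3 "cortex_a53_load1"
		 "cortex_a53_store*"
		 "arm_early_store_addr_dep_ptr")
```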

    gcc/
	* config/arm/aarch-common.c (arm_early_load_addr_dep_ptr):
	New function.
	(arm_early_store_addr_dep_ptr): Likewise.
	* config/arm/aarch-common-protos.h
	(arm_early_load_addr_dep_ptr): Add prototype.
	(arm_early_store_addr_dep_ptr): Likewise.
	* config/arm/cortex-a53.md: Add new bypasses.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@247631 138bc75d-0d04-0410-961f-82ee72b054a4
gcc/ChangeLog
gcc/config/arm/aarch-common-protos.h
gcc/config/arm/aarch-common.c
gcc/config/arm/cortex-a53.md