RISC-V: Adjust vec unit-stride load/store costs.

Scalar loads and stores support offset addressing, whereas unit-stride
vector loads and stores do not. For a vector access the offset must
first be loaded into a general-purpose register before it can be used.
To account for this, this patch adds an address arithmetic heuristic
that keeps track of data reference operands. If we haven't seen an
operand before, we add the cost of a scalar statement.
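
As a rough illustration of the heuristic described above (this is not
the code in the patch), the following self-contained C++ sketch models
it with hypothetical stand-ins: StmtKind, Stmt, AddrCostTracker and the
(base, offset) key are assumptions made for the example and do not
correspond to GCC's internal types.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <utility>

// Hypothetical stand-ins for illustration only; not GCC's data structures.
enum class StmtKind { VecLoad, VecStore, Other };

struct Stmt {
  StmtKind kind;
  std::string base;  // name of the data-reference base, e.g. "a"
  long offset;       // constant offset applied to the base
};

class AddrCostTracker {
public:
  explicit AddrCostTracker(int scalar_stmt_cost)
      : scalar_stmt_cost_(scalar_stmt_cost) {}

  // Return COST, increased by one scalar statement if S is a vector
  // load/store whose address operand has not been seen before.
  int adjust(const Stmt &s, int cost) {
    assert(cost >= 0);
    if (s.kind != StmtKind::VecLoad && s.kind != StmtKind::VecStore)
      return cost;
    // insert() reports whether the operand was newly added.
    bool first_use = seen_.insert({s.base, s.offset}).second;
    return first_use ? cost + scalar_stmt_cost_ : cost;
  }

private:
  int scalar_stmt_cost_;
  std::set<std::pair<std::string, long>> seen_;  // operands costed so far
};
```

With a scalar statement cost of 1, the first vector access to a given
operand is charged one extra unit, while later accesses to the same
operand and non-memory statements keep their original cost.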

This helps to get rid of an lbm regression when vectorizing (roughly
0.5% fewer dynamic instructions). gcc5 improves by 0.2% and deepsjeng
by 0.25%. wrf and nab degrade by 0.1%. This is because we now adjust
the cost of SLP as well as loop-vectorized instructions, whereas before
we only adjusted loop-vectorized instructions.

Applying the higher scalar_to_vec cost (3 instead of 1) to all
vectorization types means some snippets are no longer vectorized. Given
these costs the decision looks correct, but the result appears worse
when only counting dynamic instructions.

In total, SPECint 2017 executes 4 billion fewer dynamic instructions
and SPECfp 0.7 billion fewer.

gcc/ChangeLog:

	* config/riscv/riscv-vector-costs.cc (adjust_stmt_cost): Move...
	(costs::adjust_stmt_cost): ... to here and add vec_load/vec_store
	offset handling.
	(costs::add_stmt_cost): Also adjust cost for statements without
	stmt_info.
	* config/riscv/riscv-vector-costs.h: Define zero constant.

gcc/testsuite/ChangeLog:

	* gcc.dg/vect/costmodel/riscv/rvv/vse-slp-1.c: New test.
	* gcc.dg/vect/costmodel/riscv/rvv/vse-slp-2.c: New test.