PR96463: Optimise svld1rq from vectors for little endian AArch64 targets.
commit 494bec025002df422f2faa947138bf3643d80b54
author Prathamesh Kulkarni <prathamesh.kulkarni@linaro.org>
Sun, 12 Jun 2022 03:20:16 +0000 (12 08:50 +0530)
committer Prathamesh Kulkarni <prathamesh.kulkarni@linaro.org>
Sun, 12 Jun 2022 03:25:04 +0000 (12 08:55 +0530)
tree e6c5ba8e07f688100d879290d8b2f3ad929e34e4
parent cbd842717ec5cab989141bf1575846c2acef818d

The patch folds:
lhs = svld1rq({-1, -1, ...}, rhs)
into:
tmp = mem_ref<vectype> [(elem_type * {ref-all}) rhs]
lhs = vec_perm_expr<tmp, tmp, {0, 1, 2, 3, ...}>
The resulting vec_perm_expr is then expanded using aarch64_expand_sve_dupq.
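
The equivalence behind this fold can be illustrated with a small scalar model in plain C. This is only a sketch and not GCC code: the names model_ld1rq, model_folded and QUAD_ELTS are hypothetical, the element type is fixed to 32 bits, the predicate is assumed to be all-true, and the selector is assumed to repeat the indices 0, 1, 2, 3 for every quadword of the result.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model, not GCC code.  QUAD_ELTS is the number of
   32-bit elements in one 128-bit quadword.  */
enum { QUAD_ELTS = 4 };

/* Model of svld1rq with an all-true predicate: load one quadword from
   SRC and replicate it across all N lanes of DST.  N stands in for the
   runtime SVE vector length and must be a multiple of QUAD_ELTS.  */
static void
model_ld1rq (int32_t *dst, size_t n, const int32_t *src)
{
  assert (n % QUAD_ELTS == 0);
  for (size_t i = 0; i < n; i += QUAD_ELTS)
    for (size_t j = 0; j < QUAD_ELTS; j++)
      dst[i + j] = src[j];
}

/* Model of the folded form: a quadword load into TMP followed by a
   permute of TMP with itself.  The selector is assumed to repeat
   0, 1, 2, 3, so every index picks a lane of the first operand.  */
static void
model_folded (int32_t *dst, size_t n, const int32_t *src)
{
  int32_t tmp[QUAD_ELTS];

  assert (n % QUAD_ELTS == 0);
  for (size_t j = 0; j < QUAD_ELTS; j++)
    tmp[j] = src[j];               /* tmp = mem_ref<vectype> [rhs] */
  for (size_t i = 0; i < n; i++)
    {
      size_t sel = i % QUAD_ELTS;  /* selector element i */
      dst[i] = tmp[sel];           /* lhs = vec_perm_expr<tmp, tmp, sel> */
    }
}
```

Under these assumptions both functions fill the destination with the same values for any length that is a multiple of four, which is why the load-and-replicate builtin can be rewritten as a plain load plus a constant permute.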

Example:

svint32_t
foo (int32x4_t x)
{
  return svld1rq (svptrue_b8 (), &x[0]);
}

code-gen:
foo:
.LFB4350:
        dup     z0.q, z0.q[0]
        ret

The patch relaxes type-checking for VEC_PERM_EXPR by allowing different
vector types for lhs and rhs, provided that:
(1) rhs3 is constant and has an integer element type.
(2) len(lhs) == len(rhs3) and len(rhs1) == len(rhs2).
(3) lhs and rhs have the same element type.
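
The three conditions can be read as a single predicate. The following C sketch is illustrative only: GCC performs this check on tree nodes in verify_gimple_assign_ternary, whereas struct vec_type, enum elem_kind and relaxed_vec_perm_ok below are hypothetical simplifications, and "rhs" in condition (3) is assumed to mean the two data operands rhs1 and rhs2.

```c
#include <stdbool.h>

/* Hypothetical, simplified description of a vector type.  GCC itself
   works on tree nodes, not on a struct like this.  */
enum elem_kind { ELEM_INT32, ELEM_INT64, ELEM_FLOAT32 };

struct vec_type
{
  enum elem_kind elem;  /* element type */
  unsigned len;         /* number of elements */
};

static bool
elem_is_integer (enum elem_kind k)
{
  return k == ELEM_INT32 || k == ELEM_INT64;
}

/* Return true if a permute whose lhs type differs from its rhs types
   satisfies the three conditions listed above.  */
static bool
relaxed_vec_perm_ok (struct vec_type lhs, struct vec_type rhs1,
                     struct vec_type rhs2, struct vec_type rhs3,
                     bool rhs3_is_constant)
{
  /* (1) rhs3 is constant and has an integer element type.  */
  if (!rhs3_is_constant || !elem_is_integer (rhs3.elem))
    return false;
  /* (2) len(lhs) == len(rhs3) and len(rhs1) == len(rhs2).  */
  if (lhs.len != rhs3.len || rhs1.len != rhs2.len)
    return false;
  /* (3) lhs and the data operands share one element type (assumed
     reading of "rhs").  */
  return lhs.elem == rhs1.elem && lhs.elem == rhs2.elem;
}
```

In this model, a result of eight 32-bit elements built from two four-element operands with a constant eight-element integer selector is accepted, while a non-constant selector or a mismatched element type is rejected.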

gcc/ChangeLog:
PR target/96463
* config/aarch64/aarch64-sve-builtins-base.cc: Include ssa.h.
(svld1rq_impl::fold): Define.
* config/aarch64/aarch64.cc (expand_vec_perm_d): Define new members
op_mode and op_vec_flags.
(aarch64_evpc_reencode): Initialize newd.op_mode and
newd.op_vec_flags.
(aarch64_evpc_sve_dup): New function.
(aarch64_expand_vec_perm_const_1): Gate existing calls to
aarch64_evpc_* functions under d->vmode == d->op_mode,
and call aarch64_evpc_sve_dup.
(aarch64_vectorize_vec_perm_const): Remove assert
d->vmode != d->op_mode, and initialize d.op_mode and d.op_vec_flags.
* tree-cfg.cc (verify_gimple_assign_ternary): Allow different
vector types for lhs and rhs in VEC_PERM_EXPR if rhs3 is
constant.

gcc/testsuite/ChangeLog:
PR target/96463
* gcc.target/aarch64/sve/acle/general/pr96463-1.c: New test.
* gcc.target/aarch64/sve/acle/general/pr96463-2.c: Likewise.
gcc/config/aarch64/aarch64-sve-builtins-base.cc
gcc/config/aarch64/aarch64.cc
gcc/testsuite/gcc.target/aarch64/sve/acle/general/pr96463-1.c [new file with mode: 0644]
gcc/testsuite/gcc.target/aarch64/sve/acle/general/pr96463-2.c [new file with mode: 0644]
gcc/tree-cfg.cc