[SLP/AArch64] Fix unpack handling for big-endian SVE
commit88e81b08ba556d74f4448e9b84d8f455e4b2a973
authorrsandifo <rsandifo@138bc75d-0d04-0410-961f-82ee72b054a4>
Tue, 13 Mar 2018 15:13:37 +0000 (13 15:13 +0000)
committerrsandifo <rsandifo@138bc75d-0d04-0410-961f-82ee72b054a4>
Tue, 13 Mar 2018 15:13:37 +0000 (13 15:13 +0000)
tree441543123e7d06028b9fbd87da8bb03bbcb2837a
parent62b3b99903c17fc822866e5368bf13acbc931845
[SLP/AArch64] Fix unpack handling for big-endian SVE

I hadn't realised that on big-endian targets, VEC_UNPACK*HI_EXPR unpacks
the low-numbered lanes and VEC_UNPACK*LO_EXPR unpacks the high-numbered
lanes.  This meant that both the SVE patterns and the handling of
fully-masked loops were wrong.

The patch deals with that by making sure that all vec_unpack* optabs
are define_expands, using BYTES_BIG_ENDIAN to choose the appropriate
define_insn.  This in turn meant that we can get rid of the duplication
between the signed and unsigned patterns for predicates.  (We provide
implementations of both the signed and unsigned optabs because the sign
doesn't matter for predicates: every element contains only one
significant bit.)

Also, the float unpacks need to unpack one half of the input vector,
but the unpacked upper bits are "don't care".  There are two obvious
ways of handling that: use an unpack (filling with zeros) or use a ZIP
(filling with a duplicate of the low bits).  The code previously used
unpacks, but the sequence involved a subreg that is semantically an
element reverse on big-endian targets.  Using the ZIP patterns avoids
that, and at the moment there's no reason to prefer one over the other
for performance reasons, so the patch switches to ZIP unconditionally.
As the comment says, it would be easy to optimise this later if UUNPK
turns out to be better for some implementations.

2018-03-13  Richard Sandiford  <richard.sandiford@linaro.org>

gcc/
* tree-vect-loop-manip.c (vect_maybe_permute_loop_masks):
Reverse the choice between VEC_UNPACK_LO_EXPR and VEC_UNPACK_HI_EXPR
for big-endian.
* config/aarch64/iterators.md (hi_lanes_optab): New int attribute.
* config/aarch64/aarch64-sve.md
(*aarch64_sve_<perm_insn><perm_hilo><mode>): Rename to...
(aarch64_sve_<perm_insn><perm_hilo><mode>): ...this.
(*extend<mode><Vwide>2): Rename to...
(aarch64_sve_extend<mode><Vwide>2): ...this.
(vec_unpack<su>_<perm_hilo>_<mode>): Turn into a define_expand,
renaming the old pattern to...
(aarch64_sve_punpk<perm_hilo>_<mode>): ...this.  Only define
unsigned packs.
(vec_unpack<su>_<perm_hilo>_<SVE_BHSI:mode>): Turn into a
define_expand, renaming the old pattern to...
(aarch64_sve_<su>unpk<perm_hilo>_<SVE_BHSI:mode>): ...this.
(*vec_unpacku_<perm_hilo>_<mode>_no_convert): Delete.
(vec_unpacks_<perm_hilo>_<mode>): Take BYTES_BIG_ENDIAN into
account when deciding which SVE instruction the optab should use.
(vec_unpack<su_optab>_float_<perm_hilo>_vnx4si): Likewise.

gcc/testsuite/
* gcc.target/aarch64/sve/unpack_fcvt_signed_1.c: Expect zips rather
than unpacks.
* gcc.target/aarch64/sve/unpack_fcvt_unsigned_1.c: Likewise.
* gcc.target/aarch64/sve/unpack_float_1.c: Likewise.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@258489 138bc75d-0d04-0410-961f-82ee72b054a4
gcc/ChangeLog
gcc/config/aarch64/aarch64-sve.md
gcc/config/aarch64/iterators.md
gcc/testsuite/ChangeLog
gcc/testsuite/gcc.target/aarch64/sve/unpack_fcvt_signed_1.c
gcc/testsuite/gcc.target/aarch64/sve/unpack_fcvt_unsigned_1.c
gcc/testsuite/gcc.target/aarch64/sve/unpack_float_1.c
gcc/tree-vect-loop-manip.c