i386: Fix blend vector permutation for 8-byte modes
commit 57052c6ed59c1a2ee4a67982f960e08593956955
author Uros Bizjak <ubizjak@gmail.com>
Wed, 15 Mar 2023 19:33:48 +0000 (15 20:33 +0100)
committer Uros Bizjak <ubizjak@gmail.com>
Wed, 15 Mar 2023 19:35:37 +0000 (15 20:35 +0100)
tree 7666451538edb09a30194ba56181216fafd37eba
parent 901edd99b44976b3c2b13a7d525d9e315540186a

8-byte modes should be processed only for TARGET_MMX_WITH_SSE. Handle
V2SFmode and fix V2HImode handling. The resulting BLEND instructions
are always faster than MOVSS/MOVSD, so prioritize them over MOVSS/MOVSD
for TARGET_SSE4_1.
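
As an illustration of the kind of permutation this affects, here is a minimal
sketch of an 8-byte (V2SF) blend written with vector extensions. The names
v2sf, v2si and merge_low are hypothetical and are not taken from the new test
file. The shuffle takes lane 0 from the second operand and lane 1 from the
first, which is the select-per-lane shape that a BLEND instruction implements.
Which instruction is actually emitted depends on the compiler version and
options (for example -msse4.1 versus -mno-sse4), so no assembly output is
claimed here.

```c
/* Two-element float and int vectors, 8 bytes each (V2SF / V2SI sized).  */
typedef float v2sf __attribute__ ((vector_size (8)));
typedef int v2si __attribute__ ((vector_size (8)));

/* Hypothetical helper: result lane 0 comes from B, lane 1 comes from A.
   Shuffle indices 0..1 select from the first operand and 2..3 from the
   second, so the selector {2, 1} keeps every lane in place and only
   chooses its source vector, i.e. a blend.  */
v2sf
merge_low (v2sf a, v2sf b)
{
#if defined (__clang__)
  /* Clang spells the two-operand shuffle with constant indices.  */
  return __builtin_shufflevector (a, b, 2, 1);
#else
  const v2si sel = { 2, 1 };
  return __builtin_shuffle (a, b, sel);
#endif
}
```

Calling merge_low on {1.0f, 2.0f} and {3.0f, 4.0f} yields {3.0f, 2.0f}.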

gcc/ChangeLog:

	* config/i386/i386-expand.cc (expand_vec_perm_blend):
	Handle 8-byte modes only with TARGET_MMX_WITH_SSE.  Handle V2SFmode
	and fix V2HImode handling.
	(expand_vec_perm_1): Try to emit BLEND instruction
	before MOVSS/MOVSD.
	* config/i386/mmx.md (*mmx_blendps): New insn pattern.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/merge-1.c (dg-options): Use -mno-sse4.
	* gcc.target/i386/sse2-mmx-21.c (dg-options): Ditto.
	* gcc.target/i386/sse-movss-4.c (dg-options):
	Use -mno-sse4.  Simplify scan-assembler-not strings.
	* gcc.target/i386/sse2-movsd-3.c (dg-options): Ditto.
	* gcc.target/i386/sse2-mmx-movss-1.c: New test.
gcc/config/i386/i386-expand.cc
gcc/config/i386/mmx.md
gcc/testsuite/gcc.target/i386/merge-1.c
gcc/testsuite/gcc.target/i386/sse-movss-4.c
gcc/testsuite/gcc.target/i386/sse2-mmx-21.c
gcc/testsuite/gcc.target/i386/sse2-mmx-movss-1.c [new file with mode: 0644]
gcc/testsuite/gcc.target/i386/sse2-movsd-3.c