i386: Improve permutations with INSERTPS instruction [PR94908]
commit 95b99e47f4f2df2d0c5680f45e3ec0a3170218ad
author Uros Bizjak <ubizjak@gmail.com>
Tue, 18 Apr 2023 15:50:37 +0000 (18 17:50 +0200)
committer Uros Bizjak <ubizjak@gmail.com>
Tue, 18 Apr 2023 16:58:51 +0000 (18 18:58 +0200)
tree bdd0c598480e642b8ab7fe77e5c6de2c9b419614
parent 6067ae4557a3a7e5b08359e78a29b8a9d5dfedce

INSERTPS can select any element of the source operand and insert it
into any element of the destination.  For SSE4.1 targets, the compiler
can generate e.g.

insertps $64, %xmm0, %xmm1

to insert element 1 of %xmm0 into element 0 of %xmm1 (AT&T operand
order: source first, destination last).
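
To make the immediate encoding concrete, here is a minimal scalar model
of the register-to-register form of INSERTPS.  It is only an
illustration and not code from this commit; the function name
insertps_model is made up for the example.  Bits 7:6 of the immediate
select the source element, bits 5:4 select the destination slot, and
bits 3:0 form a mask of result elements that are zeroed.

```c
#include <stdint.h>

/* Hypothetical scalar model of INSERTPS (register source form).
   dst is updated in place, as the destination register would be.  */
static void
insertps_model (float dst[4], const float src[4], uint8_t imm8)
{
  unsigned src_idx = (imm8 >> 6) & 3;  /* bits 7:6: source element */
  unsigned dst_idx = (imm8 >> 4) & 3;  /* bits 5:4: destination slot */
  unsigned zmask = imm8 & 0xf;         /* bits 3:0: elements to zero */

  /* Copy the selected source element into the selected slot.  */
  dst[dst_idx] = src[src_idx];

  /* Then clear every element whose zero-mask bit is set.  */
  for (unsigned i = 0; i < 4; i++)
    if (zmask & (1u << i))
      dst[i] = 0.0f;
}
```

With imm8 = 64 (binary 01 00 0000) the model copies element 1 of src
into element 0 of dst and zeroes nothing, which matches the example
instruction above.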

gcc/ChangeLog:

PR target/94908
* config/i386/i386-builtin.def (__builtin_ia32_insertps128):
Use CODE_FOR_sse4_1_insertps_v4sf.
* config/i386/i386-expand.cc (expand_vec_perm_insertps): New.
(expand_vec_perm_1): Call expand_vec_perm_insertps.
* config/i386/i386.md ("unspec"): Declare UNSPEC_INSERTPS here.
* config/i386/mmx.md (mmxscalarmode): New mode attribute.
(@sse4_1_insertps_<mode>): New insn pattern.
* config/i386/sse.md (@sse4_1_insertps_<mode>): Macroize insn
pattern from sse4_1_insertps using VI4F_128 mode iterator.
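
As a rough sketch of the kind of check a permutation expander needs
before it can use INSERTPS, the function below tests whether a
two-operand, four-element permutation is "op0 with exactly one element
replaced by an element of op1" and, if so, builds the immediate.  This
is NOT the code of expand_vec_perm_insertps; the name
perm_to_insertps_imm and the index convention (0..3 select from op0,
4..7 select from op1) are assumptions made for illustration, and the
sketch deliberately ignores other cases the real expander may handle.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: perm[i] in 0..3 picks op0[perm[i]],
   perm[i] in 4..7 picks op1[perm[i] - 4].  Returns true and stores
   the INSERTPS immediate when the permutation keeps op0 in place
   except for a single slot taken from op1.  */
static bool
perm_to_insertps_imm (const unsigned perm[4], uint8_t *imm8)
{
  int slot = -1;

  for (int i = 0; i < 4; i++)
    {
      if (perm[i] == (unsigned) i)
        continue;               /* element stays where it is in op0 */
      if (perm[i] < 4 || perm[i] > 7 || slot >= 0)
        return false;           /* not from op1, or a second change */
      slot = i;
    }

  if (slot < 0)
    return false;               /* identity: nothing to insert */

  /* Source element in bits 7:6, destination slot in bits 5:4,
     zero mask (bits 3:0) left clear.  */
  *imm8 = (uint8_t) (((perm[slot] - 4) << 6) | ((unsigned) slot << 4));
  return true;
}
```

Under these assumptions the permutation {5, 1, 2, 3} yields the
immediate 64, the value used in the example in the commit message.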

gcc/testsuite/ChangeLog:

PR target/94908
* gcc.target/i386/pr94908.c: New test.
* gcc.target/i386/sse4_1-insertps-5.c: New test.
* gcc.target/i386/vperm-v4sf-2-sse4.c: New test.
gcc/config/i386/i386-builtin.def
gcc/config/i386/i386-expand.cc
gcc/config/i386/i386.md
gcc/config/i386/mmx.md
gcc/config/i386/sse.md
gcc/testsuite/gcc.target/i386/pr94908.c [new file with mode: 0644]
gcc/testsuite/gcc.target/i386/sse4_1-insertps-5.c [new file with mode: 0644]
gcc/testsuite/gcc.target/i386/vperm-v4sf-2-sse4.c [new file with mode: 0644]