commit 95e1eca43d106d821720744ac6ff1f5df41a1e78
author liuhongt <hongtao.liu@intel.com>
Wed, 11 Aug 2021 06:00:00 +0000 (11 14:00 +0800)
committer liuhongt <hongtao.liu@intel.com>
Thu, 12 Aug 2021 06:02:53 +0000 (12 14:02 +0800)
tree 81fc7642880f6f5e696ef59e9615e39d08bc3584
parent 21fd62e5ca9967bba8f97fd6244a8c6a564c2146
Combine avx_vec_concatv16si and avx512f_zero_extendv16hiv16si2_1 to avx512f_zero_extendv16hiv16si2_2.

Add a define_insn_and_split to combine avx_vec_concatv16si/2 with
avx512f_zero_extendv16hiv16si2_1, since the latter already zero-extends
the upper bits. Do the same for the other patterns related to
pmovzx{bw,wd,dq}.

This removes the redundant vmovdqa in code such as

-       vmovdqa %ymm0, %ymm0    # 7     [c=4 l=6]  avx_vec_concatv16si/2
        vpmovzxwd       %ymm0, %zmm0    # 22    [c=4 l=6]  avx512f_zero_extendv16hiv16si2
        ret             # 25    [c=0 l=1]  simple_return_internal
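
As an illustration of why the vmovdqa is redundant, here is a scalar
C model of the two steps. It is only a sketch and not part of the patch:
the function names are hypothetical, and the interleave-with-zero form is
an assumption about how the zero extension is represented before it is
split into vpmovzxwd. The model shows that the interleave reads only the
low 16 lanes of its input, so zeroing the upper half first changes nothing.

```c
#include <stdint.h>
#include <string.h>

#define LANES 16 /* number of 16-bit lanes in a 256-bit vector */

/* Hypothetical model of avx_vec_concatv16si/2: put a 256-bit value in
   the low half of a 512-bit value and zero the upper half.  */
static void concat_with_zero(const uint16_t lo[LANES], uint16_t out[2 * LANES])
{
    memcpy(out, lo, LANES * sizeof(uint16_t));
    memset(out + LANES, 0, LANES * sizeof(uint16_t));
}

/* Hypothetical model of the interleave form of the zero extension:
   each of the low 16 lanes of x is followed by a zero lane.  The
   upper 16 lanes of x are never read.  */
static void interleave_low_with_zero(const uint16_t x[2 * LANES],
                                     uint16_t out[2 * LANES])
{
    for (int i = 0; i < LANES; i++) {
        out[2 * i] = x[i];
        out[2 * i + 1] = 0;
    }
}

/* Model of vpmovzxwd: widen each 16-bit lane to 32 bits with zeros.  */
static void zero_extend(const uint16_t in[LANES], uint32_t out[LANES])
{
    for (int i = 0; i < LANES; i++)
        out[i] = in[i];
}

/* Return 1 if concat-then-interleave gives the same 512-bit value as
   a direct zero extension of the 256-bit input.  */
static int routes_agree(const uint16_t in[LANES])
{
    uint16_t wide[2 * LANES], interleaved[2 * LANES];
    uint32_t direct[LANES], via_concat[LANES];

    concat_with_zero(in, wide);
    interleave_low_with_zero(wide, interleaved);
    zero_extend(in, direct);

    /* Rebuild each 32-bit lane from its (low, high) 16-bit pair so the
       comparison does not depend on host endianness.  */
    for (int i = 0; i < LANES; i++)
        via_concat[i] = (uint32_t)interleaved[2 * i]
                        | ((uint32_t)interleaved[2 * i + 1] << 16);

    return memcmp(direct, via_concat, sizeof direct) == 0;
}
```

In this model, routes_agree returns 1 for any 16-lane input, which is the
property that lets the concat and the zero extension be matched as a
single instruction.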

gcc/ChangeLog:

PR target/101846
* config/i386/sse.md (*avx2_zero_extendv16qiv16hi2_2): New
post_reload define_insn_and_split.
(*avx512bw_zero_extendv32qiv32hi2_2): Ditto.
(*sse4_1_zero_extendv8qiv8hi2_4): Ditto.
(*avx512f_zero_extendv16hiv16si2_2): Ditto.
(*avx2_zero_extendv8hiv8si2_2): Ditto.
(*sse4_1_zero_extendv4hiv4si2_4): Ditto.
(*avx512f_zero_extendv8siv8di2_2): Ditto.
(*avx2_zero_extendv4siv4di2_2): Ditto.
(*sse4_1_zero_extendv2siv2di2_4): Ditto.
(VI248_256, VI248_512, VI148_512, VI148_256, VI148_128): New
mode iterators.

gcc/testsuite/ChangeLog:

PR target/101846
* gcc.target/i386/pr101846-1.c: New test.
gcc/config/i386/sse.md
gcc/testsuite/gcc.target/i386/pr101846-1.c [new file with mode: 0644]