commit    b078fabb56e34115e55357c319d589b9455dd189
author    Julian Seward <jseward@acm.org>
          Sat, 22 Dec 2018 06:23:00 +0000 (07:23 +0100)
committer Julian Seward <jseward@acm.org>
          Sat, 22 Dec 2018 06:23:00 +0000 (07:23 +0100)
tree      b965367521fee5bcfafe47961117f3327b651c36
parent    6cb6bdbd0a38e9b5f5c4f676afb72a23b6bfb1b5
amd64 pipeline: generate much better code for pshufb mm/xmm/ymm.  n-i-bz.

pshufb mm/xmm/ymm rearranges byte lanes in vector registers.  It is fairly
widely used, but we previously generated terrible code for it.  With this
patch, the back end simply emits pshufb plus a small amount of masking,
which is a great improvement.
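
For reference, the per-byte semantics that the generated code must preserve
can be sketched in plain C.  This is a minimal illustration of the 128-bit
(xmm) case only; the function name is hypothetical and is not one of the
actual VEX helpers in host_generic_simd128.c.

#include <stdint.h>

/* Illustrative sketch of 128-bit pshufb semantics: for each of the 16
   result bytes, if bit 7 of the corresponding control byte is set, the
   result byte is zero; otherwise the low 4 bits of the control byte
   select a source byte.  Name is hypothetical, not VEX's own. */
static void pshufb_128_ref ( uint8_t dst[16],
                             const uint8_t src[16],
                             const uint8_t ctrl[16] )
{
   for (int i = 0; i < 16; i++) {
      if (ctrl[i] & 0x80)
         dst[i] = 0;                    /* zero this lane */
      else
         dst[i] = src[ctrl[i] & 0x0F];  /* permute within 16 lanes */
   }
}

The mm (64-bit) variant indexes with ctrl & 7 across 8 lanes, and the ymm
(AVX2) variant applies the same operation independently to each 128-bit
half of the register.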
12 files changed:
VEX/priv/guest_amd64_toIR.c
VEX/priv/host_amd64_defs.c
VEX/priv/host_amd64_defs.h
VEX/priv/host_amd64_isel.c
VEX/priv/host_generic_simd128.c
VEX/priv/host_generic_simd128.h
VEX/priv/host_generic_simd64.c
VEX/priv/host_generic_simd64.h
VEX/priv/ir_defs.c
VEX/pub/libvex_ir.h
memcheck/mc_translate.c
memcheck/tests/vbit-test/irops.c