x86_64: Implement evex512 version of strrchr and wcsrchr
commit faaf733f49211439475e50f06716b303ee2644bf
author Sunil K Pandey <skpgkp2@gmail.com>
Tue, 9 Aug 2022 14:57:29 +0000 (07:57 -0700)
committer Sunil K Pandey <skpgkp2@gmail.com>
Thu, 3 Nov 2022 22:51:52 +0000 (15:51 -0700)
tree 3dd19a3b1d40d31f72cb063a05ad86dc273555d8
parent 1f34a2328890aa192141f96449d25b77f666bf47
x86_64: Implement evex512 version of strrchr and wcsrchr

Changes from v1:
  Use vec API for registers.
  Replace VPCMP with VPCMPEQ.
  Restructure and remove 1 unconditional jump.
  Change page cross logic to use sall.
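
For readers unfamiliar with the term, "page cross" refers to the case where
a full-vector load from the start of the string would touch the next page.
The commit message does not describe the actual logic or how sall is used,
so the following is only a minimal sketch of the usual check. It assumes
4096-byte pages and 64-byte vectors, and the helper name load_crosses_page
is hypothetical.

```c
#include <stdint.h>

#define PAGE_SIZE 4096u  /* assumed page size */
#define VEC_SIZE  64u    /* bytes in one 512-bit vector */

/* Hypothetical helper, not taken from the patch: returns nonzero if a
   VEC_SIZE-byte load starting at p would extend into the next page.  */
static int load_crosses_page(const void *p)
{
    uintptr_t offset_in_page = (uintptr_t)p & (PAGE_SIZE - 1);
    return offset_in_page > PAGE_SIZE - VEC_SIZE;
}
```

When the check is true, an implementation typically loads from an aligned
address instead and discards the mask bits that belong to bytes before the
start of the string.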

This patch implements the following evex512 versions of string functions.
The evex512 versions take up to 30% fewer cycles than the evex versions,
depending on length and alignment.

- strrchr function using 512 bit vectors.
- wcsrchr function using 512 bit vectors.
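
As a rough illustration of what scanning with 512-bit vectors means for
strrchr, the portable C model below walks the string in 64-byte blocks,
builds one bitmask per block for matches and one for the terminator, and
remembers the highest match seen so far. It is a sketch only, not the
assembly in this patch: the names strrchr_block_sketch and highest_bit are
hypothetical, and the inner loop is a scalar stand-in for a single vector
compare.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOCK 64  /* bytes covered by one 512-bit vector */

/* Index of the highest set bit; m must be nonzero.  */
static int highest_bit(uint64_t m)
{
    int i = 0;
    while (m >>= 1)
        i++;
    return i;
}

/* Hypothetical model of a block-wise strrchr.  */
static char *strrchr_block_sketch(const char *s, int c)
{
    assert(s != NULL);
    const char ch = (char)c;
    const char *last = NULL;

    for (;;) {
        uint64_t match_mask = 0;  /* bit i set: s[i] == ch   */
        uint64_t zero_mask = 0;   /* bit i set: s[i] == '\0' */

        /* Scalar stand-in for one vector compare.  It stops at the
           terminator so the model never reads past the string.  */
        for (int i = 0; i < BLOCK; i++) {
            if (s[i] == ch)
                match_mask |= (uint64_t)1 << i;
            if (s[i] == '\0') {
                zero_mask |= (uint64_t)1 << i;
                break;
            }
        }

        if (zero_mask != 0) {
            /* Keep only matches at or before the terminator.  */
            match_mask &= zero_mask ^ (zero_mask - 1);
            if (match_mask != 0)
                last = s + highest_bit(match_mask);
            return (char *)last;
        }
        if (match_mask != 0)
            last = s + highest_bit(match_mask);
        s += BLOCK;
    }
}
```

The model returns the same pointer as strrchr for any valid C string,
including when the searched character is the terminator itself.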

Code size data:

strrchr-evex.o 879 bytes
strrchr-evex512.o 601 bytes (-32%)

wcsrchr-evex.o 882 bytes
wcsrchr-evex512.o 572 bytes (-35%)

Placeholder functions, not used by any processor at the moment.

Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
sysdeps/x86_64/multiarch/Makefile
sysdeps/x86_64/multiarch/ifunc-impl-list.c
sysdeps/x86_64/multiarch/strrchr-evex-base.S [new file with mode: 0644]
sysdeps/x86_64/multiarch/strrchr-evex512.S [new file with mode: 0644]
sysdeps/x86_64/multiarch/wcsrchr-evex512.S [new file with mode: 0644]