Public Git Hosting - glibc.git/commit

commit	3c9980698988ef64072f1fac339b180f52792faf
author	Danila Kutenin <danilak@google.com>
	Mon, 27 Jun 2022 16:12:13 +0000 (27 16:12 +0000)
committer	Szabolcs Nagy <szabolcs.nagy@arm.com>
	Wed, 6 Jul 2022 08:26:20 +0000 (6 09:26 +0100)
tree	3c32dabb3fcbfa564647fcedd9be5c7674a30fc2
parent	bd0b58837c7df091046e7531642f379a52e1e157

aarch64: Optimize string functions with shrn instruction

The string functions were using AND+ADDP to compute the
nibble/syndrome mask. The same mask can be produced more
simply with `SHRN dst.8b, src.8h, 4` (shift each 16-bit
element right by 4 and narrow it to 8 bits), which has the
same latency as ADDP on all SIMD ARMv8 targets. memcmp may
offer similar opportunities, but that is left for
another patch.
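
To illustrate why the SHRN step yields a usable syndrome, here is a
minimal scalar sketch in C. It is a model of the idea, not the assembly
from this patch: the names `shrn4_model` and `first_match16` are
hypothetical, the byte comparison stands in for a vector CMEQ, and
little-endian lane order is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Scalar model of `SHRN dst.8b, src.8h, 4`: view the 16 input bytes as
   eight little-endian 16-bit lanes, shift each lane right by 4 and keep
   the low 8 bits.  If every input byte is 0x00 or 0xff, input byte i
   ends up as nibble i of the 64-bit result. */
static uint64_t shrn4_model(const uint8_t cmp[16])
{
    uint64_t mask = 0;
    for (int lane = 0; lane < 8; lane++) {
        uint16_t h = (uint16_t)(cmp[2 * lane] | (cmp[2 * lane + 1] << 8));
        uint8_t narrowed = (uint8_t)(h >> 4);
        mask |= (uint64_t)narrowed << (8 * lane);
    }
    return mask;
}

/* Hypothetical helper: index of the first byte equal to c in a 16-byte
   chunk, or 16 if there is none. */
static size_t first_match16(const uint8_t chunk[16], uint8_t c)
{
    assert(chunk != NULL);

    uint8_t cmp[16];
    for (int i = 0; i < 16; i++)
        cmp[i] = (chunk[i] == c) ? 0xff : 0x00;   /* stands in for CMEQ */

    uint64_t mask = shrn4_model(cmp);
    if (mask == 0)
        return 16;

    /* Portable count of trailing zero bits. */
    size_t bit = 0;
    while (((mask >> bit) & 1) == 0)
        bit++;

    return bit >> 2;   /* four mask bits per input byte */
}
```

Because each input byte maps to exactly four bits of the mask, the
position of the lowest set bit divided by 4 gives the index of the
first matching byte, with no separate AND needed to isolate nibbles.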

We see 10-20% savings for small to mid-size inputs (<=128 bytes),
which are the primary cases for general workloads.

sysdeps/aarch64/memchr.S
sysdeps/aarch64/memrchr.S
sysdeps/aarch64/strchrnul.S
sysdeps/aarch64/strcpy.S
sysdeps/aarch64/strlen.S
sysdeps/aarch64/strnlen.S
