PowerPC: memset optimization for POWER8/PPC64
commit 5f892cacbdf50322bc3ee2e131c105c71b495086
author Adhemerval Zanella <azanella@linux.vnet.ibm.com>
Tue, 15 Jul 2014 16:19:09 +0000 (15 12:19 -0400)
committer Adhemerval Zanella <azanella@linux.vnet.ibm.com>
Tue, 30 Sep 2014 12:50:31 +0000 (30 07:50 -0500)
tree d7acc81fd96b55a4d0e5a458b6e3f2f8712ecf4d
parent e6bb56b6914e6435e251814a3a0ccd7fb65a7e36

This patch adds an optimized memset implementation for POWER8.  For
sizes from 0 to 255 bytes, it uses a word/doubleword algorithm similar
to the one in the POWER7-optimized implementation.

For sizes larger than 255 bytes, two strategies are used:

1. If the constant is nonzero, the memory is written with Altivec
   vector instructions;

2. If the constant is 0, dcbz instructions are used.  The loop is
   unrolled to clear 512 bytes at a time.
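
The size and value dispatch described above can be sketched in portable
C.  This is only an illustration, not the assembly added by this patch:
all function and macro names are hypothetical, plain stores stand in for
the Altivec and dcbz instructions, and the 128-byte block size is an
assumption (chosen so that four blocks give the 512 bytes per iteration
mentioned in item 2).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical constants; only 256 and 512 come from the commit text. */
#define SMALL_LIMIT 256   /* sizes 0..255 take the doubleword path */
#define VEC_BYTES    16   /* width of one Altivec register */
#define BLOCK_BYTES 128   /* assumed size of the block one dcbz clears */

/* Doubleword path: replicate the byte into 8 bytes, store 8 at a time. */
static void set_small(unsigned char *p, unsigned char c, size_t n)
{
    uint64_t pattern = 0x0101010101010101ULL * c;
    while (n >= 8) {
        memcpy(p, &pattern, 8);      /* stand-in for a doubleword store */
        p += 8;
        n -= 8;
    }
    while (n--)                      /* remaining 0..7 bytes */
        *p++ = c;
}

/* Nonzero large fill: 16 bytes per step, standing in for vector stores. */
static void set_vector(unsigned char *p, unsigned char c, size_t n)
{
    unsigned char v[VEC_BYTES];
    for (size_t i = 0; i < VEC_BYTES; i++)
        v[i] = c;
    while (n >= VEC_BYTES) {
        memcpy(p, v, VEC_BYTES);
        p += VEC_BYTES;
        n -= VEC_BYTES;
    }
    set_small(p, c, n);              /* tail shorter than one vector */
}

/* Stand-in for one dcbz: clear one block-aligned block. */
static void zero_block(unsigned char *p)
{
    assert((uintptr_t)p % BLOCK_BYTES == 0);
    for (size_t i = 0; i < BLOCK_BYTES; i++)
        p[i] = 0;
}

/* Zero large fill: align, then clear four blocks (512 bytes) per pass. */
static void set_zero(unsigned char *p, size_t n)
{
    size_t head = (BLOCK_BYTES - (uintptr_t)p % BLOCK_BYTES) % BLOCK_BYTES;
    if (head > n)
        head = n;
    set_small(p, 0, head);           /* bytes before the first full block */
    p += head;
    n -= head;
    while (n >= 4 * BLOCK_BYTES) {   /* unrolled loop */
        zero_block(p);
        zero_block(p + BLOCK_BYTES);
        zero_block(p + 2 * BLOCK_BYTES);
        zero_block(p + 3 * BLOCK_BYTES);
        p += 4 * BLOCK_BYTES;
        n -= 4 * BLOCK_BYTES;
    }
    while (n >= BLOCK_BYTES) {       /* leftover whole blocks */
        zero_block(p);
        p += BLOCK_BYTES;
        n -= BLOCK_BYTES;
    }
    set_small(p, 0, n);              /* tail shorter than one block */
}

/* Dispatch on size first, then on whether the fill byte is zero. */
void *sketch_memset(void *dst, int c, size_t n)
{
    unsigned char *p = dst;
    unsigned char b = (unsigned char)c;
    if (n < SMALL_LIMIT)
        set_small(p, b, n);
    else if (b != 0)
        set_vector(p, b, n);
    else
        set_zero(p, n);
    return dst;
}
```

The zero path aligns the pointer before the block loop because a
block-clearing instruction works on whole, aligned blocks; the head and
tail bytes fall back to ordinary stores.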

Using vector instructions increases throughput considerably, doubling
performance for sizes larger than 1024 bytes.  Unrolling the dcbz loop
also improves performance, doubling throughput for sizes larger than
8192 bytes.
ChangeLog
benchtests/bench-memset.c
sysdeps/powerpc/powerpc64/multiarch/Makefile
sysdeps/powerpc/powerpc64/multiarch/bzero.c
sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c
sysdeps/powerpc/powerpc64/multiarch/memset-power8.S [new file with mode: 0644]
sysdeps/powerpc/powerpc64/multiarch/memset.c
sysdeps/powerpc/powerpc64/power8/memset.S [new file with mode: 0644]