s390x: Add optimized chacha20
commit 3b56f944c5398114486d6abd60c465682b802072
author Adhemerval Zanella Netto <adhemerval.zanella@linaro.org>
Thu, 21 Jul 2022 13:05:06 +0000 (21 10:05 -0300)
committer Adhemerval Zanella <adhemerval.zanella@linaro.org>
Fri, 22 Jul 2022 14:58:27 +0000 (22 11:58 -0300)
tree e7c0b81f5be27aee611de6dad9fba15325dccb1f
parent b7060acfe8e80fe832e3227020d1127f2d971d1c

Add a vectorized ChaCha20 implementation based on libgcrypt's
cipher/chacha20-s390x.S.  The final state register clearing is
omitted.
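
For reference, the core operation that a vectorized ChaCha20
implementation applies across several lanes at once is the quarter
round from RFC 8439.  The following is only a scalar C sketch of that
operation, not the s390x assembly added by this commit; the names
rotl32 and chacha20_quarter_round are hypothetical and do not appear
in the patch.

```c
#include <stdint.h>

/* Rotate a 32-bit word left by n bits (0 < n < 32).  */
static inline uint32_t
rotl32 (uint32_t x, unsigned int n)
{
  return (x << n) | (x >> (32 - n));
}

/* One ChaCha20 quarter round on four state words (RFC 8439, section 2.1).
   Hypothetical helper for illustration; a vector implementation performs
   the same add/xor/rotate steps on several words per instruction.  */
static void
chacha20_quarter_round (uint32_t *a, uint32_t *b, uint32_t *c, uint32_t *d)
{
  *a += *b; *d ^= *a; *d = rotl32 (*d, 16);
  *c += *d; *b ^= *c; *b = rotl32 (*b, 12);
  *a += *b; *d ^= *a; *d = rotl32 (*d, 8);
  *c += *d; *b ^= *c; *b = rotl32 (*b, 7);
}
```

A full block applies this quarter round to the columns and then the
diagonals of the 4x4 state, ten times each, for 20 rounds in total.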

On a z15 it shows the following improvements (using formatted
bench-arc4random data):

GENERIC                                    MB/s
-----------------------------------------------
arc4random [single-thread]               198.92
arc4random_buf(16) [single-thread]       244.49
arc4random_buf(32) [single-thread]       282.73
arc4random_buf(48) [single-thread]       286.64
arc4random_buf(64) [single-thread]       320.06
arc4random_buf(80) [single-thread]       297.43
arc4random_buf(96) [single-thread]       310.96
arc4random_buf(112) [single-thread]      308.10
arc4random_buf(128) [single-thread]      309.90
-----------------------------------------------

VX                                         MB/s
-----------------------------------------------
arc4random [single-thread]               430.26
arc4random_buf(16) [single-thread]       735.14
arc4random_buf(32) [single-thread]      1029.99
arc4random_buf(48) [single-thread]      1206.76
arc4random_buf(64) [single-thread]      1311.92
arc4random_buf(80) [single-thread]      1378.74
arc4random_buf(96) [single-thread]      1445.06
arc4random_buf(112) [single-thread]     1484.32
arc4random_buf(128) [single-thread]     1517.30
-----------------------------------------------
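
The relative gain can be derived from the two tables by dividing each
VX figure by the matching GENERIC figure.  The snippet below is a
small sketch of that arithmetic using the numbers above; the array and
function names are hypothetical and are not part of the benchmark.

```c
#include <stddef.h>

/* Throughput in MB/s copied from the tables above, in row order:
   arc4random, then arc4random_buf(16) through arc4random_buf(128).  */
static const double generic_mbs[] = {
  198.92, 244.49, 282.73, 286.64, 320.06, 297.43, 310.96, 308.10, 309.90
};
static const double vx_mbs[] = {
  430.26, 735.14, 1029.99, 1206.76, 1311.92, 1378.74, 1445.06, 1484.32,
  1517.30
};

/* Ratio of VX throughput to GENERIC throughput for table row i.  */
static double
speedup_at (size_t i)
{
  return vx_mbs[i] / generic_mbs[i];
}
```

For these figures the ratio is roughly 2.2x for arc4random and roughly
4.9x for arc4random_buf(128).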

Checked on s390x-linux-gnu.
LICENSES
sysdeps/s390/s390-64/Makefile
sysdeps/s390/s390-64/chacha20-s390x.S [new file with mode: 0644]
sysdeps/s390/s390-64/chacha20_arch.h [new file with mode: 0644]