arm: vp8: Optimize put_epel16_h6v6 with vp8_epel8_v6_y2
commit cef914e08310166112ac09567e66452a7679bfc8
author Martin Storsjö <martin@martin.st>
Fri, 1 Feb 2019 09:05:22 +0000 (11:05 +0200)
committer Martin Storsjö <martin@martin.st>
Tue, 19 Feb 2019 09:46:18 +0000 (11:46 +0200)
tree 5e4481bc3c700c05aa8706271672cd34d7abadc1
parent e39a9212ab37a55b346801c77487d8a47b6f9fe2

This makes it similar to put_epel16_v6, and gives a speedup of
roughly 10-25% for this function on Cortex A7, A8, A9 and A53,
and about 4-5% on Cortex A72.
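
As a rough illustration of why a "y2" style vertical filter can help, the sketch below computes two adjacent output rows from one shared window of seven input rows, instead of reading six rows per output row. It is not the NEON code from vp8dsp_neon.S: the function names are hypothetical, and the kernel, the rounding (+64, >> 7) and the 8-bit clamp are assumptions chosen for the example.

```python
# Illustrative sketch only; names, kernel and rounding are assumptions,
# not taken from libavcodec/arm/vp8dsp_neon.S.
TAPS = (3, -16, 77, 77, -16, 3)  # assumed example 6-tap kernel, sums to 128


def clip_u8(x):
    """Clamp a value to the 8-bit pixel range."""
    return max(0, min(255, x))


def filter_px(samples):
    """Apply the 6-tap kernel to six vertically adjacent samples."""
    acc = sum(t * s for t, s in zip(TAPS, samples))
    return clip_u8((acc + 64) >> 7)


def v6_one_row(rows, y):
    """Output row y from input rows y..y+5 (six input rows per output row)."""
    window = rows[y:y + 6]
    return [filter_px(col) for col in zip(*window)]


def v6_two_rows(rows, y):
    """Output rows y and y+1 from input rows y..y+6 (seven rows, five shared)."""
    window = rows[y:y + 7]
    cols = list(zip(*window))
    out0 = [filter_px(col[0:6]) for col in cols]
    out1 = [filter_px(col[1:7]) for col in cols]
    return out0, out1


def filter_block(rows, height):
    """Filter `height` output rows (even), two at a time."""
    out = []
    for y in range(0, height, 2):
        r0, r1 = v6_two_rows(rows, y)
        out += [r0, r1]
    return out
```

Here `rows` must hold `height + 5` input rows. Both variants produce identical pixels; the two-row version only changes how much input is shared per step.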

Before:                   Cortex A7       A8       A9      A53     A72
vp8_put_epel16_h6v6_neon:    3058.0   2218.5   2459.8   2183.0  1572.2
After:
vp8_put_epel16_h6v6_neon:    2670.8   1934.2   2244.4   1729.4  1503.9
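
The per-core gain can be checked directly from the two rows above. The snippet below is a small sketch that assumes lower numbers are better and defines speedup as before / after - 1; the variable and function names are made up for this example.

```python
# Figures copied from the benchmark table; names are illustrative.
CORES = ("A7", "A8", "A9", "A53", "A72")
BEFORE = (3058.0, 2218.5, 2459.8, 2183.0, 1572.2)
AFTER = (2670.8, 1934.2, 2244.4, 1729.4, 1503.9)


def speedup_percent(before, after):
    """Relative speedup in percent, assuming lower timings are better."""
    return (before / after - 1.0) * 100.0


speedups = {
    core: speedup_percent(b, a) for core, b, a in zip(CORES, BEFORE, AFTER)
}

for core, pct in speedups.items():
    print(f"Cortex {core}: {pct:.1f}% faster")
```

With this definition the largest gain is on the A53 and the smallest on the A72.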

Signed-off-by: Martin Storsjö <martin@martin.st>
libavcodec/arm/vp8dsp_neon.S