aarch64: Fix AdvSIMD libmvec routines for big-endian
commit 90a6ca8b28bf34e361e577e526e1b0f4c39a32a5
author Joe Ramsay <Joe.Ramsay@arm.com>
Thu, 2 May 2024 15:43:13 +0000 (16:43 +0100)
committer Szabolcs Nagy <szabolcs.nagy@arm.com>
Tue, 14 May 2024 12:10:33 +0000 (13:10 +0100)
tree 69830b0b2204a585bcca976208ae412543c19dc1
parent ec6ed525f1aa24fd38ea5153e88d14d92d0d2f82

Previously, many routines used * to load from vector types stored
in the data table. This is emitted as ldr, which byte-swaps the
entire vector register, and causes bugs for big-endian when not
all lanes contain the same value. When a vector is to be used
this way, it has been replaced with an array and the load with an
explicit ld1 intrinsic, which byte-swaps only within lanes.
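
The difference between the two loads can be sketched with a small,
portable byte-level model. This is only an illustration and not code
from the patch: the model_* names are hypothetical, the register is
modelled as 16 bytes with reg[0] as the least significant byte, and
lanes are assumed to be 64 bits wide. It simplifies what the hardware
does on a big-endian target to show why the lanes end up in a
different order.

```c
#include <stdint.h>

enum { VEC_BYTES = 16, LANE_BYTES = 8 };

/* Hypothetical helper: write two 64-bit elements to memory in
   big-endian byte order, element 0 at the lowest address.  */
void model_store_table_be(uint64_t e0, uint64_t e1, uint8_t mem[VEC_BYTES])
{
  uint64_t e[2] = { e0, e1 };
  for (int lane = 0; lane < 2; lane++)
    for (int b = 0; b < LANE_BYTES; b++)
      mem[lane * LANE_BYTES + b]
        = (uint8_t) (e[lane] >> (8 * (LANE_BYTES - 1 - b)));
}

/* Model of a big-endian ldr of a whole vector register: the 16 bytes
   are treated as one 128-bit value, so the byte order is reversed
   across the entire register.  */
void model_ldr_be(const uint8_t mem[VEC_BYTES], uint8_t reg[VEC_BYTES])
{
  for (int i = 0; i < VEC_BYTES; i++)
    reg[i] = mem[VEC_BYTES - 1 - i];
}

/* Model of a big-endian ld1 with 64-bit elements: bytes are reversed
   only within each lane, so element order is preserved.  */
void model_ld1_be(const uint8_t mem[VEC_BYTES], uint8_t reg[VEC_BYTES])
{
  for (int lane = 0; lane < VEC_BYTES / LANE_BYTES; lane++)
    for (int b = 0; b < LANE_BYTES; b++)
      reg[lane * LANE_BYTES + b]
        = mem[lane * LANE_BYTES + (LANE_BYTES - 1 - b)];
}

/* Read one 64-bit lane out of the modelled register.  */
uint64_t model_get_lane(const uint8_t reg[VEC_BYTES], int lane)
{
  uint64_t v = 0;
  for (int b = LANE_BYTES - 1; b >= 0; b--)
    v = (v << 8) | reg[lane * LANE_BYTES + b];
  return v;
}
```

In this model, a table holding two different elements comes back with
its lanes swapped after model_ldr_be and in the original order after
model_ld1_be. When both elements are equal the two loads agree, which
matches the observation that the bug only shows up when not all lanes
contain the same value.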

In addition, many routines previously used non-standard GCC syntax
for vector operations, such as indexing into vector types with []
and assembling vectors using {}. This syntax should not be mixed
with ACLE, as the former does not respect endianness whereas the
latter does. Such examples have been replaced with, for instance,
vcombine_* and vgetq_lane* intrinsics. Helpers which only use the
GCC syntax, such as the v_call helpers, do not need changing as
they do not use intrinsics.
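
As a rough illustration of the intrinsic-only style, the sketch below
loads a two-element table with vld1q_f64, assembles a vector with
vcombine_f64 and reads lanes back with vgetq_lane_f64. It is not taken
from any of the changed files: the function name weighted_lanes and the
arithmetic are hypothetical, and the scalar branch exists only so the
example also builds on targets other than AArch64.

```c
#if defined(__aarch64__)
#include <arm_neon.h>

/* Hypothetical example: computes c[0] * lo + 10 * c[1] * hi using
   only ACLE intrinsics, so lane numbering is the same on little- and
   big-endian targets.  */
double weighted_lanes(const double c[2], double lo, double hi)
{
  /* Explicit ld1 load from an array instead of dereferencing a
     vector-typed table entry.  */
  float64x2_t table = vld1q_f64(c);

  /* Assemble a vector from two halves instead of using {lo, hi}.  */
  float64x2_t v = vcombine_f64(vdup_n_f64(lo), vdup_n_f64(hi));

  float64x2_t p = vmulq_f64(table, v);

  /* Extract lanes with an intrinsic instead of p[0] and p[1].  */
  return vgetq_lane_f64(p, 0) + 10.0 * vgetq_lane_f64(p, 1);
}
#else
/* Scalar stand-in so the sketch compiles on other architectures.  */
double weighted_lanes(const double c[2], double lo, double hi)
{
  return c[0] * lo + 10.0 * (c[1] * hi);
}
#endif
```

The factor of 10 is there only to make the two lanes distinguishable in
the result, so that a lane swap would change the returned value.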

Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
17 files changed:
sysdeps/aarch64/fpu/asinh_advsimd.c
sysdeps/aarch64/fpu/cosh_advsimd.c
sysdeps/aarch64/fpu/erf_advsimd.c
sysdeps/aarch64/fpu/erfc_advsimd.c
sysdeps/aarch64/fpu/erfcf_advsimd.c
sysdeps/aarch64/fpu/erff_advsimd.c
sysdeps/aarch64/fpu/exp10f_advsimd.c
sysdeps/aarch64/fpu/expm1_advsimd.c
sysdeps/aarch64/fpu/expm1f_advsimd.c
sysdeps/aarch64/fpu/log10_advsimd.c
sysdeps/aarch64/fpu/log2_advsimd.c
sysdeps/aarch64/fpu/log_advsimd.c
sysdeps/aarch64/fpu/sinh_advsimd.c
sysdeps/aarch64/fpu/tan_advsimd.c
sysdeps/aarch64/fpu/tanf_advsimd.c
sysdeps/aarch64/fpu/v_expf_inline.h
sysdeps/aarch64/fpu/v_expm1f_inline.h