Save and restore vector registers in x86-64 ld.so
commit f3dcae82d54e5097e18e1d6ef4ff55c2ea4e621e
author H.J. Lu <hjl.tools@gmail.com>
Tue, 25 Aug 2015 11:33:54 +0000 (25 04:33 -0700)
committer H.J. Lu <hjl.tools@gmail.com>
Tue, 25 Aug 2015 11:34:13 +0000 (25 04:34 -0700)
tree e43395ded84bf0aa25ecbe2d395082a182aae8b7
parent 2d02fd07371bcd492c320cec649c6265787d794a

This patch adds SSE, AVX and AVX512 versions of _dl_runtime_resolve
and _dl_runtime_profile, which save and restore the first 8 vector
registers used for parameter passing.  elf_machine_runtime_setup
selects the proper _dl_runtime_resolve or _dl_runtime_profile variant
based on _dl_x86_cpu_features.  This avoids the race condition caused
by the FOREIGN_CALL macros, which are only used on x86-64.
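
The selection step described above can be sketched in plain C.  This is
a minimal illustration, not the code from dl-machine.h: the feature
bits, the trampoline_fn type, select_resolver and the resolve_*
stand-ins are all hypothetical names, whereas the real code tests
_dl_x86_cpu_features through HAS_ARCH_FEATURE and installs assembly
trampolines.

```c
#include <stddef.h>

/* Hypothetical feature bits for this sketch only; glibc itself tests
   _dl_x86_cpu_features through HAS_ARCH_FEATURE.  */
enum
{
  FEATURE_AVX = 1 << 0,
  FEATURE_AVX512F = 1 << 1
};

typedef const char *(*trampoline_fn) (void);

/* Stand-ins for the assembly trampolines; each one just returns the
   name of the variant it represents.  */
static const char *resolve_sse (void) { return "sse"; }
static const char *resolve_avx (void) { return "avx"; }
static const char *resolve_avx512 (void) { return "avx512"; }

/* Pick the widest variant the CPU supports.  SSE is the fallback
   because every x86-64 CPU provides it.  */
static trampoline_fn
select_resolver (unsigned int features)
{
  if (features & FEATURE_AVX512F)
    return resolve_avx512;
  if (features & FEATURE_AVX)
    return resolve_avx;
  return resolve_sse;
}
```

In this sketch the choice is made once from the feature word, so the
chosen variant already knows which registers to save and needs no
per-thread flag at call time.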

The performance impact of saving and restoring 8 vector registers is
negligible on Nehalem, Sandy Bridge, Ivy Bridge and Haswell when
ld.so is optimized with SSE2.

[BZ #15128]
* sysdeps/x86_64/Makefile [$(subdir) == elf] (tests): Add
ifuncmain8.
(modules-names): Add ifuncmod8.
($(objpfx)ifuncmain8): New rule.
* sysdeps/x86_64/dl-machine.h: Include <dl-procinfo.h> and
<cpuid.h>.
(elf_machine_runtime_setup): Use _dl_runtime_resolve_sse,
_dl_runtime_resolve_avx, or _dl_runtime_resolve_avx512,
_dl_runtime_profile_sse, _dl_runtime_profile_avx, or
_dl_runtime_profile_avx512, based on HAS_ARCH_FEATURE.
* sysdeps/x86_64/dl-trampoline.S: Rewrite.
* sysdeps/x86_64/dl-trampoline.h: Likewise.
* sysdeps/x86_64/ifuncmain8.c: New file.
* sysdeps/x86_64/ifuncmod8.c: Likewise.
* sysdeps/x86_64/nptl/tcb-offsets.sym (RTLD_SAVESPACE_SSE):
Removed.
* sysdeps/x86_64/nptl/tls.h (__128bits): Removed.
(tcbhead_t): Change rtld_must_xmm_save to __glibc_unused1.
Change rtld_savespace_sse to __glibc_unused2.
(RTLD_CHECK_FOREIGN_CALL): Removed.
(RTLD_ENABLE_FOREIGN_CALL): Likewise.
(RTLD_PREPARE_FOREIGN_CALL): Likewise.
(RTLD_FINALIZE_FOREIGN_CALL): Likewise.
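
The tcbhead_t entries above rename two fields instead of deleting them.
A plausible reason, which this commit message does not state, is to
keep the size and field offsets of the structure unchanged.  The sketch
below shows how such a rename can be checked at compile time.  The
struct names, the vec128_t type and the field types are simplified
stand-ins and not glibc's actual definitions.

```c
#include <stddef.h>

/* Hypothetical 128-bit storage type for this sketch.  */
typedef struct { int i[4]; } vec128_t;

/* Simplified stand-in for the layout before the change.  */
struct tcb_before
{
  int rtld_must_xmm_save;
  _Alignas (32) vec128_t rtld_savespace_sse[8][4];
};

/* Simplified stand-in for the layout after the change: the same
   types and alignment, only the names differ.  */
struct tcb_after
{
  int unused1;
  _Alignas (32) vec128_t unused2[8][4];
};

/* Renaming alone must not move any field or change the total size.  */
_Static_assert (sizeof (struct tcb_before) == sizeof (struct tcb_after),
                "size changed");
_Static_assert (offsetof (struct tcb_before, rtld_savespace_sse)
                == offsetof (struct tcb_after, unused2),
                "save-area offset changed");
```

If either assertion failed, compilation would stop, which makes an
accidental layout change visible at build time.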
ChangeLog
sysdeps/x86_64/Makefile
sysdeps/x86_64/dl-machine.h
sysdeps/x86_64/dl-trampoline.S
sysdeps/x86_64/dl-trampoline.h
sysdeps/x86_64/ifuncmain8.c [new file with mode: 0644]
sysdeps/x86_64/ifuncmod8.c [new file with mode: 0644]
sysdeps/x86_64/nptl/tcb-offsets.sym
sysdeps/x86_64/nptl/tls.h