Public Git Hosting - glibc.git/log - sysdeps/unix/sysv/linux/i386/fallocate64.c

locale: Handle loading a missing locale twice (Bug 14247)

Delay setting file->decided until the data has been successfully loaded
by _nl_load_locale().  If the function fails to load the data then we
must return and error and leave decided untouched to allow the caller to
attempt to load the data again at a later time.  We should not set
decided to 1 early in the function since doing so may prevent attempting
to load it again. We want to try loading it again because that allows an
open to fail and set errno correctly.

On the other side of this problem is that if we are called again with
the same inputs we will fetch the cached version of the object and carry
out no open syscalls and that fails to set errno so we must set errno to
ENOENT in that case.  There is a second code path that has to be handled
where the name of the locale matches but the codeset doesn't match.

These changes ensure that errno is correctly set on failure in all the
return paths in _nl_find_locale().

Adds tst-locale-loadlocale to cover the bug.

No regressions on x86_64.

Co-authored-by: Jeff Law <law@redhat.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

elf: Do not check for loader mmap on tst-decorate-maps (BZ 31553)

On some architectures and depending on the page size, the loader can
also allocate some memory during dependencies loading and it will be
marked as 'loader malloc'. However, if the system page size is
large enough, the initial data page will be enough for all required
allocation and there will be no extra loader mmap. To avoid false
negatives, the test does not check for such pages.

Checked on powerpc64le-linux-gnu with 64k pagesize.
Reviewed-by: Simon Chopin <simon.chopin@canonical.com>

Use --enable-obsolete in build-many-glibcs.py for nios2-linux-gnu

Until GCC removes Nios II support (at which point we should do so as
well), this is now needed for GCC 14 / mainline to build for
nios2-linux-gnu target.

Tested with build-many-glibcs.py (GCC mainline) for nios2-linux-gnu.

login: Use unsigned 32-bit types for seconds-since-epoch

These fields store timestamps when the system was running. No Linux
systems existed before 1970, so these values are unused. Switching
to unsigned types allows continued use of the existing struct layouts
beyond the year 2038.

The intent is to give distributions more time to switch to improved
interfaces that also avoid locking/data corruption issues.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

login: structs utmp, utmpx, lastlog _TIME_BITS independence (bug 30701)

These structs describe file formats under /var/log, and should not
depend on the definition of _TIME_BITS. This is achieved by
defining __WORDSIZE_TIME64_COMPAT32 to 1 on 32-bit ports that
support 32-bit time_t values (where __time_t is 32 bits).

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

login: Check default sizes of structs utmp, utmpx, lastlog

The default <utmp-size.h> is for ports with a 64-bit time_t.
Ports with a 32-bit time_t or with __WORDSIZE_TIME64_COMPAT32=1
need to override it.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

benchtests: Add random() benchmark

Add a simple benchmark to measure the overhead of internal libc locks in
the random() implementation on both single- and multi-threaded cases.
This relies on the implementation of random using internal locks to
access shared global data, and that the runtime uses multi-threaded
locking once a thread has been created (even after it finishes).

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

advisories: Add Reported-By

Add a new tag to give credit to vulnerability discoverers.

Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>

Fix 'Reported-By' to use Camel Case for commit 6a98f4640ea453f

Document CVE-2024-2961

This commit adds "advisories" entries for the above three CVEs.

iconv: ISO-2022-CN-EXT: fix out-of-bound writes when writing escape sequence (CVE-2024-2961)

ISO-2022-CN-EXT uses escape sequences to indicate character set changes
(as specified by RFC 1922). While the SOdesignation has the expected
bounds checks, neither SS2designation nor SS3designation have its;
allowing a write overflow of 1, 2, or 3 bytes with fixed values:
'$+I', '$+J', '$+K', '$+L', '$+M', or '$*H'.

Checked on aarch64-linux-gnu.

Co-authored-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>

elf/rtld: Count skipped environment variables for enable_secure

When using the glibc.rtld.enable_secure tunable we need to keep track of
the count of environment variables we skip due to __libc_enable_secure
being set and adjust the auxv section of the stack. This fixes an
assertion when running ld.so directly with glibc.rtld.enable_secure set.
Add a testcase that ensures the assert is not hit.

elf/rtld.c:1324 assert (auxv == sp + 1);

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

powerpc: Fix ld.so address determination for PCREL mode (bug 31640)

This seems to have stopped working with some GCC 14 versions,
which clobber r2. With other compilers, the kernel-provided
r2 value is still available at this point.

Reviewed-by: Peter Bergner <bergner@linux.ibm.com>

Revert "x86_64: Suppress false positive valgrind error"

This reverts commit a1735e0aa858f0c8b15e5ee9975bff4279423680.

The test failure is a real valgrind bug that needs to be fixed before
valgrind is usable with a glibc that has been built with
CC="gcc -march=x86-64-v3". The proposed valgrind patch teaches
valgrind to replace ld.so strcmp with an unoptimized scalar
implementation, thus avoiding any AVX2-related problems.

Valgrind bug: <https://bugs.kde.org/show_bug.cgi?id=485487>

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>

wcsmbs: Ensure wcstr worst-case linear execution time (BZ 23865)

It uses the same two-way algorithm used on strstr, strcasestr, and
memmem. Different than strstr, neither the "shift table" optimization
nor the self-adapting filtering check is used because it would result in
a too-large shift table (and it also simplifies the implementation bit).

Checked on x86_64-linux-gnu and aarch64-linux-gnu.
Reviewed-by: DJ Delorie <dj@redhat.com>

wcsmbs: Add test-wcsstr

Parametrize test-strstr.c so it can be used to check wcsstr.

Checked on x86_64-linux-gnu and aarch64-linux-gnu.

Reviewed-by: DJ Delorie <dj@redhat.com>

posix: Sync tempname with gnulib

The gnulib version contains an important change (9ce573cde), which
fixes some problems with multithreading, entropy loss, and ASLR leak
nfo.  It also fixes an issue where getrandom is not being used
on some new files generation (only for __GT_NOCREATE on first try).

The 044bf893ac removed __path_search, which is now moved to another
gnulib shared files (stdio-common/tmpdir.{c,h}).  Tthis patch
also fixes direxists to use __stat64_time64 instead of __xstat64,
and move the include of pathmax.h for !_LIBC (since it is not used
by glibc).  The license is also changed from GPL 3.0 to 2.1, with
permission from the authors (Bruno Haible and Paul Eggert).

The sync also removed the clock fallback, since clock_gettime
with CLOCK_REALTIME is expected to always succeed.

It syncs with gnulib commit 323834962817af7b115187e8c9a833437f8d20ec.

Checked on x86_64-linux-gnu.

Co-authored-by: Bruno Haible <bruno@clisp.org>
Co-authored-by: Paul Eggert <eggert@cs.ucla.edu>
Reviewed-by: Bruno Haible <bruno@clisp.org>

socket: Add new test for connect

This commit adds a simple bind/accept/connect test for an IPv4 TCP
connection to a local process via the loopback interface.

Reviewed-by: Arjun Shankar <arjun@redhat.com>

libsupport: Add xgetpeername

The patch adds redirections for getpeername.

Reviewed-by: Arjun Shankar <arjun@redhat.com>

nptl: Add tst-pthread-key1-static for BZ #21777

Add a static pthread static tests to verify that BZ #21777 is fixed.

Reviewed-by: Florian Weimer <fweimer@redhat.com>

elf: Add ld.so test with non-existing program name

None of the existing tests seem to cover the case where
_dl_signal_error is called without an active error handler.
The new elf/tst-rtld-does-not-exist test triggers such a
_dl_signal_error call from _dl_map_object.

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>

elf: Check objname before calling fatal_error

_dl_signal_error may be called with objname == NULL.  _dl_exception_create
checks objname == NULL.  But fatal_error doesn't.  Check objname before
calling fatal_error.  This fixes BZ #31596.
Reviewed-by: Sunil K Pandey <skpgkp2@gmail.com>

Use crtbeginT.o and crtend.o for non-PIE static executables

When static PIE is enabled by default, we shouldn't use crtbeginS.o and
crtendS.o for non-PIE static executables. Check $($(@F)-no-pie) to use
crtbeginT.o and crtend.o to create non-PIE static executables.
Reviewed-by: Sunil K Pandey <skpgkp2@gmail.com>

aarch64: Enhanced CPU diagnostics for ld.so

This prints some information from struct cpu_features, and the midr_el1
and dczid_el0 system register contents on every CPU.

Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>

x86: Add generic CPUID data dumper to ld.so --list-diagnostics

This is surprisingly difficult to implement if the goal is to produce
reasonably sized output. With the current approaches to output
compression (suppressing zeros and repeated results between CPUs,
folding ranges of identical subleaves, dealing with the %ecx
reflection issue), the output is less than 600 KiB even for systems
with 256 logical CPUs.

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>

elf: Add CPU iteration support for future use in ld.so diagnostics

Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>

timezone: sync to TZDB 2024a

Sync tzselect, zdump, zic to TZDB 2024a.
This patch incorporates the following TZDB source code changes,
listed roughly in descending order of importance.

  zic now supports links to links, needed for future tzdata
  zic now defaults to '-b slim'
  zic now updates output files atomically
  zic has new options -R, -l -, -p -
  zic -r now uses -00 for unspecified timestamps
  zdump now uses [lo,hi) for both -c and -t
  Fix several integer overflow bugs
  zic now checks input bytes more carefully
  Simplify and fix new TZDIR setup
  Default time_t to 64 bits on glibc 2.34+ 32-bit
  zic now generates TZ strings that conform to POSIX when all-year DST
  zic -v now shows extreme-int tm_year transitions
  Fix zic bug in last time type of Asia/Gaza etc.
  Fix zic bug with Palestine after 2075
  Fix bug uncovered by recent change to Iran history
  Fix 'zic -b fat' bug with Port Moresby 32-bit data
  Fix zic bug with -r @X where X is deduced from TZ
  Fix bug with zic -r cutoff before 1st transition
  Fix leap second expiry and truncation
  Fix zic bug on Linux 2.6.16 and 2.6.17
  Fix bug with 'zic -d /a/b/c' if /a is unwriteable
  Don't mistruncate TZif files at leap seconds
  Fix zdump undefined behavior if !USE_LTZ
  zdump -v reports localtime+gmtime failures better
  Fix zdump diagnostic for missing timezone
  Don't assume nonempty argv
  Port better to C23
  Do not assume negative >> behavior
  I18nize zdump a bit better
  Port zdump to right_only installations
  New tzselect menu option 'now'
  tzselect can now use current time to help choose
  Improve tzselect behavior for Turkey etc.
  tzselect: do not create temporary files
  tzselect: work around mawk bug with {2,}
  tzselect: Port to POSIX awk, which prohibits -v newlines
  Do not use empty RE in tzselect
  Don't set TZ in tzselect
  Avoid sed, expr in tzselect
  tzselect: Fix problems with spaces in TZDIR
  Improve tzselect diagnostics
  Remove zic workaround for Qt bug 53071
  Remove zic support for "min" in Rule lines
  Remove zic support for zic -y, Rule TYPEs, pacificnew
  Remove tzselect workaround for Bash 1.14.7 bug

* SHARED-FILES: Update to match current sync.
* config.h.in (HAVE_STRERROR): Remove; no longer needed.
* timezone/Makefile ($(objpfx)zic.o): Depend on tzdir.h.
($(objpfx)tzdir.h): New rule to build a placeholder.
* timezone/private.h, timezone/tzfile.h, timezone/version:
* timezone/zdump.c, timezone/zic.c: Copy verbatim from TZDB 2024a.

Fix bsearch, qsort doc to match POSIX better

* manual/search.texi (Array Search Function):
Correct the statement about lfind’s mean runtime:
it is proportional to a number (not that number),
and this is true only if random elements are searched for.
Relax the constraint on bsearch’s array argument:
POSIX says it need not be sorted, only partially sorted.
Say that the first arg passed to bsearch’s comparison function
is the key, and the second arg is an array element, as
POSIX requires. For bsearch and qsort, say that the
comparison function should not alter the array, as POSIX
requires. For qsort, say that the comparison function
must define a total order, as POSIX requires, that
it should not depend on element addresses, that
the original array index can be used for stable sorts,
and that if qsort still works if memory allocation fails.
Be more consistent in calling the array elements
“elements” rather than “objects”.

Co-authored-by: Zack Weinberg <zack@owlfolio.org>

x86-64: Exclude FMA4 IFUNC functions for -mapxf

When -mapxf is used to build glibc, the resulting glibc will never run
on FMA4 machines. Exclude FMA4 IFUNC functions when -mapxf is used.
This requires GCC which defines __APX_F__ for -mapxf with commit:

1df56719bd8 x86: Define __APX_F__ for -mapxf

Reviewed-by: Sunil K Pandey <skpgkp2@gmail.com>

Reinstate generic features-time64.h

The a4ed0471d7 removed the generic version which is included by
features.h and used by Hurd.

Checked by building i686-gnu and x86_64-gnu with build-many-glibc.py.

Cleanup __tls_get_addr on alpha/microblaze localplt.data

They are not required.

Checked with a make check for both ABIs.

arm: Remove ld.so __tls_get_addr plt usage

Use the hidden alias instead.

Checked on arm-linux-gnueabihf.

aarch64: Remove ld.so __tls_get_addr plt usage

Use the hidden alias instead.

Checked on aarch64-linux-gnu.

math: x86 trunc traps when FE_INEXACT is enabled (BZ 31603)

The implementations of trunc functions using x87 floating point (i386 and
x86_64 long double only) traps when FE_INEXACT is enabled. Although
this is a GNU extension outside the scope of the C standard, other
architectures that also support traps do not show this behavior.

The fix moves the implementation to a common one that holds any
exceptions with a 'fnclex' (libc_feholdexcept_setround_387).

Checked on x86_64-linux-gnu and i686-linux-gnu.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>

math: x86 floor traps when FE_INEXACT is enabled (BZ 31601)

The implementations of floor functions using x87 floating point (i386 and
86_64 long double only) traps when FE_INEXACT is enabled. Although
this is a GNU extension outside the scope of the C standard, other
architectures that also support traps do not show this behavior.

The fix moves the implementation to a common one that holds any
exceptions with a 'fnclex' (libc_feholdexcept_setround_387).

Checked on x86_64-linux-gnu and i686-linux-gnu.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>

math: x86 ceill traps when FE_INEXACT is enabled (BZ 31600)

The implementations of ceil functions using x87 floating point (i386 and
x86_64 long double only) traps when FE_INEXACT is enabled. Although
this is a GNU extension outside the scope of the C standard, other
architectures that also support traps do not show this behavior.

The fix moves the implementation to a common one that holds any
exceptions with a 'fnclex' (libc_feholdexcept_setround_387).

Checked on x86_64-linux-gnu and i686-linux-gnu.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>

aarch64/fpu: Add vector variants of erfc

Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>

aarch64/fpu: Add vector variants of tanh

Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>

aarch64/fpu: Add vector variants of sinh

Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>

aarch64/fpu: Add vector variants of atanh

Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>

aarch64/fpu: Add vector variants of asinh

Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>

aarch64/fpu: Add vector variants of acosh

Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>

aarch64/fpu: Add vector variants of cosh

Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>

aarch64/fpu: Add vector variants of erf

Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>

misc: Add support for Linux uio.h RWF_NOAPPEND flag

In Linux 6.9 a new flag is added to allow for Per-io operations to
disable append mode even if a file was opened with the flag O_APPEND.
This is done with the new RWF_NOAPPEND flag.

This caused two test failures as these tests expected the flag 0x00000020
to be unused.  Adding the flag definition now fixes these tests on Linux
6.9 (v6.9-rc1).

  FAIL: misc/tst-preadvwritev2
  FAIL: misc/tst-preadvwritev64v2

This patch adds the flag, adjusts the test and adds details to
documentation.

Link: https://lore.kernel.org/all/20200831153207.GO3265@brightrain.aerifal.cx/
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

manual: significand() uses FLT_RADIX, not 2

It's implemented using scalb(), which uses FLT_RADIX, AFAIK.

Link: <https://lore.kernel.org/linux-man/ZeYKUOKYS7G90SaV@debian/T/#mf21ab57e16b92eb6be6c7df79dc0eb43d4454056>
Reported-by: Morten Welinder <mwelinder@gmail.com>
Cc: Adhemerval Zanella Netto <adhemerval.zanella@linaro.org>
Cc: Vincent Lefevre <vincent@vinc17.net>
Cc: DJ Delorie <dj@redhat.com>
Cc: Paul Zimmermann <Paul.Zimmermann@inria.fr>
Cc: Andreas Schwab <schwab@linux-m68k.org>
Signed-off-by: Alejandro Colomar <alx@kernel.org>
Reviewed-by: DJ Delorie <dj@redhat.com>

manual: Clarify return value of cbrt(3)

Link: <https://lore.kernel.org/linux-man/ZeYKUOKYS7G90SaV@debian/T/#mff0ab388000c6afdb5e5162804d4a0073de481de>
Reported-by: Morten Welinder <mwelinder@gmail.com>
Cowritten-by: Morten Welinder <mwelinder@gmail.com>
Cc: Adhemerval Zanella Netto <adhemerval.zanella@linaro.org>
Cc: Vincent Lefevre <vincent@vinc17.net>
Cc: DJ Delorie <dj@redhat.com>
Cc: Paul Zimmermann <Paul.Zimmermann@inria.fr>
Cc: Andreas Schwab <schwab@linux-m68k.org>
Signed-off-by: Alejandro Colomar <alx@kernel.org>
Reviewed-by: DJ Delorie <dj@redhat.com>

manual: floor(log2(fabs(x))) has rounding errors

Link: <https://inbox.sourceware.org/libc-alpha/20240305150131.GD3653@qaa.vinc17.org/T/#m3ceecda630012995339bcc5448fee451cf277a8b>
Reported-by: Vincent Lefevre <vincent@vinc17.net>
Suggested-by: Vincent Lefevre <vincent@vinc17.net>
Reviewed-by: DJ Delorie <dj@redhat.com>
Cc: Morten Welinder <mwelinder@gmail.com>
Cc: Adhemerval Zanella Netto <adhemerval.zanella@linaro.org>
Cc: Paul Zimmermann <Paul.Zimmermann@inria.fr>
Cc: Andreas Schwab <schwab@linux-m68k.org>
Signed-off-by: Alejandro Colomar <alx@kernel.org>

manual: logb(x) is floor(log2(fabs(x)))

log2(3) doesn't accept negative input, but it seems logb(3) does accept
it.

Link: <https://lore.kernel.org/linux-man/ZeYKUOKYS7G90SaV@debian/T/#u>
Reported-by: Morten Welinder <mwelinder@gmail.com>
Reviewed-by: DJ Delorie <dj@redhat.com>
Cc: Adhemerval Zanella Netto <adhemerval.zanella@linaro.org>
Cc: Vincent Lefevre <vincent@vinc17.net>
Cc: Paul Zimmermann <Paul.Zimmermann@inria.fr>
Cc: Andreas Schwab <schwab@linux-m68k.org>
Signed-off-by: Alejandro Colomar <alx@kernel.org>

powerpc: Add missing arch flags on rounding ifunc variants

The ifunc variants now uses the powerpc implementation which in turn
uses the compiler builtin. Without the proper -mcpu switch the builtin
does not generate the expected optimization.

Checked on powerpc-linux-gnu.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Peter Bergner <bergner@linux.ibm.com>

math: Reformat Makefile.

Reflow all long lines adding comment terminators.
Sort all reflowed text using scripts/sort-makefile-lines.py.

No code generation changes observed in binary artifacts.
No regressions on x86_64 and i686.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>

Always define __USE_TIME_BITS64 when 64 bit time_t is used

It was raised on libc-help [1] that some Linux kernel interfaces expect
the libc to define __USE_TIME_BITS64 to indicate the time_t size for the
kABI. Different than defined by the initial y2038 design document [2],
the __USE_TIME_BITS64 is only defined for ABIs that support more than
one time_t size (by defining the _TIME_BITS for each module).

The 64 bit time_t redirects are now enabled using a different internal
define (__USE_TIME64_REDIRECTS). There is no expected change in semantic
or code generation.

Checked on x86_64-linux-gnu, i686-linux-gnu, aarch64-linux-gnu, and
arm-linux-gnueabi

[1] https://sourceware.org/pipermail/libc-help/2024-January/006557.html
[2] https://sourceware.org/glibc/wiki/Y2038ProofnessDesign

Reviewed-by: DJ Delorie <dj@redhat.com>

benchtests: Improve benchtests for strstr

Use same strategy as bench-strstr.c (93eebae5168e5cf2 and 80b2bfb53504)
and use json_ctx for output to help standardize format across all
benchtests.
Reviewed-by: Arjun Shankar <arjun@redhat.com>

x86_64: Remove avx512 strstr implementation

As indicated in a recent thread, this it is a simple brute-force
algorithm that checks the whole needle at a matching character pair
(and does so 1 byte at a time after the first 64 bytes of a needle).
Also it never skips ahead and thus can match at every haystack
position after trying to match all of the needle, which generic
implementation avoids.

As indicated by Wilco, a 4x larger needle and 16x larger haystack gives
a clear 65x slowdown both basic_strstr and __strstr_avx512:

  "ifuncs": ["basic_strstr", "twoway_strstr", "__strstr_avx512",
"__strstr_sse2_unaligned", "__strstr_generic"],

    {
     "len_haystack": 65536,
     "len_needle": 1024,
     "align_haystack": 0,
     "align_needle": 0,
     "fail": 1,
     "desc": "Difficult bruteforce needle",
     "timings": [4.0948e+07, 15094.5, 3.20818e+07, 108558, 10839.2]
    },
    {
     "len_haystack": 1048576,
     "len_needle": 4096,
     "align_haystack": 0,
     "align_needle": 0,
     "fail": 1,
     "desc": "Difficult bruteforce needle",
     "timings": [2.69767e+09, 100797, 2.08535e+09, 495706, 82666.9]
    }

PS: I don't have an AVX512 capable machine to verify this issues, but
    skimming through the code it does seems to follow what Wilco has
    described.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>

signal: Avoid system signal disposition to interfere with tests

Both tst-sigset2 and tst-signal1 expectes that SIGINT disposition
is set to SIG_DFL.

RISC-V: Fix the static-PIE non-relocated object check

The value of l_scope is only valid post relocation, so this original
check was triggering undefined behavior. Instead just directly check to
see if the object has been relocated, at which point using l_scope is
safe.

Reported-by: Andreas Schwab <schwab@suse.de>
Closes: BZ #31317
Fixes: e0590f41fe ("RISC-V: Enable static-pie.")
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>

htl: Implement some support for TLS_DTV_AT_TP

Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-ID: <20240323173301.151066-19-bugaevc@gmail.com>

htl: Respect GL(dl_stack_flags) when allocating stacks

Previously, HTL would always allocate non-executable stacks. This has
never been noticed, since GNU Mach on x86 ignores VM_PROT_EXECUTE and
makes all pages implicitly executable. Since GNU Mach on AArch64
supports non-executable pages, HTL forgetting to pass VM_PROT_EXECUTE
immediately breaks any code that (unfortunately, still) relies on
executable stacks.

Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-ID: <20240323173301.151066-7-bugaevc@gmail.com>

hurd: Use the RETURN_ADDRESS macro

This gives us PAC stripping on AArch64.

Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-ID: <20240323173301.151066-6-bugaevc@gmail.com>

hurd: Disable Prefer_MAP_32BIT_EXEC on non-x86_64 for now

While we could support it on any architecture, the tunable is currently
only defined on x86_64.

Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-ID: <20240323173301.151066-5-bugaevc@gmail.com>

Allow glibc to be compiled without EXEC_PAGESIZE

We would like to avoid statically defining any specific page size on
aarch64-gnu, and instead make sure that everything uses the dynamic
page size, available via vm_page_size and GLRO(dl_pagesize).

There are currently a few places in glibc that require EXEC_PAGESIZE
to be defined. Per Roland's suggestion [0], drop the static
GLRO(dl_pagesize) initializers (for now, only if EXEC_PAGESIZE is not
defined), and don't require EXEC_PAGESIZE definition for libio to
enable mmap usage.

[0]: https://mail.gnu.org/archive/html/bug-hurd/2011-10/msg00035.html

Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-ID: <20240323173301.151066-4-bugaevc@gmail.com>

hurd: Stop relying on VM_MAX_ADDRESS

We'd like to avoid committing to a specific size of virtual address
space (i.e. the value of VM_AARCH64_T0SZ) on AArch64.  While the current
version of GNU Mach still exports VM_MAX_ADDRESS for compatibility, we
should try to avoid relying on it when we can.  This piece of logic in
_hurdsig_getenv () doesn't actually care about the size of user-
accessible virtual address space, it just wants to preempt faults on any
addresses starting from the value of the P pointer and above.  So, use
(unsigned long int) -1 instead of VM_MAX_ADDRESS.

While at it, change the casts to (unsigned long int) and not just
(long int), since the type of struct hurd_signal_preemptor.{first,last}
is unsigned long int.

Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-ID: <20240323173301.151066-3-bugaevc@gmail.com>

hurd: Move internal functions to internal header

Move _hurd_self_sigstate (), _hurd_critical_section_lock (), and
_hurd_critical_section_unlock () inline implementations (that were
already guarded by #if defined _LIBC) to the internal version of the
header. While at it, add <tls.h> to the includes, and use
__LIBC_NO_TLS () unconditionally.

Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
Message-ID: <20240323173301.151066-2-bugaevc@gmail.com>

stdlib: Fix tst-makecontext2 log when swapcontext fails

The log incorrectly prints, setcontext failed. Update this to indicate
that actually swapcontext failed.

Signed-off-by: Stafford Horne <shorne@gmail.com>
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>

or1k: Add prctl wrapper to unwrap variadic args

On OpenRISC variadic functions and regular functions have different
calling conventions so this wrapper is needed to translate.  This
wrapper is copied from x86_64/x32.  I don't know the build system enough
to find a cleaner way to share the code between x86_64/x32 and or1k
(maybe Implies?), so I went with the straight copy.

This fixes test failures:

  misc/tst-prctl
  nptl/tst-setgetname

or1k: Only define fpu rouding and exceptions with hard-float

This test failure:

  math/test-fenv

If rounding mode and exception macros are defined then the fenv tests
run and always fail.  This patch adds an ifdef using the
__or1k_hard_float__ macro provided by gcc to avoid defining these fenv
macros when they cnnot be used.  This is similar to what is done in csky.

Note, I will post the or1k hard-float support soon. So, I prefer to
leave the hard-float bits here for now.

or1k: Update libm test ulps

To fix test failures:

FAIL: math/test-float-hypot
FAIL: math/test-float32-hypot

AArch64: Check kernel version for SVE ifuncs

Old Linux kernels disable SVE after every system call.  Calling the
SVE-optimized memcpy afterwards will then cause a trap to reenable SVE.
As a result, applications with a high use of syscalls may run slower with
the SVE memcpy.  This is true for kernels between 4.15.0 and before 6.2.0,
except for 5.14.0 which was patched.  Avoid this by checking the kernel
version and selecting the SVE ifunc on modern kernels.

Parse the kernel version reported by uname() into a 24-bit kernel.major.minor
value without calling any library functions.  If uname() is not supported or
if the version format is not recognized, assume the kernel is modern.

Tested-by: Florian Weimer <fweimer@redhat.com>
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>

powerpc: Placeholder and infrastructure/build support to add Power11 related changes.

The following three changes have been added to provide initial Power11 support.
    1. Add the directories to hold Power11 files.
    2. Add support to select Power11 libraries based on AT_PLATFORM.
    3. Let submachine=power11 be set automatically.

Reviewed-by: Florian Weimer <fweimer@redhat.com>
Reviewed-by: Peter Bergner <bergner@linux.ibm.com>

powerpc: Add HWCAP3/HWCAP4 data to TCB for Power Architecture.

This patch adds a new feature for powerpc. In order to get faster
access to the HWCAP3/HWCAP4 masks, similar to HWCAP/HWCAP2 (i.e. for
implementing __builtin_cpu_supports() in GCC) without the overhead of
reading them from the auxiliary vector, we now reserve space for them
in the TCB.

This is an ABI change for GLIBC 2.39.

Suggested-by: Peter Bergner <bergner@linux.ibm.com>
Reviewed-by: Peter Bergner <bergner@linux.ibm.com>

elf: Enable TLS descriptor tests on aarch64

The aarch64 uses 'trad' for traditional tls and 'desc' for tls
descriptors, but unlike other targets it defaults to 'desc'. The
gnutls2 configure check does not set aarch64 as an ABI that uses
TLS descriptors, which then disable somes stests.

Also rename the internal machinery fron gnu2 to tls descriptors.

Checked on aarch64-linux-gnu.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>

arm: Update _dl_tlsdesc_dynamic to preserve caller-saved registers (BZ 31372)

ARM _dl_tlsdesc_dynamic slow path has two issues:

  * The ip/r12 is defined by AAPCS as a scratch register, and gcc is
    used to save the stack pointer before on some function calls.  So it
    should also be saved/restored as well.  It fixes the tst-gnu2-tls2.

  * None of the possible VFP registers are saved/restored.  ARM has the
    additional complexity to have different VFP bank sizes (depending of
    VFP support by the chip).

The tst-gnu2-tls2 test is extended to check for VFP registers, although
only for hardfp builds.  Different than setcontext, _dl_tlsdesc_dynamic
does not have  HWCAP_ARM_IWMMXT (I don't have a way to properly test
it and it is almost a decade since newer hardware was released).

With this patch there is no need to mark tst-gnu2-tls2 as XFAIL.

Checked on arm-linux-gnueabihf.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>

Ignore undefined symbols for -mtls-dialect=gnu2

So it does not fail for arm config that defaults to -mtp=soft (which
issues a call to __aeabi_read_tp).
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>

Add tst-gnu2-tls2mod1 to test-internal-extras

That allows sysdeps/x86_64/tst-gnu2-tls2mod1.S to use internal headers.

Fixes: 717ebfa85c ("x86-64: Allocate state buffer space for RDI, RSI and RBX")

x86-64: Allocate state buffer space for RDI, RSI and RBX

_dl_tlsdesc_dynamic preserves RDI, RSI and RBX before realigning stack.
After realigning stack, it saves RCX, RDX, R8, R9, R10 and R11.  Define
TLSDESC_CALL_REGISTER_SAVE_AREA to allocate space for RDI, RSI and RBX
to avoid clobbering saved RDI, RSI and RBX values on stack by xsave to
STATE_SAVE_OFFSET(%rsp).

   +==================+<- stack frame start aligned at 8 or 16 bytes
   |                  |<- RDI saved in the red zone
   |                  |<- RSI saved in the red zone
   |                  |<- RBX saved in the red zone
   |                  |<- paddings for stack realignment of 64 bytes
   |------------------|<- xsave buffer end aligned at 64 bytes
   |                  |<-
   |                  |<-
   |                  |<-
   |------------------|<- xsave buffer start at STATE_SAVE_OFFSET(%rsp)
   |                  |<- 8-byte padding for 64-byte alignment
   |                  |<- 8-byte padding for 64-byte alignment
   |                  |<- R11
   |                  |<- R10
   |                  |<- R9
   |                  |<- R8
   |                  |<- RDX
   |                  |<- RCX
   +==================+<- RSP aligned at 64 bytes

Define TLSDESC_CALL_REGISTER_SAVE_AREA, the total register save area size
for all integer registers by adding 24 to STATE_SAVE_OFFSET since RDI, RSI
and RBX are saved onto stack without adjusting stack pointer first, using
the red-zone.  This fixes BZ #31501.
Reviewed-by: Sunil K Pandey <skpgkp2@gmail.com>

riscv: Update nofpu libm test ulps

Fix two test failures.

Reviewed-by: Florian Weimer <fweimer@redhat.com>

Add STATX_MNT_ID_UNIQUE from Linux 6.8 to bits/statx-generic.h

Linux 6.8 adds a new STATX_MNT_ID_UNIQUE constant. Add it to glibc's
bits/statx-generic.h.

Tested for x86_64.

linux: Use rseq area unconditionally in sched_getcpu (bug 31479)

Originally, nptl/descr.h included <sys/rseq.h>, but we removed that
in commit 2c6b4b272e6b4d07303af25709051c3e96288f2d ("nptl:
Unconditionally use a 32-byte rseq area").  After that, it was
not ensured that the RSEQ_SIG macro was defined during sched_getcpu.c
compilation that provided a definition.  This commit always checks
the rseq area for CPU number information before using the other
approaches.

This adds an unnecessary (but well-predictable) branch on
architectures which do not define RSEQ_SIG, but its cost is small
compared to the system call.  Most architectures that have vDSO
acceleration for getcpu also have rseq support.

Fixes: 2c6b4b272e6b4d07303af25709051c3e96288f2d
Fixes: 1d350aa06091211863e41169729cee1bca39f72f
Reviewed-by: Arjun Shankar <arjun@redhat.com>

aarch64: fix check for SVE support in assembler

Due to GCC bug 110901 -mcpu can override -march setting when compiling
asm code and thus a compiler targetting a specific cpu can fail the
configure check even when binutils gas supports SVE.

The workaround is that explicit .arch directive overrides both -mcpu
and -march, and since that's what the actual SVE memcpy uses the
configure check should use that too even if the GCC issue is fixed
independently.

Reviewed-by: Florian Weimer <fweimer@redhat.com>

Update kernel version to 6.8 in header constant tests

This patch updates the kernel version in the tests tst-mman-consts.py,
tst-mount-consts.py and tst-pidfd-consts.py to 6.8. (There are no new
constants covered by these tests in 6.8 that need any other header
changes.)

Tested with build-many-glibcs.py.

Update syscall lists for Linux 6.8

Linux 6.8 adds five new syscalls. Update syscall-names.list and
regenerate the arch-syscall.h headers with build-many-glibcs.py
update-syscalls.

Tested with build-many-glibcs.py.

Use Linux 6.8 in build-many-glibcs.py

This patch makes build-many-glibcs.py use Linux 6.8.

Tested with build-many-glibcs.py (host-libraries, compilers and glibcs
builds).

powerpc: Remove power8 strcasestr optimization

Similar to strstr (1e9a550ba4), power8 strcasestr does not show much
improvement compared to the generic implementation.  The geomean
on bench-strcasestr shows:

            __strcasestr_power8  __strcasestr_ppc
  power10                  1159              1120
  power9                   1640              1469
  power8                   1787              1904

The strcasestr uses the same 'trick' as power7 strstr to detect
potential quadradic behavior, which only adds overheads for input
that trigger quadradic behavior and it is really a hack.

Checked on powerpc64le-linux-gnu.
Reviewed-by: DJ Delorie <dj@redhat.com>

riscv: Fix alignment-ignorant memcpy implementation

The memcpy optimization (commit 587a1290a1af7bee6db) has a series
of mistakes:

  - The implementation is wrong: the chunk size calculation is wrong
    leading to invalid memory access.

  - It adds ifunc supports as default, so --disable-multi-arch does
    not work as expected for riscv.

  - It mixes Linux files (memcpy ifunc selection which requires the
    vDSO/syscall mechanism)  with generic support (the memcpy
    optimization itself).

  - There is no __libc_ifunc_impl_list, which makes testing only
    check the selected implementation instead of all supported
    by the system.

This patch also simplifies the required bits to enable ifunc: there
is no need to memcopy.h; nor to add Linux-specific files.

The __memcpy_noalignment tail handling now uses a branchless strategy
similar to aarch64 (overlap 32-bits copies for sizes 4..7 and byte
copies for size 1..3).

Checked on riscv64 and riscv32 by explicitly enabling the function
on __libc_ifunc_impl_list on qemu-system.

Changes from v1:
* Implement the memcpy in assembly to correctly handle RISCV
  strict-alignment.
Reviewed-by: Evan Green <evan@rivosinc.com>
Acked-by: Palmer Dabbelt <palmer@rivosinc.com>

linux/sigsetops: fix type confusion (bug 31468)

Each mask in the sigset array is an unsigned long, so fix __sigisemptyset
to use that instead of int. The __sigword function returns a simple array
index, so it can return int instead of unsigned long.

LoongArch: Correct {__ieee754, _}_scalb -> {__ieee754, _}_scalbf

duplocale: protect use of global locale (bug 23970)

Protect the global locale from being modified while we compute the size of
the locale category names. That allows the use of the global locale in a
single thread, while all other threads use the thread safe locale
functions.

x86-64: Simplify minimum ISA check ifdef conditional with if

Replace minimum ISA check ifdef conditional with if. Since
MINIMUM_X86_ISA_LEVEL and AVX_X86_ISA_LEVEL are compile time constants,
compiler will perform constant folding optimization, getting same
results.

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>

manual/tunables - Add entry for enable_secure tunable.

NEWS: Move enable_secure_tunable from 2.39 to 2.40.

riscv: Add and use alignment-ignorant memcpy

For CPU implementations that can perform unaligned accesses with little
or no performance penalty, create a memcpy implementation that does not
bother aligning buffers. It will use a block of integer registers, a
single integer register, and fall back to bytewise copy for the
remainder.

Signed-off-by: Evan Green <evan@rivosinc.com>
Reviewed-by: Palmer Dabbelt <palmer@rivosinc.com>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>

riscv: Add ifunc helper method to hwprobe.h

Add a little helper method so it's easier to fetch a single value from
the hwprobe function when used within an ifunc selector.

Signed-off-by: Evan Green <evan@rivosinc.com>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>

riscv: Enable multi-arg ifunc resolvers

RISC-V is apparently the first architecture to pass more than one
argument to ifunc resolvers. The helper macros in libc-symbols.h,
__ifunc_resolver(), __ifunc(), and __ifunc_hidden(), are incompatible
with this. These macros have an "arg" (non-final) parameter that
represents the parameter signature of the ifunc resolver. The result is
an inability to pass the required comma through in a single preprocessor
argument.

Rearrange the __ifunc_resolver() macro to be variadic, and pass the
types as those variable parameters. Move the guts of __ifunc() and
__ifunc_hidden() into new macros, __ifunc_args(), and
__ifunc_args_hidden(), that pass the variable arguments down through to
__ifunc_resolver(). Then redefine __ifunc() and __ifunc_hidden(), which
are used in a bunch of places, to simply shuffle the arguments down into
__ifunc_args[_hidden]. Finally, define a riscv-ifunc.h header, which
provides convenience macros to those looking to write ifunc selectors
that use both arguments.

Signed-off-by: Evan Green <evan@rivosinc.com>
Reviewed-by: Florian Weimer <fweimer@redhat.com>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>

riscv: Add __riscv_hwprobe pointer to ifunc calls

The new __riscv_hwprobe() function is designed to be used by ifunc
selector functions. This presents a challenge for applications and
libraries, as ifunc selectors are invoked before all relocations have
been performed, so an external call to __riscv_hwprobe() from an ifunc
selector won't work. To address this, pass a pointer to the
__riscv_hwprobe() function into ifunc selectors as the second
argument (alongside dl_hwcap, which was already being passed).

Include a typedef as well for convenience, so that ifunc users don't
have to go through contortions to call this routine. Users will need to
remember to check the second argument for NULL, to account for older
glibcs that don't pass the function.

Signed-off-by: Evan Green <evan@rivosinc.com>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>

riscv: Add hwprobe vdso call support

The new riscv_hwprobe syscall also comes with a vDSO for faster answers
to your most common questions. Call in today to speak with a kernel
representative near you!

Signed-off-by: Evan Green <evan@rivosinc.com>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>

linux: Introduce INTERNAL_VSYSCALL

Add an INTERNAL_VSYSCALL() macro that makes a vDSO call, falling back to
a regular syscall, but without setting errno. Instead, the return value
is plumbed straight out of the macro.

Signed-off-by: Evan Green <evan@rivosinc.com>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>

riscv: Add Linux hwprobe syscall support

Add awareness and a thin wrapper function around a new Linux system call
that allows callers to get architecture and microarchitecture
information about the CPUs from the kernel. This can be used to
do things like dynamically choose a memcpy implementation.

Signed-off-by: Evan Green <evan@rivosinc.com>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>

rtld: Add glibc.rtld.enable_secure tunable.

Add a tunable for setting __libc_enable_secure to 1.  Do not set
__libc_enable_secure to 0 if the tunable is set to 0.  Ignore all
tunables if glib.rtld.enable_secure is set.  One use-case for this
addition is to enable testing code paths that depend on
__libc_enable_secure being set without the need to use setxid binaries.

Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>

x86-64: Update _dl_tlsdesc_dynamic to preserve AMX registers

_dl_tlsdesc_dynamic should also preserve AMX registers which are
caller-saved.  Add X86_XSTATE_TILECFG_ID and X86_XSTATE_TILEDATA_ID
to x86-64 TLSDESC_CALL_STATE_SAVE_MASK.  Compute the AMX state size
and save it in xsave_state_full_size which is only used by
_dl_tlsdesc_dynamic_xsave and _dl_tlsdesc_dynamic_xsavec.  This fixes
the AMX part of BZ #31372.  Tested on AMX processor.

AMX test is enabled only for compilers with the fix for

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114098

GCC 14 and GCC 11/12/13 branches have the bug fix.
Reviewed-by: Sunil K Pandey <skpgkp2@gmail.com>