Public Git Hosting - official-gcc.git/log

c++: Fix constexpr evaluation of parameters passed by invisible reference [PR111284]

My r9-6136 changes to make a copy of constexpr function bodies before
genericization modifies it broke the constant evaluation of non-POD
arguments passed by value.
In the callers such arguments are passed as reference to usually a
TARGET_EXPR, but on the callee side until genericization they are just
direct uses of a PARM_DECL with some class type.
In cxx_bind_parameters_in_call I've used convert_from_reference to
pretend it is passed by value and then cxx_eval_constant_expression
is called there and evaluates that as an rvalue, followed by
adjust_temp_type if the types don't match exactly (e.g. const Foo
argument and passing to it reference to Foo TARGET_EXPR).

The reason this doesn't work is that when the TARGET_EXPR in the caller
is constant initialized, 'this' for it is the address of the TARGET_EXPR_SLOT,
but if the code later on pretends the PARM_DECL is just initialized to the
rvalue of the constant evaluation of the TARGET_EXPR, it is as if there
is a bitwise copy of the TARGET_EXPR to the callee, so 'this' in the callee
is then the address of the PARM_DECL in the callee.
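
A runtime analogue of the invariant at stake, as a minimal sketch and not
the PR testcase itself. The names SelfTracker, sees_own_address and
bitwise_like_copy_is_valid are hypothetical. The class records its own
address in every constructor; because it has a user-provided copy
constructor it is the kind of type that is passed by invisible reference,
and the callee has to observe 'this' equal to the address of its own
parameter object.

```cpp
// Records the address it was constructed at.
struct SelfTracker {
  const SelfTracker *self;
  SelfTracker() : self(this) {}
  // User-provided copy constructor: the type is not trivially copyable,
  // so a bitwise copy would leave `self` pointing at the source object.
  SelfTracker(const SelfTracker &) : self(this) {}
  bool valid() const { return self == this; }
};

// By-value parameter: `t` is a distinct object inside the callee.
bool sees_own_address(SelfTracker t) { return t.valid(); }

// Imitates what a bitwise copy would leave behind: the stored pointer
// still refers to the original object, so the invariant no longer holds.
bool bitwise_like_copy_is_valid() {
  SelfTracker original;
  SelfTracker copy;
  copy.self = original.self;
  return copy.valid();
}
```

Calling sees_own_address (SelfTracker ()) yields true, whereas
bitwise_like_copy_is_valid () yields false; the constant evaluator has to
model the former behaviour.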

The following patch attempts to fix that by constexpr evaluation of such
arguments in the caller as an lvalue instead of rvalue, and on the callee
side when seeing such a PARM_DECL, if we want an lvalue, lookup the value
(lvalue) saved in ctx->globals (if any), and if wanting an rvalue,
recursing with vc_prvalue on the looked up value (because it is there
as an lvalue, not rvalue).

adjust_temp_type doesn't work for lvalues of non-scalarish types, for
such types it relies on changing the type of a CONSTRUCTOR, but on the
other side we know what we pass to the argument is addressable, so
the patch on type mismatch takes address of the argument value, casts
to reference to the desired type and dereferences it.

2024-04-25 Jakub Jelinek <jakub@redhat.com>

PR c++/111284
* constexpr.cc (cxx_bind_parameters_in_call): For PARM_DECLs with
TREE_ADDRESSABLE types use vc_glvalue rather than vc_prvalue for
cxx_eval_constant_expression and if it doesn't have the same
type as it should, cast the reference type to reference to type
before convert_from_reference and instead of adjust_temp_type
take address of the arg, cast to reference to type and then
convert_from_reference.
(cxx_eval_constant_expression) <case PARM_DECL>: For lval case
on parameters with TREE_ADDRESSABLE types lookup result in
ctx->globals if possible. Otherwise if lookup in ctx->globals
was successful for parameter with TREE_ADDRESSABLE type,
recurse with vc_prvalue on the returned value.

* g++.dg/cpp1z/constexpr-111284.C: New test.
* g++.dg/cpp1y/constexpr-lifetime7.C: Expect one error on a different
line.

libgcc: Don't use weakrefs for glibc 2.34

glibc 2.34 and later doesn't have separate libpthread (libpthread.so.0 is a
dummy shared library with just some symbol versions for compatibility, but
all the pthread_* APIs are in libc.so.6).
So, we don't need to do the .weakref dances to check whether a program
has been linked with -lpthread or not, in dynamically linked apps those
will be always true anyway.
In -static linking, this fixes various issues people had when only linking
some parts of libpthread.a and getting weird crashes. A hack for that was
what e.g. some Fedora glibcs used, where libpthread.a was a library
containing just one giant *.o file which had all the normal libpthread.a
*.o files linked with -r together.

libstdc++-v3 actually does something like this already since r10-10928,
the following patch is meant to fix it even for libgfortran, libobjc and
whatever else uses gthr.h.

2024-04-25 Jakub Jelinek <jakub@redhat.com>

* gthr.h (GTHREAD_USE_WEAK): Redefine to 0 for GLIBC 2.34 or later.
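
A minimal sketch of this kind of version gate, assuming a glibc-style
<features.h> that provides __GLIBC_PREREQ. DEMO_USE_WEAK is a hypothetical
stand-in for a GTHREAD_USE_WEAK-like switch, and its initial value of 1 is
an assumption of the sketch, not what gthr.h necessarily does.

```cpp
#include <cstdlib>  // on glibc this pulls in <features.h>

// Hypothetical switch, assumed to default to "use weak references".
#define DEMO_USE_WEAK 1

// The checks are nested because __GLIBC_PREREQ(...) cannot be expanded
// inside #if when the macro is not defined at all.
#if defined(__GLIBC__) && defined(__GLIBC_PREREQ)
# if __GLIBC_PREREQ(2, 34)
// The pthread_* APIs live in libc.so.6, so no weak references are needed.
#  undef DEMO_USE_WEAK
#  define DEMO_USE_WEAK 0
# endif
#endif

constexpr bool demo_uses_weak_references() { return DEMO_USE_WEAK != 0; }
```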

c++: Retry the aliasing of base/complete cdtor optimization at import_export_decl time [PR113208]

When expand_or_defer_fn is called at_eof time, it calls import_export_decl
and then maybe_clone_body, which uses DECL_ONE_ONLY and comdat name in a
couple of places to try to optimize cdtors which are known to have the
same body by making the complete cdtor an alias to base cdtor (and in
that case also uses *[CD]5* as comdat group name instead of the normal
comdat group names specific to each mangled name).
Now, this optimization depends on DECL_ONE_ONLY and DECL_INTERFACE_KNOWN,
maybe_clone_body and can_alias_cdtor use:
      if (DECL_ONE_ONLY (fn))
        cgraph_node::get_create (clone)->set_comdat_group (cxx_comdat_group (clone));
...
  bool can_alias = can_alias_cdtor (fn);
...
      /* Tell cgraph if both ctors or both dtors are known to have
         the same body.  */
      if (can_alias
          && fns[0]
          && idx == 1
          && cgraph_node::get_create (fns[0])->create_same_body_alias
               (clone, fns[0]))
        {
          alias = true;
          if (DECL_ONE_ONLY (fns[0]))
            {
              /* For comdat base and complete cdtors put them
                 into the same, *[CD]5* comdat group instead of
                 *[CD][12]*.  */
              comdat_group = cdtor_comdat_group (fns[1], fns[0]);
              cgraph_node::get_create (fns[0])->set_comdat_group (comdat_group);
              if (symtab_node::get (clone)->same_comdat_group)
                symtab_node::get (clone)->remove_from_same_comdat_group ();
              symtab_node::get (clone)->add_to_same_comdat_group
                (symtab_node::get (fns[0]));
            }
        }
and
  /* Don't use aliases for weak/linkonce definitions unless we can put both
     symbols in the same COMDAT group.  */
  return (DECL_INTERFACE_KNOWN (fn)
          && (SUPPORTS_ONE_ONLY || !DECL_WEAK (fn))
          && (!DECL_ONE_ONLY (fn)
              || (HAVE_COMDAT_GROUP && DECL_WEAK (fn))));
The following testcase regressed with Marek's r14-5979 change,
when pr113208_0.C is compiled where the ctor is marked constexpr,
we no longer perform this optimization, where
_ZN6vectorI12QualityValueEC2ERKS1_ was emitted in the
_ZN6vectorI12QualityValueEC5ERKS1_ comdat group and
_ZN6vectorI12QualityValueEC1ERKS1_ was made an alias to it,
instead we emit _ZN6vectorI12QualityValueEC2ERKS1_ in
_ZN6vectorI12QualityValueEC2ERKS1_ comdat group and the same
content _ZN6vectorI12QualityValueEC1ERKS1_ as separate symbol in
_ZN6vectorI12QualityValueEC1ERKS1_ comdat group.
Now, the linker seems to somehow cope with that, even though it
probably keeps both copies of the ctor, but seems LTO can't cope
with that and Honza doesn't know what it should do in that case
(linker decides that the prevailing symbol is
_ZN6vectorI12QualityValueEC2ERKS1_ (from the
_ZN6vectorI12QualityValueEC2ERKS1_ comdat group) and
_ZN6vectorI12QualityValueEC1ERKS1_ alias (from the other TU,
from _ZN6vectorI12QualityValueEC5ERKS1_ comdat group)).

Note, the case where some constructor is marked constexpr in one
TU and not in another one happens pretty often in libstdc++ when
one mixes -std= flags used to compile different compilation units.
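
A hedged illustration of how the same constructor can be constexpr in one
TU and not in another. The vector and QualityValue names come from the
mangled symbols above, which demangle to the copy constructor of
vector<QualityValue>; the class bodies, the DEMO_CONSTEXPR macro and
copy_and_read are assumptions for illustration, not the actual testcase.

```cpp
// Hypothetical macro: constexpr only for C++20 and later, similar to
// library headers whose constexpr-ness depends on the -std= flag.
#if __cplusplus >= 202002L
# define DEMO_CONSTEXPR constexpr
#else
# define DEMO_CONSTEXPR
#endif

struct QualityValue { int value; };

template <typename T>
struct vector {
  T elem;
  explicit vector(const T &e) : elem(e) {}
  // No virtual bases, so the base-object (C2) and complete-object (C1)
  // variants of this constructor have identical bodies.
  DEMO_CONSTEXPR vector(const vector &other) : elem(other.elem) {}
};

int copy_and_read(int v) {
  vector<QualityValue> a(QualityValue{v});
  vector<QualityValue> b(a);  // vector<QualityValue>::vector(const vector &)
  return b.elem.value;
}
```

Compiling one TU with an older -std= and another with -std=c++20 gives
the two flavours of the constructor that the paragraph above describes.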

The reason the optimization doesn't trigger when the constructor is
constexpr is that expand_or_defer_fn is called in that case much earlier
than when it is not constexpr; in the former case it is called when we
try to constant evaluate that constructor.  But DECL_INTERFACE_KNOWN
is false in that case and comdat_linkage hasn't been called either
(I think it is desirable, because comdat group is stored in the cgraph
node and am not sure it is a good idea to create cgraph nodes for
something that might not be needed later on at all), so maybe_clone_body
clones the bodies, but doesn't make them aliases.

The following patch is an attempt to redo this optimization when
import_export_decl is called at_eof time on the base/complete cdtor
(or deleting dtor).  It will not do anything if maybe_clone_body
hasn't been called yet (the TREE_ASM_WRITTEN check on the
DECL_MAYBE_IN_CHARGE_CDTOR_P), or when one or both of the base/complete
cdtors have been lowered already, or when maybe_clone_body called
maybe_thunk_body and it was successful.  Otherwise retries the
can_alias_cdtor check and makes the complete cdtor alias to the
base cdtor with adjustments to the comdat group.

2024-04-25  Jakub Jelinek  <jakub@redhat.com>

PR lto/113208
* cp-tree.h (maybe_optimize_cdtor): Declare.
* decl2.cc (import_export_decl): Call it for cloned cdtors.
* optimize.cc (maybe_optimize_cdtor): New function.

* g++.dg/abi/comdat2.C: New test.
* g++.dg/abi/comdat5.C: New test.
* g++.dg/lto/pr113208_0.C: New test.
* g++.dg/lto/pr113208_1.C: New file.
* g++.dg/lto/pr113208.h: New file.

bpf: avoid issues with CO-RE and -gtoggle

Compiling a BPF program with CO-RE relocations (and BTF) while also
passing -gtoggle led to an inconsistent state where CO-RE support was
enabled but BTF would not be generated, and this was not caught by the
existing option parsing. This led to an ICE when generating the CO-RE
relocation info, since BTF is required for CO-RE.

Update bpf_option_override to avoid this case, and add a few tests for
the interactions of these options.

gcc/
* config/bpf/bpf.cc (bpf_option_override): Improve handling of CO-RE
options to avoid issues with -gtoggle.

gcc/testsuite/
* gcc.target/bpf/core-options-1.c: New test.
* gcc.target/bpf/core-options-2.c: Likewise.
* gcc.target/bpf/core-options-3.c: Likewise.

openmp: Copy DECL_LANG_SPECIFIC and DECL_LANG_FLAG_? to tree-nested decl copy [PR114825]

tree-nested.cc creates in 2 spots artificial VAR_DECLs, one of them is used
both for debug info and OpenMP/OpenACC lowering purposes, the other solely for
OpenMP/OpenACC lowering purposes.
When the decls are used in OpenMP/OpenACC lowering, the OMP langhooks (mostly
Fortran, C just a little and C++ doesn't have nested functions) then inspect
the flags on the vars and based on that decide how to lower the corresponding
clauses.

Unfortunately we weren't copying DECL_LANG_SPECIFIC and DECL_LANG_FLAG_?, so
the langhooks made decisions on the default flags on those instead.
As the original decl isn't necessarily a VAR_DECL, could be e.g. PARM_DECL,
using copy_node wouldn't work properly, so this patch just copies those
flags in addition to other flags it was copying already. And I've removed
code duplication by introducing a helper function which does copying common
to both uses.

2024-04-25 Jakub Jelinek <jakub@redhat.com>

PR fortran/114825
* tree-nested.cc (get_debug_decl): New function.
(get_nonlocal_debug_decl): Use it.
(get_local_debug_decl): Likewise.

* gfortran.dg/gomp/pr114825.f90: New test.

libstdc++: Rename man pages to use '::' instead of '_'

The Doxygen-generated man pages for some new types need to be renamed to
use '::' instead of '_' in the filenames.

libstdc++-v3/ChangeLog:

* scripts/run_doxygen: Rename man pages for nested types.

libstdc++: Fix typo in Doxygen comment

libstdc++-v3/ChangeLog:

* include/std/chrono (tzdb_list): Fix typo in Doxygen comment.

libstdc++: Fix run_doxygen for Doxygen 1.10 man page format

Doxygen switched from \fC to \fR in its man page output:
https://github.com/doxygen/doxygen/pull/10497

This breaks our script, which expects \fC, so change the regular expression
to work with either style.
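
The idea of accepting either escape can be sketched with a character
class. This is only an illustration: the real script uses sed, and the
helper name strip_leading_font_escape and the choice to strip a leading
escape are assumptions.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: removes a leading "\fC" or "\fR" font escape, so
// both the old and the new Doxygen man page output are handled.
std::string strip_leading_font_escape(const std::string &line) {
  // In the raw string, \\ matches one literal backslash and [CR] accepts
  // either font letter.
  static const std::regex font_escape(R"(^\\f[CR])");
  return std::regex_replace(line, font_escape, "");
}
```

Lines without a leading escape are returned unchanged.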

libstdc++-v3/ChangeLog:

* scripts/run_doxygen: Adjust sed pattern to match '\fR' for
new man output that Doxygen 1.10 generates.

libstdc++: Update Doxygen config for new headers

libstdc++-v3/ChangeLog:

* doc/doxygen/stdheader.cc (init_map): Add missing headers.
* doc/doxygen/user.cfg.in (EXCLUDE): Exclude generated files for
std::format and std::text_encoding.

libstdc++: Add comment to #include in <variant>

It's not obvious why <variant> needs <bits/parse_numbers.h> so add a
comment to it.

libstdc++-v3/ChangeLog:

* include/std/variant: Add comment to #include.

PR modula2/114836 Avoid concatenation of error strings to aid error locale translation

This patch avoids a concatenation of error strings making locale
translation of the error message easier.

gcc/m2/ChangeLog:

PR modula2/114836
* gm2-compiler/M2Range.mod (FoldTypeAssign): Avoid error
string concatenation.

Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>

bpf: use pseudo-C assembly syntax by default

At this point the kernel headers that almost all BPF programs use
contain pseudo-C inline assembly and having the GNU toolchain using
the conventional assembly syntax by default would force users to
specify the command-line option explicitly almost all of the time,
which is very inconvenient.

This patch changes GCC in order to recognize and generate the pseudo-C
assembly syntax of BPF by default. The ASM_SPEC is adapted
accordingly, and in a way that the current release of the BPF
assembler (which still expects conventional assembler syntax by
default) does the right thing.

Tested in bpf-unknown-none-bpf target and x86_64-linux-gnu host.
No regressions.

gcc/ChangeLog

* config/bpf/bpf.opt: Use ASM_PSEUDOC for the default value of
-masm.
* config/bpf/bpf.h (ASM_SPEC): Adapt accordingly.
* doc/invoke.texi (eBPF Options): Update.

gcc/testsuite/ChangeLog

* gcc.target/bpf/alu-1.c: Specify conventional asm dialect.
* gcc.target/bpf/xbpf-indirect-call-1.c: Likewise.
* gcc.target/bpf/sync-fetch-and-add.c: Likewise.
* gcc.target/bpf/smov-2.c: Likewise.
* gcc.target/bpf/smov-1.c: Likewise.
* gcc.target/bpf/smod-1.c: Likewise.
* gcc.target/bpf/sload-1.c: Likewise.
* gcc.target/bpf/sdiv-1.c: Likewise.
* gcc.target/bpf/nop-1.c: Likewise.
* gcc.target/bpf/neg-1.c: Likewise.
* gcc.target/bpf/ldxdw.c: Likewise.
* gcc.target/bpf/jmp-1.c: Likewise.
* gcc.target/bpf/inline-memops-threshold-1.c: Likewise.
* gcc.target/bpf/float-1.c: Likewise.
* gcc.target/bpf/double-2.c: Likewise.
* gcc.target/bpf/double-1.c: Likewise.
* gcc.target/bpf/core-builtin-type-id.c: Likewise.
* gcc.target/bpf/core-builtin-type-based.c: Likewise.
* gcc.target/bpf/core-builtin-fieldinfo-size-1.c: Likewise.
* gcc.target/bpf/core-builtin-fieldinfo-sign-2.c: Likewise.
* gcc.target/bpf/core-builtin-fieldinfo-sign-1.c: Likewise.
* gcc.target/bpf/core-builtin-fieldinfo-rshift-2.c: Likewise.
* gcc.target/bpf/core-builtin-fieldinfo-rshift-1.c: Likewise.
* gcc.target/bpf/core-builtin-fieldinfo-offset-1.c: Likewise.
* gcc.target/bpf/core-builtin-fieldinfo-lshift-2.c: Likewise.
* gcc.target/bpf/core-builtin-fieldinfo-lshift-1-le.c: Likewise.
* gcc.target/bpf/core-builtin-fieldinfo-lshift-1-be.c: Likewise.
* gcc.target/bpf/core-builtin-fieldinfo-existence-1.c: Likewise.
* gcc.target/bpf/core-builtin-fieldinfo-errors-2.c: Likewise.
* gcc.target/bpf/core-builtin-fieldinfo-errors-1.c: Likewise.
* gcc.target/bpf/core-builtin-fieldinfo-const-elimination.c:
Likewise.
* gcc.target/bpf/core-builtin-exprlist-4.c: Likewise.
* gcc.target/bpf/core-builtin-exprlist-3.c: Likewise.
* gcc.target/bpf/core-builtin-exprlist-2.c: Likewise.
* gcc.target/bpf/core-builtin-exprlist-1.c: Likewise.
* gcc.target/bpf/core-builtin-enumvalue-opt.c: Likewise.
* gcc.target/bpf/core-builtin-enumvalue-errors.c: Likewise.
* gcc.target/bpf/core-builtin-enumvalue.c: Likewise.
* gcc.target/bpf/core-builtin-3.c: Likewise.
* gcc.target/bpf/core-builtin-2.c: Likewise.
* gcc.target/bpf/core-builtin-1.c: Likewise.
* gcc.target/bpf/core-attr-struct-as-array.c: Likewise.
* gcc.target/bpf/core-attr-6.c: Likewise.
* gcc.target/bpf/core-attr-5.c: Likewise.
* gcc.target/bpf/core-attr-4.c: Likewise.
* gcc.target/bpf/core-attr-3.c: Likewise.
* gcc.target/bpf/core-attr-2.c: Likewise.
* gcc.target/bpf/core-attr-1.c: Likewise.
* gcc.target/bpf/builtin-load.c: Likewise.
* gcc.target/bpf/btfext-funcinfo-nocore.c: Likewise.
* gcc.target/bpf/btfext-funcinfo.c: Likewise.
* gcc.target/bpf/bswap-1.c: Likewise.
* gcc.target/bpf/bswap-2.c: Likewise.
* gcc.target/bpf/attr-kernel-helper.c: Likewise.
* gcc.target/bpf/atomic-xchg-2.c: Likewise.
* gcc.target/bpf/atomic-xchg-1.c: Likewise.
* gcc.target/bpf/atomic-op-3.c: Likewise.
* gcc.target/bpf/atomic-op-2.c: Likewise.
* gcc.target/bpf/atomic-op-1.c: Likewise.
* gcc.target/bpf/atomic-fetch-op-3.c: Likewise.
* gcc.target/bpf/atomic-fetch-op-2.c: Likewise.
* gcc.target/bpf/atomic-fetch-op-1.c: Likewise.
* gcc.target/bpf/atomic-cmpxchg-2.c: Likewise.
* gcc.target/bpf/atomic-cmpxchg-1.c: Likewise.
* gcc.target/bpf/alu-2.c: Likewise.

arm: Zero/Sign extends for CMSE security

Co-Authored by: Andre Simoes Dias Vieira <Andre.SimoesDiasVieira@arm.com>

This patch makes the following changes:

1) When calling a secure function from non-secure code, any arguments
smaller than 32 bits that are passed in registers are zero- or sign-extended.
2) After a non-secure function returns into secure code, any return value
smaller than 32 bits that is passed in a register is zero- or sign-extended.

This patch addresses CVE-2024-0151.
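
What the extension achieves can be modelled in plain C++ on a 32-bit
register value whose upper bits are untrusted. The helpers zero_extend8
and sign_extend8 are hypothetical; the patch itself emits the extension
in the compiler-generated code rather than calling anything like this.

```cpp
#include <cstdint>

// Zero-extend the low 8 bits of a register value (an unsigned char
// argument): whatever the other side left in the upper 24 bits is dropped.
constexpr std::uint32_t zero_extend8(std::uint32_t reg) {
  return reg & 0xFFu;
}

// Sign-extend the low 8 bits of a register value (a signed char argument).
// XOR-ing with the sign bit and subtracting it again replicates bit 7
// into the upper bits without relying on narrowing conversions.
constexpr std::int32_t sign_extend8(std::uint32_t reg) {
  return static_cast<std::int32_t>((reg & 0xFFu) ^ 0x80u) - 0x80;
}
```

For example, a register holding 0xDEADBE05 for an unsigned char argument
is seen as 5 after zero extension, so it can safely be used as an index.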

gcc/ChangeLog:
PR target/114837
* config/arm/arm.cc (cmse_nonsecure_call_inline_register_clear):
Add zero/sign extend.
(arm_expand_prologue): Add zero/sign extend.

gcc/testsuite/ChangeLog:

* gcc.target/arm/cmse/extend-param.c: New test.
* gcc.target/arm/cmse/extend-return.c: New test.

modula2: issue the parameter incompatibility error message based on dialect

This tiny patch improves the parameter incompatibility error message by
having a different message for the dialect chosen mentioning the specific
violation. PIM uses assignment rules for pass by value and expression
rules for pass by reference. ISO uses expression type checking for
pass by value and pass by reference.

gcc/m2/ChangeLog:

* gm2-compiler/M2FileName.def (CalculateFileName): Remove
quoted string in comment.
* gm2-compiler/M2Range.mod (FoldTypeParam): Generate dialect
specific parameter incompatibility error message.

Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>

tree-optimization/114792 - order loops to unloop in CH

When we use unloop_loops we have to make sure to have loops ordered
inner to outer, as otherwise we can wreck inner loop structure where
unlooping relies on that being intact. The following re-sorts the
vector of loops to unloop after copy-header, as that adds to the
vector in two places and in the wrong order.
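
The inner-to-outer requirement can be sketched as follows. The Loop
struct and unloop_order are hypothetical, and sorting by nesting depth is
an assumption made for illustration; the comparison used by
ch_order_loops may well differ.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical stand-in for a loop: an id plus its nesting depth
// (1 = outermost).
struct Loop {
  int id;
  int depth;
};

// Returns loop ids ordered so that every inner loop precedes the loops
// that enclose it.  An inner loop is always deeper than its enclosing
// loops, so sorting by decreasing depth is sufficient.
std::vector<int> unloop_order(std::vector<Loop> loops) {
  std::stable_sort(loops.begin(), loops.end(),
                   [](const Loop &a, const Loop &b) {
                     return a.depth > b.depth;
                   });
  std::vector<int> ids;
  for (const Loop &l : loops)
    ids.push_back(l.id);
  return ids;
}
```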

PR tree-optimization/114792
* tree-ssa-loop-ch.cc (ch_order_loops): New function.
(ch_base::copy_headers): Sort loops to unloop inner-to-outer.

* gcc.dg/torture/pr114792.c: New testcase.

Fix calling convention incompatibility with vendor compiler

For the 20th anniversary of https://gcc.gnu.org/gcc-3.4/sparc-abi.html,
a new calling convention incompatibility with the vendor compiler (and
the ABI) has been discovered in 64-bit mode, affecting small structures
containing arrays of floating-point components. The decision has been
made to fix it on Solaris only at this point.

gcc/
PR target/114416
* config/sparc/sparc.h (SUN_V9_ABI_COMPATIBILITY): New macro.
* config/sparc/sol2.h (SUN_V9_ABI_COMPATIBILITY): Redefine it.
* config/sparc/sparc.cc (fp_type_for_abi): New predicate.
(traverse_record_type): Use it to spot floating-point types.
(compute_fp_layout): Also deal with array types.

gcc/testsuite/
* gcc.target/sparc/small-struct-1.c: New test.
* gcc.target/sparc/pr105573.c: Rename to...
* gcc.target/sparc/20230425-1.c: ...this.
* gcc.target/sparc/pr109541.c: Rename to...
* gcc.target/sparc/20230607-1.c: ...this.

RISC-V: Add test cases for insn does not satisfy its constraints [PR114714]

We have one ICE when RVV register overlap is enabled. We reverted this
feature as it is in stage 4 and there is not much time to figure out a
better solution for this. Thus, for now add the related test cases which
will trigger the ICE when register overlap is enabled.

This will gate the RVV register overlap support in GCC-15.

PR target/114714

gcc/testsuite/ChangeLog:

* g++.target/riscv/rvv/base/pr114714-1.C: New test.
* g++.target/riscv/rvv/base/pr114714-2.C: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>
Co-Authored-by: Kito Cheng <kito.cheng@sifive.com>

RISC-V: Add early clobber to the dest of vwsll

We missed the existing early clobber for the dest operand of the vwsll
pattern when resolving the conflicts of the register overlap revert. Thus
add it back to the pattern. Unfortunately, we have no test to cover
this part and will improve this after GCC-15 opens.

The tests below passed for this patch:
* The rv64gcv full regression test with isl build.

gcc/ChangeLog:

* config/riscv/vector-crypto.md: Add early clobber to the
dest operand of vwsll.

Signed-off-by: Pan Li <pan2.li@intel.com>

Fortran: Fix ICE in gfc_trans_create_temp_array from bad type [PR93678]

2024-04-25 Paul Thomas <pault@gcc.gnu.org>

gcc/fortran
PR fortran/93678
* trans-expr.cc (gfc_conv_procedure_call): Use the interface,
where possible, to obtain the type of character procedure
pointers of class entities.

gcc/testsuite/
PR fortran/93678
* gfortran.dg/pr93678.f90: New test.

Fortran: Generate new charlens for shared symbol typespecs [PR89462]

2024-04-25 Paul Thomas <pault@gcc.gnu.org>
Jakub Jelinek <jakub@gcc.gnu.org>

gcc/fortran
PR fortran/89462
* decl.cc (build_sym): Add an extra argument 'elem'. If 'elem'
is greater than 1, gfc_new_charlen is called to generate a new
charlen, registered in the symbol namespace.
(variable_decl, enumerator_decl): Set the new argument in the
calls to build_sym.

gcc/testsuite/
PR fortran/89462
* gfortran.dg/pr89462.f90: New test.

rs6000: Use bcdsub. instead of bcdadd. for bcd invalid number checking

bcdadd. might cause overflow, which also sets the overflow/invalid bit.
bcdsub. doesn't have this issue when subtracting two identical BCD numbers.
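
The arithmetic behind this can be shown with a toy model of a
fixed-width decimal register. The 4-digit width and both helper names
are assumptions for illustration; real BCD operands are wider and the
instructions report the condition in a flag.

```cpp
#include <cstdint>

// Largest magnitude representable in the toy 4-digit decimal register.
constexpr std::int64_t kMaxMagnitude = 9999;

// x + x can exceed the register width for large |x| ...
constexpr bool add_self_overflows(std::int64_t x) {
  return x + x > kMaxMagnitude || x + x < -kMaxMagnitude;
}

// ... whereas x - x is always 0 and therefore always fits.
constexpr bool sub_self_overflows(std::int64_t x) {
  return x - x > kMaxMagnitude || x - x < -kMaxMagnitude;
}
```

So with subtraction, a set overflow/invalid bit can only mean that the
operand itself was invalid.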

gcc/
* config/rs6000/altivec.md (*bcdinvalid_<mode>): Replace bcdadd
with bcdsub.
(bcdinvalid_<mode>): Likewise.

gcc/testsuite/
* gcc.target/powerpc/bcd-4.c: Adjust the number of bcdadd and
bcdsub.

RISC-V: Add xfail test case for highpart register overlap of vwcvt

We reverted the patch below for register group overlap, so add the
related insn test and mark it as xfail. We will remove the xfail
after we support the register overlap in GCC-15.

bdad036da32 RISC-V: Support highpart register overlap for vwcvt

The test suites below passed for this patch:
* The rv64gcv full regression test with isl build.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/pr112431-1.c: New test.
* gcc.target/riscv/rvv/base/pr112431-2.c: New test.
* gcc.target/riscv/rvv/base/pr112431-3.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>

c++/modules testsuite: restrict expensive pr99023 test

The pr99023 testcase uses --param=ggc-min-expand=0 which forces a GC
during every collection point and consequently is very slow to run,
and ends up being the main bottleneck of the modules.exp testsuite.

So this patch restricts this test to run once, in C++20 mode, instead of
multiple times (C++17, C++20 and C++23 mode by default). After this
patch the modules.exp testsuite finishes in 3m instead of 3m40s with -j8
on my machine.

gcc/testsuite/ChangeLog:

* g++.dg/modules/pr99023_a.X: Run only in C++20 mode.
* g++.dg/modules/pr99023_b.X: Likewise.

Reviewed-by: Jason Merrill <jason@redhat.com>

c++: constexpr union member access folding [PR114709]

The object/offset canonicalization performed in cxx_fold_indirect_ref
is undesirable for union member accesses because it loses information
about the member being accessed which we may later need to diagnose an
inactive-member access. So this patch restricts the canonicalization
accordingly.
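
For context, a minimal sketch of the rule being diagnosed, using a
hypothetical union IntOrFloat: during constant evaluation only the active
member of a union may be read. This is not the testcase from the PR,
which concerns accesses that reach the member through an indirect
reference.

```cpp
union IntOrFloat {
  int i;
  float f;
};

constexpr int read_active_member() {
  IntOrFloat u{};  // empty-brace initialization makes the first member, i, active
  u.i = 42;
  return u.i;      // OK: reads the active member
}

// Reading the other member is not a constant expression and must be
// diagnosed when evaluated at compile time:
//
//   constexpr float read_inactive_member() {
//     IntOrFloat u{};
//     return u.f;   // f is not the active member
//   }

static_assert(read_active_member() == 42, "reading the active member is fine");
```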

PR c++/114709

gcc/cp/ChangeLog:

* constexpr.cc (cxx_fold_indirect_ref): Restrict object/offset
canonicalization to RECORD_TYPE member accesses.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/constexpr-union8.C: New test.

Reviewed-by: Jason Merrill <jason@redhat.com>

v2: DOCUMENTATION_ROOT_URL vs. release branches [PR114738]

This patch moves the documentation root URL infix for release branches
from get_option_url/make_doc_url to configure, such that only the default
changes and when users specify a custom documentation root URL, they don't
have to add gcc-MAJOR.MINOR.0 subdirectories for release branches.

Tested by checking
../configure --disable-bootstrap --enable-languages=c --disable-multilib
built trunk on
void
foo (int x)
{
  __builtin_printf ("%ld\n", x);
}
testcase and looking for the URL in there, then repeating that after
changing gcc/BASE-VER to 14.1.0 and again after changing it to 14.1.1,
plus normal bootstrap/regtest.
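
The resulting default can be sketched as below. The helper
default_doc_root is hypothetical (the real logic lives in configure), and
the sketch assumes GCC's version scheme in which trunk is MAJOR.0.x and
release branches are MAJOR.MINOR.x with MINOR >= 1.

```cpp
#include <string>

// Hypothetical helper: returns the default documentation root for a given
// BASE-VER.  The patch level is ignored, so 14.1.0 and 14.1.1 both map to
// the gcc-14.1.0/ subdirectory.
std::string default_doc_root(const std::string &root, int major, int minor) {
  if (minor == 0)
    return root;  // trunk: no release infix
  return root + "gcc-" + std::to_string(major) + "."
         + std::to_string(minor) + ".0/";
}
```

A root URL given explicitly by the user would bypass this helper
entirely, which is the point of moving the infix into the default.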

2024-04-24 Jakub Jelinek <jakub@redhat.com>

PR other/114738
* opts.cc (get_option_url): Revert 2024-04-17 changes.
* gcc-urlifier.cc: Don't include diagnostic-core.h.
(gcc_urlifier::make_doc_url): Revert 2024-04-17 changes.
* configure.ac (documentation-root-url): On release branches
append gcc-MAJOR.MINOR.0/ to the default DOCUMENTATION_ROOT_URL.
* doc/install.texi (--with-documentation-root-url=): Document
the change of the default.
* configure: Regenerate.

Revert "RISC-V: Support highpart register overlap for vwcvt"

This reverts commit bdad036da32f72b84a96070518e7d75c21706dc2.

bpf: define BPF feature pre-processor macros

This commit makes the BPF backend define the following macros for
c-family languages:

  __BPF_CPU_VERSION__

    This is a numeric value identifying the version of the BPF "cpu"
    for which GCC is generating code.

  __BPF_FEATURE_ALU32
  __BPF_FEATURE_JMP32
  __BPF_FEATURE_JMP_EXT
  __BPF_FEATURE_BSWAP
  __BPF_FEATURE_SDIV_SMOD
  __BPF_FEATURE_MOVSX
  __BPF_FEATURE_LDSX
  __BPF_FEATURE_GOTOL
  __BPF_FEATURE_ST

    These are defined if the corresponding "feature" is enabled.  The
    features are implicitly enabled by the BPF CPU version enabled,
    and most of them can also be enabled/disabled using
    target-specific -m[no-]FEATURE command line switches.
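
A short sketch of how user code might test these macros. The function
names are hypothetical, and returning 0 when __BPF_CPU_VERSION__ is not
defined is a convention of this sketch only.

```cpp
// Returns the BPF CPU version the compiler is generating code for, or 0
// when not compiling for BPF.
constexpr int bpf_cpu_version() {
#ifdef __BPF_CPU_VERSION__
  return __BPF_CPU_VERSION__;
#else
  return 0;
#endif
}

// True when the byte-swap feature is enabled for the selected BPF CPU.
constexpr bool bpf_has_bswap() {
#ifdef __BPF_FEATURE_BSWAP
  return true;
#else
  return false;
#endif
}
```

On a non-BPF host neither macro is defined, so the functions report 0
and false respectively.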

Note that this patch moves the definition of bpf_target_macros, that
implements TARGET_CPU_CPP_BUILTINS in the BPF backend, to a bpf-c.cc
file.  This is because we are now using facilities from c-family/* and
these features are not available in compilers like lto1.

A couple of tests are also added.
Tested in target bpf-unknown-none-gcc and host x86_64-linux-gnu.
No regressions.

gcc/ChangeLog

* config.gcc: Add bpf-c.o as a target object for C and C++.
* config/bpf/bpf.cc (bpf_target_macros): Move to bpf-c.cc.
* config/bpf/bpf-c.cc: New file.
(bpf_target_macros): Move from bpf.cc and define BPF CPU
feature macros.
* config/bpf/t-bpf: Add rules to build bpf-c.o.

gcc/testsuite/ChangeLog

* gcc.target/bpf/feature-macro-1.c: New test.
* gcc.target/bpf/feature-macro-2.c: Likewise.

tree-optimization/114787 - more careful loop update with CFG cleanup

When CFG cleanup removes a backedge we have to be more careful with
loop update. In particular we need to clear niter info and estimates
and if we remove the last backedge of a loop we have to also mark
it for removal to prevent a following basic block merging to associate
loop info with an unrelated header.

PR tree-optimization/114787
* tree-cfg.cc (remove_edge_and_dominated_blocks): When
removing a loop backedge clear niter info and when removing
the last backedge of a loop mark that loop for removal.

* gcc.dg/torture/pr114787.c: New testcase.

tree-optimization/114832 - wrong dominator info with vect peeling

When we update the dominator of the redirected exit after peeling
we check whether the immediate dominator was the loop header rather
than the exit source when we later want to just update it to the
new source. The following fixes this oversight.

PR tree-optimization/114832
* tree-vect-loop-manip.cc (slpeel_tree_duplicate_loop_to_edge_cfg):
Fix dominance check.

* gcc.dg/vect/pr114832.c: New testcase.

i386: Fix behavior for both using AVX10.1-256 in options and function attribute

When we are using -mavx10.1-256 in command line and avx10.1-256 in
target attribute together, zmm should never be generated. But current
GCC will generate zmm since it wrongly enables EVEX512 for non-explicitly
set AVX512. This patch will fix that issue.

gcc/ChangeLog:

* config/i386/i386-options.cc (ix86_valid_target_attribute_tree):
Check whether AVX512F is explicitly enabled.

gcc/testsuite/ChangeLog:

* gcc.target/i386/avx10_1-24.c: New test.

RISC-V: Add xfail test case for highpart overlap of vext.vf

We reverted the patch below for register group overlap, so add the
related insn test and mark it as xfail. We will remove the xfail
after we support the register overlap in GCC-15.

62685890d88 RISC-V: Support highpart overlap for vext.vf

The test suites below passed for this patch:
* The rv64gcv full regression test with isl build.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/unop_v_constraint-2.c: Adjust asm
check cond.
* gcc.target/riscv/rvv/base/pr112431-4.c: New test.
* gcc.target/riscv/rvv/base/pr112431-5.c: New test.
* gcc.target/riscv/rvv/base/pr112431-6.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>

Revert "RISC-V: Support highpart overlap for vext.vf"

This reverts commit 62685890d8861b72f812bfe171a20332df08bd49.

c++: Fix ICE with xobj parms and maybe incomplete decl-specifiers

This fixes a null dereference issue when decl_specifiers.type is not yet
provided.

gcc/cp/ChangeLog:

* parser.cc (cp_parser_parameter_declaration): Check if
decl_specifiers.type is null.

gcc/testsuite/ChangeLog:

* g++.dg/cpp23/explicit-obj-basic7.C: New test.

Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>

i386: Avoid =&r,r,r andn double-word alternative for ia32 [PR114810]

As discussed in the PR, on ia32 with its 8 GPRs, where 1 is always fixed
and another 2 often are as well, having an alternative which needs 3
double-word registers is just too much for RA.
The following patch splits that alternative into two: the one with o is
used even on ia32, while the one with the 3x r is used just for -m64/-mx32.
Tried to reduce the testcase further, but it wasn't easily possible.

2024-04-23 Jakub Jelinek <jakub@redhat.com>

PR target/114810
* config/i386/i386.md (*andn<dwi>3_doubleword_bmi): Split the =&r,r,ro
alternative into =&r,r,r enabled only for x64 and =&r,r,o.

* g++.target/i386/pr114810.C: New test.

Regenerate gcc.pot

* gcc.pot: Regenerate.

Fortran: check C_SIZEOF on additions from TS29113/F2018 [PR103496]

gcc/testsuite/ChangeLog:

PR fortran/103496
* gfortran.dg/c_sizeof_8.f90: New test.

c++/modules: deduced return type merging [PR114795]

When merging an imported function template specialization with an
existing one, if the existing one has an undeduced return type and the
imported one's is already deduced, we need to propagate the deduced type
since once we install the imported definition we won't get a chance to
deduce it by normal means.

So this patch makes is_matching_decl propagate the deduced return
type alongside our propagation of the exception specification.
Another option would be to propagate it later when installing the
imported definition from read_function_def, but it seems preferable
to do it sooner rather than later.

PR c++/114795

gcc/cp/ChangeLog:

* module.cc (trees_in::is_matching_decl): Propagate deduced
function return type.

gcc/testsuite/ChangeLog:

* g++.dg/modules/auto-4_a.H: New test.
* g++.dg/modules/auto-4_b.C: New test.

Reviewed-by: Jason Merrill <jason@redhat.com>

libbacktrace: test --compress-debug-sections=ARG for each ARG

This should fix a testsuite problem with Solaris ld that supports zlib
but not zlib-gabi.

* configure.ac: Test --compress-debug-sections=zlib-gnu and
--compress-debug-sections=zlib-gabi separately, setting new
automake conditionals.
* Makefile.am (ctestg, ctestg_alloc): Only build if
HAVE_COMPRESSED_DEBUG_ZLIB_GNU.
(ctesta, ctesta_alloc): Only build if
HAVE_COMPRESSED_DEBUG_ZLIB_GABI.
(ctestzstd_alloc): New test if HAVE_COMPRESSED_DEBUG_ZSTD.
* configure, Makefile.in: Regenerate.

testsuite: Adjust testsuite expectations for diagnostic spelling fixes

The nullability-00.m* tests unfortunately check the exact spelling of
the diagnostics I've changed earlier today.

2024-04-23 Jakub Jelinek <jakub@redhat.com>

* objc.dg/attributes/nullability-00.m: Adjust expected diagnostic
spelling: recognised -> recognized.
* obj-c++.dg/attributes/nullability-00.mm: Likewise.

Remove repeated information in -ftree-loop-distribute-patterns doc

We have:

       -ftree-loop-distribute-patterns
           Perform loop distribution of patterns that can be code generated with calls to a library.  This flag is enabled by default at -O2 and higher, and by -fprofile-use and -fauto-profile.

           This pass distributes the initialization loops and generates a call to memset zero.  For example, the loop

...

           and the initialization loop is transformed into a call to memset zero.  This flag is enabled by default at -O3.  It is also enabled by -fprofile-use and -fauto-profile.

Which mentions optimization flags twice and the repeated mention is out of
date, since we enable this option at -O2 as well.
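
As a rough illustration of the pattern this option covers (not part of the commit), the sketch below puts a zero-initialization loop next to the explicit memset call it is equivalent to. The function names are hypothetical, and whether GCC actually replaces the loop depends on the optimization level and the target.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: a plain zero-initialization loop, the kind of
// pattern loop distribution may turn into a call to memset.
void zero_with_loop (int *a, std::size_t n)
{
  for (std::size_t i = 0; i < n; ++i)
    a[i] = 0;
}

// The equivalent explicit library call.
void zero_with_memset (int *a, std::size_t n)
{
  std::memset (a, 0, n * sizeof *a);
}

// Both helpers leave the buffer in the same state.
void demo_zeroing ()
{
  int x[4] = {1, 2, 3, 4};
  int y[4] = {1, 2, 3, 4};
  zero_with_loop (x, 4);
  zero_with_memset (y, 4);
  assert (std::memcmp (x, y, sizeof x) == 0);
}
```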

gcc/ChangeLog:

* doc/invoke.texi (-ftree-loop-distribute-patterns): Remove duplicated
sentence about optimization flags implying this.

Further spelling fixes in translatable strings

This addresses the non-Oxford British English vs. US English spelling
nits in translatable strings.

I see various similar cases in m2 and rust FEs where they don't make it into
gcc.pot, guess those would be nice to get fixed too.

2024-04-23 Jakub Jelinek <jakub@redhat.com>

* config/darwin.opt (init): Spelling fix: initialiser -> initializer.
gcc/c-family/
* c-attribs.cc (handle_objc_nullability_attribute): Spelling fix:
recognised -> recognized.
gcc/m2/
* lang.opt (fdef=, fmod=): Spelling fix: recognise -> recognize.

Spelling fixes for translatable strings

I've run aspell on gcc.pot (just quickly skimming, so pressing
I key hundreds of times and just stopping when I catch something that
looks like a misspelling).

I plan to commit this tomorrow as obvious unless somebody finds some
issues in it, you know, I'm not a native English speaker.
Yes, I know favour is valid UK spelling, but we spell the US way I think.
I've left some *ise* -> *ize* cases (recognise, initialise), those
had too many hits, though in translatable strings just 4, so maybe
worth changing too:
msgid "recognise the specified suffix as a definition module filename"
msgid "recognise the specified suffix as implementation and module filenames"
"initialiser for a dylib."
msgid "%qE attribute argument %qE is not recognised"

2024-04-23 Jakub Jelinek <jakub@redhat.com>

* config/epiphany/epiphany.opt (may-round-for-trunc): Spelling fix:
floatig -> floating.
* config/riscv/riscv.opt (mcsr-check): Spelling fix: CRS -> CSR.
* params.opt (-param=ipa-cp-profile-count-base=): Spelling fix:
frequncy -> frequency.
gcc/c-family/
* c.opt (Wstrict-flex-arrays): Spelling fix: inproper -> improper.
gcc/cp/
* parser.cc (cp_parser_using_declaration): Spelling fix: favour
-> favor.
gcc/m2/
* lang.opt (fuse-list=): Spelling fix: finalializations ->
finalizations.

s390: testsuite: Xfail forwprop-4{0,1}.c

The tests fail on s390 since can_vec_perm_const_p fails and therefore
the bit insert/ref survive which r14-3381-g27de9aa152141e aims for.
Strictly speaking, the tests only fail in case the target supports
vectors, i.e., for targets prior to z13 or in case of -mesa the emulated
vector operations are optimized out.

Set to xfail and tracked by PR114802.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/forwprop-40.c: Xfail for s390.
* gcc.dg/tree-ssa/forwprop-41.c: Xfail for s390.
* lib/target-supports.exp: Add target check s390_mvx.

Fortran: Check that the ICE does not reappear [PR102597]

2024-04-23 Paul Thomas <pault@gcc.gnu.org>

gcc/testsuite/
PR fortran/102597
* gfortran.dg/pr102597.f90: New test.

tree-optimization/114799 - SLP and patterns

The following plugs a hole with computing whether a SLP node has any
pattern stmts which is important to know when we want to replace it
by a CTOR from external defs.

PR tree-optimization/114799
* tree-vect-slp.cc (vect_get_and_check_slp_defs): Properly
update ->any_pattern when swapping operands.

* gcc.dg/vect/bb-slp-pr114799.c: New testcase.

s390x: Fix vec_xl/vec_xst type aliasing [PR114676]

The requirements of the vec_xl/vec_xst intrinsics wrt aliasing of the
pointer argument are not really documented. As it turns out, users
are likely to get it wrong. With this patch we let the pointer
argument alias everything in order to make it more robust for users.

gcc/ChangeLog:

PR target/114676
* config/s390/s390-c.cc (s390_expand_overloaded_builtin): Use a
MEM_REF with an addend of type ptr_type_node.

gcc/testsuite/ChangeLog:

PR target/114676
* gcc.target/s390/zvector/pr114676.c: New test.

Suggested-by: Jakub Jelinek <jakub@redhat.com>

c++: Copy over DECL_DISREGARD_INLINE_LIMITS flag to inheriting ctors [PR114784]

The following testcase is rejected with
error: inlining failed in call to 'always_inline' '...': call is unlikely and code size would grow
errors.  The problem is that starting with the r14-2149 change
we try to copy most of the attributes from the inherited to
inheriting ctor, but don't copy associated flags that decl_attributes
sets.

Now, the other clone_attrs user, cp/optimize.cc (maybe_clone_body)
copies over
      DECL_COMDAT (clone) = DECL_COMDAT (fn);
      DECL_WEAK (clone) = DECL_WEAK (fn);
      if (DECL_ONE_ONLY (fn))
        cgraph_node::get_create (clone)->set_comdat_group (cxx_comdat_group (clone));
      DECL_USE_TEMPLATE (clone) = DECL_USE_TEMPLATE (fn);
      DECL_EXTERNAL (clone) = DECL_EXTERNAL (fn);
      DECL_INTERFACE_KNOWN (clone) = DECL_INTERFACE_KNOWN (fn);
      DECL_NOT_REALLY_EXTERN (clone) = DECL_NOT_REALLY_EXTERN (fn);
      DECL_VISIBILITY (clone) = DECL_VISIBILITY (fn);
      DECL_VISIBILITY_SPECIFIED (clone) = DECL_VISIBILITY_SPECIFIED (fn);
      DECL_DLLIMPORT_P (clone) = DECL_DLLIMPORT_P (fn);
      DECL_DISREGARD_INLINE_LIMITS (clone) = DECL_DISREGARD_INLINE_LIMITS (fn);
The following patch just copies DECL_DISREGARD_INLINE_LIMITS to fix
this exact bug, not really sure which other flags should be copied
and which shouldn't.
Plus there are tons of other flags, some of which might need to be copied
too, some of which might not, perhaps in both places, like:
DECL_UNINLINABLE, maybe DECL_PRESERVE_P, TREE_USED, maybe
DECL_USER_ALIGN/DECL_ALIGN, maybe DECL_WEAK, maybe
DECL_NO_INSTRUMENT_FUNCTION_ENTRY_EXIT, DECL_NO_LIMIT_STACK.
TREE_READONLY, DECL_PURE_P, TREE_THIS_VOLATILE (for const, pure and
noreturn attributes) probably makes no sense, DECL_IS_RETURNS_TWICE neither
(returns_twice ctor?).  What about TREE_NOTHROW?
DECL_FUNCTION_SPECIFIC_OPTIMIZATION, DECL_FUNCTION_SPECIFIC_TARGET...

Anyway, another problem is that if inherited_ctor is a TEMPLATE_DECL, as
also can be seen in the using D<T>::D; case in the testcase, then
DECL_ATTRIBUTES (fn) = clone_attrs (DECL_ATTRIBUTES (inherited_ctor));
attempts to copy the attributes from the TEMPLATE_DECL which doesn't have
them.  The following patch copies them from STRIP_TEMPLATE (inherited_ctor)
which does.  E.g. DECL_DECLARED_CONSTEXPR_P works fine as the macro
itself uses STRIP_TEMPLATE too, but not 100% sure about other macros used
on inherited_ctor earlier.
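
For illustration only, here is a minimal sketch of the kind of code affected: a constructor marked always_inline that a derived class template inherits with a using-declaration. The class names are hypothetical and this is not the testcase added by the commit; it only shows the shape of an inheriting constructor whose base constructor carries the attribute.

```cpp
// Hypothetical base class template whose constructor is always_inline.
template <typename T>
struct Base
{
  T value;
  __attribute__ ((always_inline)) constexpr Base (T v) : value (v) {}
};

// The derived class inherits the constructor; the compiler implicitly
// declares Derived<T>::Derived (T), which forwards to Base<T>::Base (T).
template <typename T>
struct Derived : Base<T>
{
  using Base<T>::Base;
};

// Runtime use of the inheriting constructor.
int make_value (int v)
{
  return Derived<int> (v).value;
}
```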

2024-04-23  Jakub Jelinek  <jakub@redhat.com>

PR c++/114784
* method.cc (implicitly_declare_fn): Call clone_attrs
on DECL_ATTRIBUTES on STRIP_TEMPLATE (inherited_ctor) rather than
inherited_ctor.  Also copy DECL_DISREGARD_INLINE_LIMITS flag from it.

* g++.dg/cpp0x/inh-ctor39.C: New test.

c++: Check if allocation functions are xobj members [PR114078]

A class allocation member function is implicitly 'static' by
[class.free] p3, so cannot have an explicit object parameter.
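
As a small illustration of that rule (a sketch with hypothetical names, not code from the patch), class-specific allocation functions can be called through the class name with no object in existence, which is why there is nothing an explicit object parameter could bind to:

```cpp
#include <cassert>
#include <cstddef>
#include <new>

int tracked_live = 0;   // hypothetical counter of outstanding allocations

// Even without the 'static' keyword, these member allocation functions
// are static members.
struct Tracked
{
  void *operator new (std::size_t size)     // implicitly static
  {
    ++tracked_live;
    return ::operator new (size);
  }

  void operator delete (void *p)            // implicitly static
  {
    --tracked_live;
    ::operator delete (p);
  }
};

// Calls the allocation functions through the class name, with no
// Tracked object constructed, and reports the count seen in between.
int demo_tracked ()
{
  void *raw = Tracked::operator new (sizeof (Tracked));
  int during = tracked_live;
  Tracked::operator delete (raw);
  assert (tracked_live == during - 1);
  return during;
}
```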

PR c++/114078

gcc/cp/ChangeLog:

* decl.cc (grokdeclarator): Check allocation functions for xobj
parameters.

gcc/testsuite/ChangeLog:

* g++.dg/cpp23/explicit-obj-ops-alloc.C: New test.

Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>

LoongArch: Define builtin macros for ISA evolutions

Detailed description of these definitions can be found at
https://github.com/loongson/la-toolchain-conventions, which
the LoongArch GCC port aims to conform to.

gcc/ChangeLog:

* config.gcc: Add loongarch-evolution.o.
* config/loongarch/genopts/genstr.sh: Enable generation of
loongarch-evolution.[cc,h].
* config/loongarch/t-loongarch: Likewise.
* config/loongarch/genopts/gen-evolution.awk: New file.
* config/loongarch/genopts/isa-evolution.in: Mark ISA version
of introduction for each ISA evolution feature.
* config/loongarch/loongarch-c.cc (loongarch_cpu_cpp_builtins):
Define builtin macros for enabled ISA evolutions and the ISA
version.
* config/loongarch/loongarch-cpu.cc: Use loongarch-evolution.h.
* config/loongarch/loongarch.h: Likewise.
* config/loongarch/loongarch-cpucfg-map.h: Delete.
* config/loongarch/loongarch-evolution.cc: New file.
* config/loongarch/loongarch-evolution.h: New file.
* config/loongarch/loongarch-opts.h (ISA_HAS_FRECIPE): Define.
(ISA_HAS_DIV32): Likewise.
(ISA_HAS_LAM_BH): Likewise.
(ISA_HAS_LAMCAS): Likewise.
(ISA_HAS_LD_SEQ_SA): Likewise.

LoongArch: Define ISA versions

These ISA versions are defined as -march= parameters and
are recommended for building binaries for distribution.

Detailed description of these definitions can be found at
https://github.com/loongson/la-toolchain-conventions, which
the LoongArch GCC port aims to conform to.

gcc/ChangeLog:

* config.gcc: Make la64v1.0 the default ISA preset of the lp64d ABI.
* config/loongarch/genopts/loongarch-strings: Define la64v1.0, la64v1.1.
* config/loongarch/genopts/loongarch.opt.in: Likewise.
* config/loongarch/loongarch-c.cc (LARCH_CPP_SET_PROCESSOR): Likewise.
(loongarch_cpu_cpp_builtins): Likewise.
* config/loongarch/loongarch-cpu.cc (get_native_prid): Likewise.
(fill_native_cpu_config): Likewise.
* config/loongarch/loongarch-def.cc (array_tune): Likewise.
* config/loongarch/loongarch-def.h: Likewise.
* config/loongarch/loongarch-driver.cc (driver_set_m_parm): Likewise.
(driver_get_normalized_m_opts): Likewise.
* config/loongarch/loongarch-opts.cc (default_tune_for_arch): Likewise.
(TUNE_FOR_ARCH): Likewise.
(arch_str): Likewise.
(loongarch_target_option_override): Likewise.
* config/loongarch/loongarch-opts.h (TARGET_uARCH_LA464): Likewise.
(TARGET_uARCH_LA664): Likewise.
* config/loongarch/loongarch-str.h (STR_CPU_ABI_DEFAULT): Likewise.
(STR_ARCH_ABI_DEFAULT): Likewise.
(STR_TUNE_GENERIC): Likewise.
(STR_ARCH_LA64V1_0): Likewise.
(STR_ARCH_LA64V1_1): Likewise.
* config/loongarch/loongarch.cc (loongarch_cpu_sched_reassociation_width): Likewise.
(loongarch_asm_code_end): Likewise.
* config/loongarch/loongarch.opt: Likewise.
* doc/invoke.texi: Likewise.

RISC-V: Adjust overlap attr after revert d3544cea63d and e65aaf8efe1

After we reverted the 2 commits below, the references to attr need some
adjustment, as group_overlap is no longer available.

* RISC-V: Robostify the W43, W86, W87 constraint enabled attribute
* RISC-V: Rename vconstraint into group_overlap

The tests below pass with this patch.

* The rv64gcv full regression tests.

gcc/ChangeLog:

* config/riscv/vector-crypto.md:

Signed-off-by: Pan Li <pan2.li@intel.com>

PR modula2/114811 string set incl ICE bugfix

This patch corrects gm2-torture.exp to recognize an ICE
in the fail case as a negative result. The patch also fixes
FoldBinarySet so that the types are only checked once the operands
have been resolved. Without this patch
gcc/testsuite/gm2/iso/fail/badexpression2.mod would cause an ICE.

gcc/m2/ChangeLog:

PR modula2/114811
* gm2-compiler/M2GenGCC.mod (FoldBinarySet): Add condition
checking to ensure op2 and op3 are fully resolved before
type checking is performed.

gcc/testsuite/ChangeLog:

PR modula2/114811
* lib/gm2-torture.exp: Correct regexp checking for internal
compiler error strings in compiler output.

Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>

libstdc++: Fix conversion of simd to vector builtin

Signed-off-by: Matthias Kretz <m.kretz@gsi.de>
libstdc++-v3/ChangeLog:

PR libstdc++/114803
* include/experimental/bits/simd_builtin.h
(_SimdBase2::operator __vector_type_t): There is no __builtin()
function in _SimdWrapper, instead use its conversion operator.
* testsuite/experimental/simd/pr114803_vecbuiltin_cvt.cc: New
test.

libstdc++: Silence irrelevant warnings in <experimental/simd>

Avoid
-Wnarrowing in C code;
-Wtautological-compare in unconditional static_assert (necessary for
faking a dependency on a template parameter)
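
To illustrate the second item, here is a minimal C++17 sketch (hypothetical names, not the libstdc++ code) of a static_assert that has to depend on a template parameter. Writing the dependency as a comparison of the parameter with itself is what triggers -Wtautological-compare; the sketch instead tests the parameter against a value that no supported instantiation uses.

```cpp
#include <cstddef>

// Hypothetical helper: pick a signed integer type by size in bytes.
// The static_assert in the last branch must depend on Bytes, otherwise
// it would fire even when that branch is discarded.  Instead of a
// tautology such as "Bytes != Bytes", it checks for Bytes == 0, a size
// that no integer type has.
template <std::size_t Bytes>
constexpr auto demo_int_for_sizeof ()
{
  if constexpr (Bytes == sizeof (signed char))
    return static_cast<signed char> (0);
  else if constexpr (Bytes == sizeof (short))
    return static_cast<short> (0);
  else if constexpr (Bytes == sizeof (int))
    return 0;
  else if constexpr (Bytes == sizeof (long long))
    return 0LL;
  else
    static_assert (Bytes == 0, "no integer type of this size");
}
```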

Signed-off-by: Matthias Kretz <m.kretz@gsi.de>
libstdc++-v3/ChangeLog:

* include/experimental/bits/simd.h: Ignore -Wnarrowing for
arm_neon.h.
(__int_for_sizeof): Replace tautological compare with checking
for invalid template parameter value.
* include/experimental/bits/simd_builtin.h (__extract_part):
Remove tautological compare by combining two static_assert.

PR modula2/114807 badpointer3.mod causes an ICE

This patch fixes an ICE caused when a constant string
is built and attempted to be passed into a procedure with
an opaque type.

gcc/m2/ChangeLog:

PR modula2/114807
* gm2-compiler/M2Check.mod (checkUnbounded): Remove unused
local variables.
(constCheckMeta): Include check for IsReallyPointer in the
failure case.
* gm2-compiler/M2Quads.mod (MoveWithMode): Remove CopyConstString.
* gm2-compiler/SymbolTable.def (IsHiddenReallyPointer): Export.
* gm2-compiler/SymbolTable.mod (SkipHiddenType): Remove.
(IsReallyPointer): Include IsHiddenReallyPointer test.

gcc/testsuite/ChangeLog:

PR modula2/114807
* gm2/pim/fail/badproctype.mod: Change MYSHORTREAL
to SHORTREAL.
* gm2/pim/fail/badprocbool.mod: New test.
* gm2/pim/fail/badproccard.mod: New test.
* gm2/pim/fail/badprocint.mod: New test.
* gm2/pim/fail/badprocint2.mod: New test.
* gm2/pim/pass/goodproccard2.mod: New test.
* gm2/pim/pass/goodprocint.mod: New test.
* gm2/pim/pass/goodprocint3.mod: New test.
* gm2/pim/run/pass/genconststr.mod: New test.

Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>

libstdc++: Workaround kernel-headers on s390x-linux

We see
FAIL: 17_intro/headers/c++1998/all_attributes.cc   (test for excess errors)
FAIL: 17_intro/headers/c++2011/all_attributes.cc   (test for excess errors)
FAIL: 17_intro/headers/c++2014/all_attributes.cc   (test for excess errors)
FAIL: 17_intro/headers/c++2017/all_attributes.cc   (test for excess errors)
FAIL: 17_intro/headers/c++2020/all_attributes.cc   (test for excess errors)
FAIL: 17_intro/names.cc  -std=gnu++17 (test for excess errors)
on s390x-linux.
The first 5 are due to kernel-headers not using uglified attribute names,
where <asm/types.h> contains
__attribute__((packed, aligned(4)))
I've filed a downstream bugreport for this in
https://bugzilla.redhat.com/show_bug.cgi?id=2276084
(not really sure where to report kernel-headers issues upstream), while the
last one is due to <sys/ucontext.h> from glibc containing:
  #ifdef __USE_MISC
  # define __ctx(fld) fld
  #else
  # define __ctx(fld) __ ## fld
  #endif
  ...
  typedef union
    {
      double  __ctx(d);
      float   __ctx(f);
    } fpreg_t;
and g++ predefining -D_GNU_SOURCE, which implies that __USE_MISC is defined.
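
The effect of that macro can be reproduced in isolation. The sketch below uses hypothetical DEMO_ names in place of the glibc ones (and a non-reserved prefix instead of the double underscore) to show that, with the feature macro defined, the union members get plain one-letter names that a macro in a test could clash with.

```cpp
// Stand-in for glibc's __USE_MISC, which g++ enables via _GNU_SOURCE.
#define DEMO_USE_MISC 1

#ifdef DEMO_USE_MISC
# define DEMO_CTX(fld) fld            // plain member name
#else
# define DEMO_CTX(fld) demo_ ## fld   // prefixed member name
#endif

typedef union
{
  double DEMO_CTX (d);   // expands to "double d;" here
  float  DEMO_CTX (f);   // expands to "float f;" here
} demo_fpreg_t;

// Reads the member through its plain name, which only compiles because
// DEMO_USE_MISC is defined.  A "#define d ..." placed before the typedef
// would rewrite the member declaration itself.
double demo_read_d (demo_fpreg_t r)
{
  return r.d;
}
```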

The following patch adds a workaround for this on the libstdc++ testsuite
side.

2024-04-22  Jakub Jelinek  <jakub@redhat.com>

* testsuite/17_intro/names.cc (d, f): Undefine on s390*-linux*.
* testsuite/17_intro/headers/c++1998/all_attributes.cc (packed): Don't
define on s390.
* testsuite/17_intro/headers/c++2011/all_attributes.cc (packed):
Likewise.
* testsuite/17_intro/headers/c++2014/all_attributes.cc (packed):
Likewise.
* testsuite/17_intro/headers/c++2017/all_attributes.cc (packed):
Likewise.
* testsuite/17_intro/headers/c++2020/all_attributes.cc (packed):
Likewise.

testsuite: prune -freport-bug output

When the compiler defaults to -freport-bug, a few dg-ice tests fail
with:

Excess errors:
Preprocessed source stored into /tmp/cc6hldZ0.out file, please attach this to your bugreport.

We could add -fno-report-bug to those tests. But it seems to me that a
better fix would be to prune the "Preprocessed source stored..." message
in prune_gcc_output.

gcc/testsuite/ChangeLog:

* lib/prune.exp (prune_gcc_output): Also prune -freport-bug output.

Reviewed-by: Jakub Jelinek <jakub@redhat.com>

Revert "RISC-V: Rename vconstraint into group_overlap"

This reverts commit e65aaf8efe1900f7bbf76235a078000bf2ec8b45.

Revert "RISC-V: Robostify the W43, W86, W87 constraint enabled attribute"

This reverts commit d3544cea63d0a642b6357a7be55986f5562beaa0.

i386: Fix Sierra Forest auto dispatch

gcc/ChangeLog:

* common/config/i386/i386-common.cc (processor_alias_table):
Let Sierra Forest map to CPU_TYPE enum.

s390x: Do not default to -mvx for -mesa

We currently enable the vector extensions also for -march=z13 -m31
-mesa which is very wrong.

gcc/ChangeLog:

* config/s390/s390.cc (s390_option_override_internal): Check zarch
flag before enabling -mvx.

RISC-V: Add xfail test case for highpart overlap floating-point widen insn

We reverted the patch below for register group overlap, so add the related
insn test and mark it as xfail.  We will remove the xfail
after we support the register overlap in GCC-15.

8614cbb2534 RISC-V: Support highpart overlap for floating-point widen instructions

The test suites below pass.
* The rv64gcv full regression test.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/pr112431-10.c: New test.
* gcc.target/riscv/rvv/base/pr112431-11.c: New test.
* gcc.target/riscv/rvv/base/pr112431-12.c: New test.
* gcc.target/riscv/rvv/base/pr112431-13.c: New test.
* gcc.target/riscv/rvv/base/pr112431-14.c: New test.
* gcc.target/riscv/rvv/base/pr112431-15.c: New test.
* gcc.target/riscv/rvv/base/pr112431-7.c: New test.
* gcc.target/riscv/rvv/base/pr112431-8.c: New test.
* gcc.target/riscv/rvv/base/pr112431-9.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>

Revert "RISC-V: Support highpart overlap for floating-point widen instructions"

This reverts commit 8614cbb253484e28c3eb20cde4d1067aad56de58.

RISC-V: Add xfail test case for indexed load overlap with SRC EEW < DEST EEW

Update in v2:
* Add change log to pr112431-34.c.

Original log:

We reverted the patch below for register group overlap, so add the related
insn test and mark it as xfail.  We will remove the xfail
after we support the register overlap in GCC-15.

4418d55bcd1 RISC-V: Support highpart overlap for indexed load with SRC EEW < DEST EEW

The test suites below pass.
* The rv64gcv full regression test.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/pr112431-34.c: Remove xfail for vluxei8 check.
* gcc.target/riscv/rvv/base/pr112431-28.c: New test.
* gcc.target/riscv/rvv/base/pr112431-29.c: New test.
* gcc.target/riscv/rvv/base/pr112431-30.c: New test.
* gcc.target/riscv/rvv/base/pr112431-31.c: New test.
* gcc.target/riscv/rvv/base/pr112431-32.c: New test.
* gcc.target/riscv/rvv/base/pr112431-33.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>

Revert "RISC-V: Support highpart overlap for indexed load with SRC EEW < DEST EEW"

This reverts commit 4418d55bcd1b7e0ef823981b6a781d7de5c38cce.

s390: testsuite: Remove xfail for vpopct{b,h}

Starting with r14-9316-g7890836de20912 patterns for vpopct{b,h} are also
detected. Thus, remove xfails.

gcc/testsuite/ChangeLog:

* gcc.target/s390/vxe/popcount-1.c: Remove xfail.

RISC-V: Add xfail test case for highest-number regno ternary overlap

We reverted the patch below for register group overlap, so add the related
insn test and mark it as xfail.  We will remove the xfail
after we support the register overlap in GCC-15.

27fde325d64 RISC-V: Support highest-number regno overlap for widen ternary

The test suites below pass.
* The rv64gcv full regression test.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/pr112431-37.c: New test.
* gcc.target/riscv/rvv/base/pr112431-38.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>

Revert "RISC-V: Support highest-number regno overlap for widen ternary"

This reverts commit 27fde325d64447a3a0d5d550c5976e5f3fb6dc16.

RISC-V: Add xfail test case for widening register overlap of vf4/vf8

We reverted the patch below for register group overlap, so add the related
insn test and mark it as xfail.  We will remove the xfail
after we support the register overlap in GCC-15.

303195e2a6b RISC-V: Support widening register overlap for vf4/vf8

The test suites below pass.
* The rv64gcv full regression test.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/pr112431-16.c: New test.
* gcc.target/riscv/rvv/base/pr112431-17.c: New test.
* gcc.target/riscv/rvv/base/pr112431-18.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>

Revert "RISC-V: Support widening register overlap for vf4/vf8"

This reverts commit 303195e2a6b6f0e8f42e0578b61f9f37c6250beb.

RISC-V: Add xfail test case for highpart register overlap of vx/vf widen

We reverted the patch below for register group overlap, so add the related
insn test and mark it as xfail.  We will remove the xfail
after we support the register overlap in GCC-15.

a23415d7572 RISC-V: Support highpart register overlap for widen vx/vf instructions

The test suites below pass.
* The rv64gcv full regression test.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/pr112431-22.c: New test.
* gcc.target/riscv/rvv/base/pr112431-23.c: New test.
* gcc.target/riscv/rvv/base/pr112431-24.c: New test.
* gcc.target/riscv/rvv/base/pr112431-25.c: New test.
* gcc.target/riscv/rvv/base/pr112431-26.c: New test.
* gcc.target/riscv/rvv/base/pr112431-27.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>

Fortran: Detect 'no implicit type' error in right place [PR103471]

2024-04-21 Paul Thomas <pault@gcc.gnu.org>

gcc/fortran
PR fortran/103471
* resolve.cc (resolve_actual_arglist): Catch variables silently
set as untyped, resetting the flag so that gfc_resolve_expr can
generate the no implicit type error.
(gfc_resolve_index_1): Block index expressions of unknown type
from being converted to default integer, avoiding the fatal
error in trans-decl.cc.
* symbol.cc (gfc_set_default_type): Remove '(symbol)' from the
'no IMPLICIT type' error message.
* trans-decl.cc (gfc_get_symbol_decl): Change fatal error locus
to that of the symbol declaration.
(gfc_trans_deferred_vars): Remove two trailing tabs.

gcc/testsuite/
PR fortran/103471
* gfortran.dg/pr103471.f90: New test.

AVR: target/114794 - Tweak __udivmodqi4

libgcc/
PR target/114794
* config/avr/lib1funcs.S (__udivmodqi4): Tweak.

Revert "RISC-V: Support highpart register overlap for widen vx/vf instructions"

This reverts commit a23415d7572774701d7ec04664390260ab9a3f63.

RISC-V: Add xfail test case for incorrect overlap on v0

We reverted the patch below for register group overlap, so add the related
insn test and mark it as xfail.  We will remove the xfail
after we support the register overlap in GCC-15.

018ba3ac952 RISC-V: Fix overlap group incorrect overlap on v0

The test suites below pass.
* The rv64gcv full regression test.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/pr112431-34.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>

Revert "RISC-V: Fix overlap group incorrect overlap on v0"

This reverts commit 018ba3ac952bed4ae01344c060360f13f7cc084a.

PR modula2/112893 full type checking between proctype and procedure not implemented

This patch implements full type checking between proctype and procedures.
The change implements an associated proc type built for each
procedure.  M2Check.mod will request GetProcedureProcType if it encounters
a procedure.  Before this patch a procedure was associated with the type
ADDRESS in the type checking module M2Check.  The test
gm2/pim/pass/proccard.mod has been corrected now that this assumption has
been removed.

gcc/m2/ChangeLog:

PR modula2/112893
* gm2-compiler/M2Check.mod (GetProcedureProcType): Import.
(getType): Return value using GetProcedureProcType if sym is a
procedure.
* gm2-compiler/M2Range.mod (FoldTypeExpr): Remove quad if
expression is type compatible.
* gm2-compiler/SymbolTable.def (GetProcedureProcType): New
procedure function.
* gm2-compiler/SymbolTable.mod (Procedure): Add ProcedureType.
(MakeProcedure): Initialize ProcedureType.
(PutParam): Call AddProcedureProcTypeParam.
(PutVarParam): Call AddProcedureProcTypeParam.
(AddProcedureProcTypeParam): New procedure.
(GetProcedureProcType): New procedure function.

gcc/testsuite/ChangeLog:

PR modula2/112893
* gm2/pim/pass/another.mod: Correct bug exposed by type checker.
Swap ProcA and ProcB assignments.
* gm2/pim/pass/proccard.mod: Use VAL to convert procedure into a
cardinal.
* gm2/iso/const/fail/castproctype.mod: New test.
* gm2/pim/fail/badproctype.mod: New test.

Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>

RISC-V: Add xfail test case for wv insn highest overlap

We reverted the patch below for wv insn overlap, so add the related wv
insn test and mark it as xfail.  We will remove the xfail
after we support the register overlap in GCC-15.

7e854b58084 RISC-V: Support highest overlap for wv instructions

The test suites below pass.
* The rv64gcv full regression test.

gcc/testsuite/ChangeLog:

* gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul8-11.c: Xfail csr check.
* gcc.target/riscv/rvv/base/pr112431-39.c: New test.
* gcc.target/riscv/rvv/base/pr112431-40.c: New test.
* gcc.target/riscv/rvv/base/pr112431-41.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>

Revert "RISC-V: Support highest overlap for wv instructions"

This reverts commit 7e854b58084c131fceca9e8fa9dcc7469972e69d.

RISC-V: Add xfail test case for wv insn register overlap

We reverted the patch below for wv insn overlap, so add the related wv
insn test and mark it as xfail.  We will remove the xfail
after we support the register overlap in GCC-15.

b3b2799b872 RISC-V: Support one more overlap for wv instructions

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/pr112431-42.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>

Revert "RISC-V: Support one more overlap for wv instructions"

This reverts commit b3b2799b872bc4c1944629af9dfc8472c8ca5fe6.

i386: Fix up *avx2_eq<mode>3 constraints [PR114783]

The r14-4456 change (part of APX EGPR support) seems to have mistakenly
changed in the
@@ -16831,7 +16831,7 @@ (define_insn "*avx2_eq<mode>3"
   [(set (match_operand:VI_256 0 "register_operand" "=x")
        (eq:VI_256
          (match_operand:VI_256 1 "nonimmediate_operand" "%x")
-         (match_operand:VI_256 2 "nonimmediate_operand" "xm")))]
+         (match_operand:VI_256 2 "nonimmediate_operand" "jm")))]
   "TARGET_AVX2 && !(MEM_P (operands[1]) && MEM_P (operands[2]))"
   "vpcmpeq<ssemodesuffix>\t{%2, %1, %0|%0, %1, %2}"
   [(set_attr "type" "ssecmp")
hunk the xm constraint to jm, while in many other spots it changed correctly
xm to xjm.  The instruction doesn't require the last operand to be in
memory; it can handle three 256-bit registers just fine.  It is just a VEX-only
encoded instruction and so can't allow APX EGPR regs in the memory operand.

The following patch fixes it, so that we don't force one of the == operands
into memory all the time.
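
For context, the source-level operation involved is an element-wise == on 256-bit integer vectors. The sketch below uses the GNU vector extension with hypothetical names; it is not the testcase from the commit, and whether the compiler emits vpcmpeqd for it depends on the target options (for example -mavx2) and the optimization level.

```cpp
#include <cassert>

// Eight 32-bit lanes, 32 bytes in total (GNU vector extension).
typedef int v8si __attribute__ ((vector_size (32)));

// Count the lanes in which a and b hold equal values.
int count_equal_lanes (const v8si &a, const v8si &b)
{
  v8si eq = a == b;   // each lane is -1 where equal, 0 otherwise
  int n = 0;
  for (int i = 0; i < 8; ++i)
    n += eq[i] != 0;
  return n;
}

void demo_compare ()
{
  v8si a = {1, 2, 3, 4, 5, 6, 7, 8};
  v8si b = {1, 0, 3, 0, 5, 0, 7, 0};
  assert (count_equal_lanes (a, b) == 4);
}
```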

2024-04-20  Jakub Jelinek  <jakub@redhat.com>

PR target/114783
* config/i386/sse.md (*avx2_eq<mode>3): Change last operand's
constraint from "jm" to "xjm".

* gcc.target/i386/avx2-pr114783.c: New test.

c-family: Allow arguments with NULLPTR_TYPE as sentinels [PR114780]

While in C++ the ellipsis argument conversions include
"An argument that has type cv std::nullptr_t is converted to type void*"
in C23 a nullptr_t argument is not promoted in any way, but va_arg
description says:
"the type of the next argument is nullptr_t and type is a pointer type that has the same
representation and alignment requirements as a pointer to a character type."
So, while in C++ check_function_sentinel will never see NULLPTR_TYPE, for
C23 it can see that and currently we incorrectly warn about those.
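
For illustration, the C++ side of this can be sketched as follows (hypothetical function names, not the testcase from the commit). The nullptr argument is converted to void * when passed through the ellipsis, so the callee reads it back as a null pointer; the commit concerns the C23 case, where no such conversion happens.

```cpp
#include <cassert>
#include <cstdarg>

// Hypothetical variadic function: counts the string arguments that
// precede the terminating null pointer.  The sentinel attribute asks
// the compiler to check that calls end with a null pointer.
__attribute__ ((sentinel))
int count_strings (const char *first, ...)
{
  int n = 0;
  va_list ap;
  va_start (ap, first);
  for (const char *s = first; s != nullptr; s = va_arg (ap, const char *))
    ++n;
  va_end (ap);
  return n;
}

void demo_sentinel ()
{
  // nullptr acts as the sentinel that terminates the argument list.
  assert (count_strings ("a", "b", "c", nullptr) == 3);
}
```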

The only question is whether we should warn on any argument with
nullptr_t type or just about nullptr (nullptr_t argument with integer_zerop
value). Through undefined behavior guess one could pass non-NULL pointer
that way, say by union { void *p; nullptr_t q; } u; u.p = &whatever;
and pass u.q to ..., but valid code should always pass something that will
read as (char *) 0 when read using va_arg (ap, char *), so I think it is
better not to warn rather than warn in those cases.

Note, clang seems to pass (void *)0 rather than expression of nullptr_t
type to ellipsis in C23 mode as if it did the C++ ellipsis argument
conversions, in that case guess not warning about that would be even safer,
but what GCC does I think follows the spec more closely, even when in a
valid program one shouldn't be able to observe the difference.

2024-04-20 Jakub Jelinek <jakub@redhat.com>

PR c/114780
* c-common.cc (check_function_sentinel): Allow as sentinel any
argument of NULLPTR_TYPE.

* gcc.dg/format/sentinel-2.c: New test.

c: Fix ICE with -g and -std=c23 related to incomplete types [PR114361]

We did not update TYPE_CANONICAL for incomplete variants when
completing a structure.  We now set for flag_isoc23 TYPE_STRUCTURAL_EQUALITY_P
for incomplete structure and union types and then update TYPE_CANONICAL
later, though update it only for the variants and derived pointer types
which can be easily discovered.  Other derived types created while
the type was still incomplete will remain TYPE_STRUCTURAL_EQUALITY_P.
See PR114574 for discussion.

2024-04-20  Martin Uecker  <uecker@tugraz.at>
    Jakub Jelinek  <jakub@redhat.com>

PR lto/114574
PR c/114361
gcc/c/
* c-decl.cc (shadow_tag_warned): For flag_isoc23 and code not
ENUMERAL_TYPE use SET_TYPE_STRUCTURAL_EQUALITY.
(parser_xref_tag): Likewise.
(start_struct): For flag_isoc23 use SET_TYPE_STRUCTURAL_EQUALITY.
(c_update_type_canonical): New function.
(finish_struct): Put NULL as second == operand rather than first.
Assert TYPE_STRUCTURAL_EQUALITY_P.  Call c_update_type_canonical.
* c-typeck.cc (composite_type_internal): Use
SET_TYPE_STRUCTURAL_EQUALITY.  Formatting fix.
gcc/testsuite/
* gcc.dg/pr114574-1.c: New test.
* gcc.dg/pr114574-2.c: New test.
* gcc.dg/pr114361.c: New test.
* gcc.dg/c23-tag-incomplete-1.c: New test.
* gcc.dg/c23-tag-incomplete-2.c: New test.

libstdc++: Simplify constraints on <=> for std::reference_wrapper

Instead of constraining these overloads in terms of synth-three-way we
can just check that the value_type is less-than-comparable, which is
what synth-three-way's constraints check.

The reason that I implemented these with constraints has now been filed
as LWG 4071, so add a comment about that too.

libstdc++-v3/ChangeLog:

* include/bits/refwrap.h (operator<=>): Simplify constraints.

libstdc++: Support link chains in std::chrono::tzdb::locate_zone [PR114770]

Since 2022 the TZif format defined in the zic(8) man page has said that
links can refer to other links, rather than only referring to a zone.
This isn't supported by the C++20 spec, which assumes that the target()
for a chrono::time_zone_link always names a chrono::time_zone, not
another chrono::time_zone_link.

This hasn't been a problem until now, because there are no entries in
the tzdata file that chain links together. However, Debian Sid has
changed the target of the Asia/Chungking link from the Asia/Shanghai
zone to the Asia/Chongqing link, creating a link chain. The libstdc++
code is unable to handle this, so chrono::locate_zone("Asia/Chungking")
will fail with the tzdata.zi file from Debian Sid.

It seems likely that the C++ spec will need a change to allow link
chains, so that the original structure of the IANA database can be fully
represented by chrono::tzdb. The alternative would be for chrono::tzdb
to flatten all chains when loading the data, so that a link's target is
always a zone, but this means throwing away information present in the
tzdata.zi input file.

In anticipation of a change to the spec, this commit adds support for
chained links to libstdc++. When a name is found to be a link, we try to
find its target in the list of zones as before, but now if the target
isn't the name of a zone we don't fail. Instead we look for another link
with that name, and keep doing that until we reach the end of the chain
of links, and then look up the last target as a zone.

This new logic would get stuck in a loop if the tzdata.zi file is buggy
and defines a link chain that contains a cycle, e.g. two links that
refer to each other. To deal with that unlikely case, we use the
tortoise and hare algorithm to detect cycles in link chains, and throw
an exception if we detect a cycle. Cycles in links should never happen,
and it is expected that link chains will be short (if they occur at all)
and so the code is optimized for short chains without cycles. Longer
chains (four or more links) and cycles will do more work, but won't fail
to resolve a chain or get stuck in a loop.
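
The chain walk with cycle detection described above can be sketched as
follows. This is only an illustration and not the code in tzdb.cc: ToyDb
and resolve are hypothetical names, the database is modelled as a plain
set of zone names plus a map from link name to target name, and the
short-chain fast path mentioned above is not modelled.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <set>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the parsed database (not the libstdc++ types):
// a set of zone names and a map from link name to target name.
struct ToyDb {
  std::set<std::string> zones;
  std::map<std::string, std::string> links;
};

// Returns the zone that `name` resolves to, or nullopt if the name is
// unknown or the chain ends at something that is not a zone.
// Throws std::runtime_error if the link chain contains a cycle.
std::optional<std::string> resolve(const ToyDb& db, const std::string& name)
{
  if (db.zones.count(name))
    return name;
  auto slow = db.links.find(name);   // tortoise
  if (slow == db.links.end())
    return std::nullopt;
  auto fast = slow;                  // hare
  for (;;) {
    // The hare follows two links per iteration.
    for (int i = 0; i < 2; ++i) {
      const std::string& target = fast->second;
      if (db.zones.count(target))
        return target;               // end of chain: target is a zone
      fast = db.links.find(target);
      if (fast == db.links.end())
        return std::nullopt;         // broken link: target does not exist
    }
    // The tortoise follows one link; the hare has already visited it.
    slow = db.links.find(slow->second);
    assert(slow != db.links.end());
    if (slow == fast)
      throw std::runtime_error("cycle in link chain");
  }
}
```

With the Debian Sid data described above, resolving "Asia/Chungking"
would follow the link to "Asia/Chongqing" and then reach the
"Asia/Shanghai" zone, while two links that refer to each other make the
hare catch up with the tortoise and trigger the exception.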

The new test file checks various forms of broken links and cycles.

Also add a new check in the testsuite that every element in the
get_tzdb().zones and get_tzdb().links sequences can be successfully
found using locate_zone.

libstdc++-v3/ChangeLog:

PR libstdc++/114770
* src/c++20/tzdb.cc (do_locate_zone): Support links that have
another link as their target.
* testsuite/std/time/tzdb/1.cc: Check that all zones and links
can be found by locate_zone.
* testsuite/std/time/tzdb/links.cc: New test.

Update gcc sv.po

* sv.po: Update.

internal-fn: Fix up expand_arith_overflow [PR114753]

During backporting I've noticed I've missed one return spot for the
restoration of the original flag_trapv flag value.
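
The patch itself only adds the missing manual restore. Purely as an
illustration of why one return path is easy to miss, here is a sketch of
a different technique, a scope guard that restores the flag on every
return. All names (toy_flag, FlagOverride, expand_toy) are hypothetical
and none of this is taken from internal-fn.cc.

```cpp
#include <cassert>

// Hypothetical global standing in for a compiler flag such as flag_trapv.
int toy_flag = 1;

// Sets the flag to a temporary value and restores the saved value when
// the guard goes out of scope, whichever return statement is taken.
class FlagOverride {
public:
  FlagOverride(int& flag, int temporary) : flag_(flag), saved_(flag)
  { flag_ = temporary; }
  ~FlagOverride() { flag_ = saved_; }
  FlagOverride(const FlagOverride&) = delete;
  FlagOverride& operator=(const FlagOverride&) = delete;
private:
  int& flag_;
  int saved_;
};

// Toy function with several return spots; none of them restores the
// flag by hand.
int expand_toy(int kind)
{
  FlagOverride guard(toy_flag, 0);
  assert(toy_flag == 0);   // the flag is cleared while the guard is alive
  if (kind == 0)
    return 10;             // early return: the destructor restores the flag
  if (kind == 1)
    return 20;
  return 30;
}
```

With manual save and restore, every one of the three return statements
would need its own restore, which is the kind of spot the fix above adds.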

2024-04-19 Jakub Jelinek <jakub@redhat.com>

PR middle-end/114753
* internal-fn.cc (expand_arith_overflow): Add one missing restore
of flag_trapv before return.

middle-end: refactor vect_recog_absolute_difference to simplify flow [PR114769]

Hi All,

As the reporter in PR114769 points out the control flow for the abd detection
is hard to follow.  This is because vect_recog_absolute_difference has two
different ways it can return true.

1. It can return true when the widening operation is matched, in which case
   unprom is set, half_type is not NULL and diff_stmt is not set.

2. It can return true when the widening operation is not matched, but the stmt
   being checked is a minus.  In this case unprom is not set, half_type is set
   to NULL and diff_stmt is set.  This is because to get to diff_stmt you have to
   dig through the abs statement and any possible promotions.

This however leads to complicated uses of the function at the call sites as the
exact semantic needs to be known to use it safely.

vect_recog_absolute_difference has two callers:

1. vect_recog_sad_pattern where if you return true with unprom not set, then
   *half_type will be NULL.  The call to vect_supportable_direct_optab_p will
   always reject it since there's no vector mode for NULL.  Note that if looking
   at the dump files, the convention in the dump files has always been that we
   first indicate that a pattern could possibly be recognized and then check that
   it's supported.

   This change somewhat incorrectly makes the diagnostic message get printed for
   "invalid" patterns.

2. vect_recog_abd_pattern, where if half_type is NULL, it then uses diff_stmt to
   set them.

This refactors the code so that it now has only one success condition, and
diff_stmt is always set to the minus statement in the abs if there is one.

The function now only returns success if the widening minus is found, in which
case unprom and half_type are set.

This then leaves it up to the caller to decide if they want to do anything with
diff_stmt.

Thanks,
Tamar

gcc/ChangeLog:

PR tree-optimization/114769
* tree-vect-patterns.cc:
(vect_recog_absolute_difference): Have only one success condition.
(vect_recog_abd_pattern): Handle further checks if
vect_recog_absolute_difference fails.

Enable 'gcc.dg/pr114768.c' for nvptx target [PR114768]

Follow-up to commit 9f295847a9c32081bdd0fe908ffba58e830a24fb
"rtlanal: Fix set_noop_p for volatile loads or stores [PR114768]": nvptx does
behave exactly as expected; see 'diff' of before vs. after the
'gcc/rtlanal.cc' code changes:

    PASS: gcc.dg/pr114768.c (test for excess errors)
    [-FAIL:-]{+PASS:+} gcc.dg/pr114768.c scan-rtl-dump final "\\(mem/v:"

    --- 0/pr114768.c.347r.final 2024-04-19 11:34:34.577037596 +0200
    +++ ./pr114768.c.347r.final 2024-04-19 12:08:00.118312524 +0200
    @@ -13,15 +13,27 @@
     ;;  entry block defs 1 [%stack] 2 [%frame] 3 [%args]
     ;;  exit block uses 1 [%stack] 2 [%frame]
     ;;  regs ever live
    -;;  ref usage r1={1d,2u} r2={1d,2u} r3={1d,1u}
    -;;    total ref usage 8{3d,5u,0e} in 1{1 regular + 0 call} insns.
    +;;  ref usage r1={1d,3u} r2={1d,3u} r3={1d,2u} r22={1d,1u} r23={1d,2u}
    +;;    total ref usage 16{5d,11u,0e} in 4{4 regular + 0 call} insns.
     (note 1 0 4 NOTE_INSN_DELETED)
     (note 4 1 2 2 [bb 2] NOTE_INSN_BASIC_BLOCK)
    -(note 2 4 3 2 NOTE_INSN_DELETED)
    +(insn 2 4 3 2 (set (reg/v/f:DI 23 [ p ])
    +        (unspec:DI [
    +                (const_int 0 [0])
    +            ] UNSPEC_ARG_REG)) "source-gcc/gcc/testsuite/gcc.dg/pr114768.c":8:1 14 {load_arg_regdi}
    +     (nil))
     (note 3 2 6 2 NOTE_INSN_FUNCTION_BEG)
    -(note 6 3 10 2 NOTE_INSN_DELETED)
    -(note 10 6 11 2 NOTE_INSN_EPILOGUE_BEG)
    -(jump_insn 11 10 12 2 (return) "source-gcc/gcc/testsuite/gcc.dg/pr114768.c":10:1 289 {return}
    +(insn 6 3 7 2 (set (reg:SI 22 [ _1 ])
    +        (mem/v:SI (reg/v/f:DI 23 [ p ]) [1 MEM[(volatile int *)p_3(D)]+0 S4 A32])) "source-gcc/gcc/testsuite/gcc.dg/pr114768.c":9:8 6 {*movsi_insn}
    +     (nil))
    +(insn 7 6 10 2 (set (mem:SI (reg/v/f:DI 23 [ p ]) [1 *p_3(D)+0 S4 A32])
    +        (reg:SI 22 [ _1 ])) "source-gcc/gcc/testsuite/gcc.dg/pr114768.c":9:6 6 {*movsi_insn}
    +     (expr_list:REG_DEAD (reg/v/f:DI 23 [ p ])
    +        (expr_list:REG_DEAD (reg:SI 22 [ _1 ])
    +            (nil))))
    +(note 10 7 13 2 NOTE_INSN_EPILOGUE_BEG)
    +(note 13 10 11 3 [bb 3] NOTE_INSN_BASIC_BLOCK)
    +(jump_insn 11 13 12 3 (return) "source-gcc/gcc/testsuite/gcc.dg/pr114768.c":10:1 289 {return}
          (nil)
      -> return)
     (barrier 12 11 0)

    --- 0/pr114768.s 2024-04-19 11:34:34.577037596 +0200
    +++ ./pr114768.s 2024-04-19 12:08:00.118312524 +0200
    @@ -13,5 +13,10 @@
     {
      .reg.u64 %ar0;
      ld.param.u64 %ar0, [%in_ar0];
    + .reg.u32 %r22;
    + .reg.u64 %r23;
    + mov.u64 %r23, %ar0;
    + ld.u32 %r22, [%r23];
    + st.u32 [%r23], %r22;
      ret;
     }

PR testsuite/114768
gcc/testsuite/
* gcc.dg/pr114768.c: Enable for nvptx target.

bpf: remove huge memory waste with string allocation.

The BPF backend was allocating an unnecessarily large string when
constructing CO-RE relocations for enum types.
This patch also verifies that those enumerators are valid for CO-RE,
returning an error otherwise.

gcc/ChangeLog:
* config/bpf/core-builtins.cc (get_index_for_enum_value): Create
function.
(pack_enum_value): Check for enumerator and error out.
(process_enum_value): Correct string allocation.

bpf: support more instructions to match CO-RE relocations

BPF allows several instructions to carry CO-RE relocations, regardless of
the position of the immediate field in the encoding.
In particular, not only does the MOV instruction allow a CO-RE
relocation of its immediate operand, but the LD and ST instructions can
also have a CO-RE relocation applied to their offset immediate operand,
even though those operands are encoded in different bits.
This patch moves matching from a more traditional matching of the
UNSPEC_CORE_RELOC pattern within a define_insn to a match within the
constraints of both immediate and address operands of the more generic
mov define_insn rule.

gcc/ChangeLog:
* config/bpf/bpf-protos.h (bpf_add_core_reloc): Renamed function
to bpf_output_move.
* config/bpf/bpf.cc (bpf_legitimate_address_p): Allow
UNSPEC_CORE_RELOC to match an address.
(bpf_insn_cost): Make UNSPEC_CORE_RELOC immediate moves
expensive to prioritize loads and stores.
(TARGET_INSN_COST): Add hook.
(bpf_output_move): Wrapper to call bpf_output_core_reloc.
(bpf_print_operand): Add support to print immediate operands
specified with the UNSPEC_CORE_RELOC.
(bpf_print_operand_address): Likewise, but to support
UNSPEC_CORE_RELOC in addresses.
(bpf_init_builtins): Flag BPF_BUILTIN_CORE_RELOC as NOTHROW.
* config/bpf/bpf.md: Wrap patterns for MOV, LD and ST
instruction with bpf_output_move call.
(mov_reloc_core<MM:mode>): Remove now spurious define_insn.
* config/bpf/constraints.md: Added "c" and "C" constraints to
match immediates represented with UNSPEC_CORE_RELOC.
* config/bpf/core-builtins.cc (bpf_add_core_reloc): Remove.
(bpf_output_core_reloc): Add function to create the CO-RE
relocations based on new matching rules.
* config/bpf/core-builtins.h (bpf_output_core_reloc): Add
prototype.
* config/bpf/predicates.md (core_imm_operand): Add predicate.
(mov_src_operand): Add match for core_imm_operand.

gcc/testsuite/ChangeLog:
* gcc.target/bpf/btfext-funcinfo.c: Updated to changes.
* gcc.target/bpf/core-builtin-fieldinfo-const-elimination.c:
Likewise.
* gcc.target/bpf/core-builtin-fieldinfo-existence-1.c: Likewise.
* gcc.target/bpf/core-builtin-fieldinfo-lshift-1-be.c: Likewise.
* gcc.target/bpf/core-builtin-fieldinfo-lshift-1-le.c: Likewise.
* gcc.target/bpf/core-builtin-fieldinfo-lshift-2.c: Likewise.
* gcc.target/bpf/core-builtin-fieldinfo-offset-1.c: Likewise.
* gcc.target/bpf/core-builtin-fieldinfo-rshift-1.c: Likewise.
* gcc.target/bpf/core-builtin-fieldinfo-rshift-2.c: Likewise.
* gcc.target/bpf/core-builtin-fieldinfo-sign-1.c: Likewise.
* gcc.target/bpf/core-builtin-fieldinfo-sign-2.c: Likewise.
* gcc.target/bpf/core-builtin-fieldinfo-size-1.c: Likewise.