Kwok Cheung Yeung [Fri, 22 Nov 2019 17:02:19 +0000 (22 09:02 -0800)]
[og9] Fix libgomp.oacc-fortran/lib-16.f90 test
2019-11-22 Kwok Cheung Yeung <kcy@codesourcery.com>
libgomp/
* testsuite/libgomp.oacc-fortran/lib-16.f90: Fix async-safety issue.
Kwok Cheung Yeung [Fri, 22 Nov 2019 17:22:48 +0000 (22 09:22 -0800)]
Add missing ChangeLog.openacc entry
Add missing ChangeLog.openacc entry for the commit '[og9] Backport AMD GCN
backend improvements from mainline'.
Kwok Cheung Yeung [Mon, 18 Nov 2019 21:26:50 +0000 (18 13:26 -0800)]
[og9] Backport AMD GCN backend improvements from mainline
2019-11-07 Kwok Cheung Yeung <kcy@codesourcery.com>
gcc/
* ira.c (setup_alloc_regs): Setup no_unit_alloc_regs for
frame pointer in multiple registers.
(ira_setup_eliminable_regset): Setup eliminable_regset,
ira_no_alloc_regs and regs_ever_live for frame pointer in
multiple registers.
2019-11-10 Kwok Cheung Yeung <kcy@codesourcery.com>
gcc/
* lra-spills.c (assign_spill_hard_regs): Do not spill into
registers in eliminable_regset.
2019-11-14 Kwok Cheung Yeung <kcy@codesourcery.com>
gcc/
* lra-spills.c (assign_spill_hard_regs): Check that the spill
register is suitable for the mode.
2019-11-15 Kwok Cheung Yeung <kcy@codesourcery.com>
gcc/
* config/gcn/gcn.c (gcn_regno_reg_class): Return VCC_CONDITIONAL_REG
register class for VCC_LO and VCC_HI.
(gcn_spill_class): Use SGPR_REGS to spill registers in
VCC_CONDITIONAL_REG.
2019-11-15 Kwok Cheung Yeung <kcy@codesourcery.com>
gcc/
* config/gcn/gcn.c (gcn_expand_prologue): Remove initialization and
prologue use of v0.
(print_operand_address): Use v1 for zero vector offset.
2019-11-15 Kwok Cheung Yeung <kcy@codesourcery.com>
gcc/
* config/gcn/gcn.c (gcn_init_cumulative_args): Call reinit_regs.
2019-11-15 Kwok Cheung Yeung <kcy@codesourcery.com>
gcc/
* config/gcn/gcn.c (default_requested_args): New.
(gcn_parse_amdgpu_hsa_kernel_attribute): Initialize requested args
set with default_requested_args.
(gcn_conditional_register_usage): Limit register usage of non-kernel
functions. Reassign fixed registers if a non-standard set of args is
requested.
* config/gcn/gcn.h (FIXED_REGISTERS): Fix registers according to ABI.
2019-11-15 Kwok Cheung Yeung <kcy@codesourcery.com>
gcc/
* config/gcn/gcn.c (MAX_NORMAL_SGPR_COUNT, MAX_NORMAL_VGPR_COUNT): New.
(gcn_conditional_register_usage): Use constants in place of hard-coded
values.
(gcn_hsa_declare_function_name): Set lower bound for number of
SGPRs/VGPRs in non-leaf kernels to MAX_NORMAL_SGPR_COUNT and
MAX_NORMAL_VGPR_COUNT.
2019-11-15 Kwok Cheung Yeung <kcy@codesourcery.com>
gcc/
* config/gcn/gcn.h (FIXED_REGISTERS): Unfix frame pointer.
(CALL_USED_REGISTERS): Make frame pointer callee-saved.
Julian Brown [Mon, 14 Oct 2019 20:12:39 +0000 (14 13:12 -0700)]
[og9] Re-do OpenACC private variable resolution
gcc/
* config/gcn/gcn-protos.h (gcn_goacc_adjust_gangprivate_decl): Rename
to...
(gcn_goacc_adjust_private_decl): ...this.
* config/gcn/gcn-tree.c (diagnostic-core.h): Include.
(gcn_goacc_adjust_gangprivate_decl): Rename to...
(gcn_goacc_adjust_private_decl): ...this. Add LEVEL parameter.
* config/gcn/gcn.c (TARGET_GOACC_ADJUST_GANGPRIVATE_DECL): Rename to...
(TARGET_GOACC_ADJUST_PRIVATE_DECL): ...this.
* config/nvptx/nvptx.c (tree-pretty-print.h): Include.
(nvptx_goacc_adjust_private_decl): New function.
(TARGET_GOACC_ADJUST_PRIVATE_DECL): Define hook using above function.
* doc/tm.texi.in (TARGET_GOACC_ADJUST_GANGPRIVATE_DECL): Rename to...
(TARGET_GOACC_ADJUST_PRIVATE_DECL): ...this.
* doc/tm.texi: Regenerated.
* internal-fn.c (expand_UNIQUE): Handle IFN_UNIQUE_OACC_PRIVATE.
* internal-fn.h (IFN_UNIQUE_CODES): Add OACC_PRIVATE.
* omp-low.c (omp_context): Remove oacc_partitioning_levels field.
(lower_oacc_reductions): Add PRIVATE_MARKER parameter. Insert before
fork.
(lower_oacc_head_tail): Add PRIVATE_MARKER parameter. Modify its
gimple call arguments as appropriate. Don't set
oacc_partitioning_levels in omp_context. Pass private_marker to
lower_oacc_reductions.
(oacc_record_private_var_clauses): Don't check for NULL ctx.
(make_oacc_private_marker): New function.
(lower_omp_for): Only call oacc_record_vars_in_bind for
OpenACC contexts. Create private marker and pass to
lower_oacc_head_tail.
(lower_omp_target): Remove unnecessary call to
oacc_record_private_var_clauses. Remove call to mark_oacc_gangprivate.
Create private marker and pass to lower_oacc_reductions.
(process_oacc_gangprivate_1): Remove.
(lower_omp_1): Only call oacc_record_vars_in_bind for OpenACC. Don't
iterate over contexts calling process_oacc_gangprivate_1.
* omp-offload.c (oacc_loop_xform_head_tail): Treat
private-variable markers like fork/join when transforming head/tail
sequences.
(execute_oacc_device_lower): Use IFN_UNIQUE_OACC_PRIVATE instead of
"oacc gangprivate" attributes to determine partitioning level of
variables.
* omp-sese.c (find_gangprivate_vars): New function.
(find_local_vars_to_propagate): Use GANGPRIVATE_VARS parameter instead
of "oacc gangprivate" attribute to determine which variables are
gang-private.
(oacc_do_neutering): Use find_gangprivate_vars.
* target.def (adjust_gangprivate_decl): Rename to...
(adjust_private_decl): ...this. Update documentation (briefly).
libgomp/
* testsuite/libgomp.oacc-fortran/gangprivate-attrib-1.f90: Use
oaccdevlow dump and update scanned output.
* testsuite/libgomp.oacc-fortran/gangprivate-attrib-2.f90: Likewise.
Add missing atomic to force worker partitioning for test variable.
Julian Brown [Wed, 16 Oct 2019 15:28:32 +0000 (16 08:28 -0700)]
[og9] Fix libgomp serial-dims.c test for AMD GCN
libgomp/
* testsuite/libgomp.oacc-c-c++-common/serial-dims.c: Support AMD GCN.
Tobias Burnus [Wed, 9 Oct 2019 20:26:19 +0000 (9 22:26 +0200)]
Fix OpenMP's use_device_ptr with Fortran array descriptors
gcc/fortran
* f95-lang.c (LANG_HOOKS_OMP_ARRAY_DATA): Set to gfc_omp_array_data.
* trans-array.c (gfc_conv_descriptor_data_get): Handle ref types.
* trans-openmp.c (gfc_omp_array_data): New.
* trans.h (gfc_omp_array_data): Declare.
gcc/
* hooks.c (hook_tree_tree_null): New.
* hooks.h (hook_tree_tree_null): Declare.
* langhooks-def.h (LANG_HOOKS_OMP_ARRAY_DATA): Define.
(LANG_HOOKS_FOR_TYPES_INITIALIZER): Use it.
* langhooks.h (lang_hooks_for_types): Add omp_array_data.
* omp-general.c (omp_is_optional_argument): Handle value+optional.
* omp-low.c (omp_context): Add array_data_map + present_map.
(install_var_field): Handle array descriptors.
(delete_omp_context): Free new maps.
(scan_sharing_clauses): Handle array descriptors.
(lower_omp_target): Ditto. Fix optional-arg present check.
gcc/testsuite/
* gfortran.dg/gomp/use_device_ptr1.f90: New.
* gfortran.dg/gomp/use_device_ptr2.f90: New.
* gfortran.dg/gomp/use_device_ptr3.f90: New.
libgomp/
* testsuite/libgomp.fortran/use_device_ptr1.f90: New.
Tobias Burnus [Tue, 8 Oct 2019 12:44:53 +0000 (8 14:44 +0200)]
Fortran - fix OpenMP 'target simd'
Backported from mainline.
gcc/fortran/
* parse.c (parse_executable): Add missing ST_OMP_TARGET_SIMD.
libgomp/
* testsuite/libgomp.fortran/target-simd.f90: New.
Tobias Burnus [Tue, 8 Oct 2019 12:08:49 +0000 (8 14:08 +0200)]
Fortran - Improve OpenMP/OpenACC diagnostic
Backported from mainline.
gcc/fortran/
* match.h (gfc_match_omp_eos_error): Renamed from gfc_match_omp_eos.
* openmp.c (gfc_match_omp_eos): Make static.
(gfc_match_omp_eos_error): New.
* parse.c (matchs, matchdo, matchds): Do as done for 'matcho' -
if error occurred after OpenMP/OpenACC directive matched, do not
try other directives.
(decode_oacc_directive, decode_omp_directive): Call new function
instead.
testsuite/
* gfortran.dg/goacc/continuation-free-form.f95: Update dg-error.
Tobias Burnus [Wed, 2 Oct 2019 12:55:22 +0000 (2 14:55 +0200)]
Backport Fortran OMP/ACC diagnostic patch
2019-10-02 Tobias Burnus <tobias@codesourcery.com>
Backported from mainline
2019-10-02 Tobias Burnus <tobias@codesourcery.com>
* openmp.c (gfc_match_omp_clauses): Show a clause-parsing
error if none was raised before.
* parse.c (matcha, matcho): If error occurred after
OpenMP/OpenACC directive matched, do not try other directives.
2019-10-02 Tobias Burnus <tobias@codesourcery.com>
Backported from mainline
2019-10-02 Tobias Burnus <tobias@codesourcery.com>
* gfortran.dg/goacc/asyncwait-1.f95: Handle new error message.
* gfortran.dg/goacc/asyncwait-2.f95: Likewise.
* gfortran.dg/goacc/asyncwait-3.f95: Likewise.
* gfortran.dg/goacc/asyncwait-4.f95: Likewise.
* gfortran.dg/goacc/default-2.f: Likewise.
* gfortran.dg/goacc/enter-exit-data.f95: Likewise.
* gfortran.dg/goacc/if.f95: Likewise.
* gfortran.dg/goacc/list.f95: Likewise.
* gfortran.dg/goacc/literal.f95: Likewise.
* gfortran.dg/goacc/loop-2-kernels-tile.f95: Likewise.
* gfortran.dg/goacc/loop-2-parallel-tile.f95: Likewise.
* gfortran.dg/goacc/loop-7.f95: Likewise.
* gfortran.dg/goacc/parallel-kernels-clauses.f95: Likewise.
* gfortran.dg/goacc/routine-6.f90: Likewise.
* gfortran.dg/goacc/several-directives.f95: Likewise.
* gfortran.dg/goacc/sie.f95: Likewise.
* gfortran.dg/goacc/tile-1.f90: Likewise.
* gfortran.dg/goacc/update-if_present-2.f90: Likewise.
* gfortran.dg/gomp/declare-simd-1.f90: Likewise.
* gfortran.dg/gomp/pr29759.f90: Likewise.
Julian Brown [Fri, 20 Sep 2019 20:53:10 +0000 (20 13:53 -0700)]
[og9] Handle references in OpenACC "private" clauses
gcc/
* gimplify.c (localize_reductions): Rewrite references for
OMP_CLAUSE_PRIVATE also.
libgomp/
* testsuite/libgomp.oacc-fortran/privatized-ref-1.f95: New test.
* testsuite/libgomp.oacc-c++/privatized-ref-2.C: New test.
* testsuite/libgomp.oacc-c++/privatized-ref-3.C: New test.
Tobias Burnus [Fri, 20 Sep 2019 16:23:38 +0000 (20 18:23 +0200)]
Backport from mainline
2019-09-20 Tobias Burnus <tobias@codesourcery.com>
* openmp.c (gfc_resolve_oacc_declare): Reject all
non variables but accept function result variables.
* trans-openmp.c (gfc_trans_omp_clauses): Handle
function-result variables for remaining cases.
2019-09-20 Tobias Burnus <tobias@codesourcery.com>
* gfortran.dg/goacc/parameter.f95: Change
dg-error as it is now detected earlier.
* gfortran.dg/goacc/pr85701.f90: Modify to
use a separate result variable.
* gfortran.dg/goacc/pr78260.f90: New.
* gfortran.dg/goacc/pr78260-2.f90: New.
* gfortran.dg/gomp/pr78260.f90: New.
* gfortran.dg/gomp/pr78260-2.f90: New.
* gfortran.dg/gomp/pr78260-3.f90: New.
Julian Brown [Thu, 19 Sep 2019 12:26:44 +0000 (19 05:26 -0700)]
[og9] Add 'ephemeral' parameter to GOMP_OFFLOAD_openacc_async_host2dev
libgomp/
* plugin/plugin-nvptx.c (GOMP_OFFLOAD_openacc_async_host2dev):
Add EPHEMERAL parameter, and FIXME function comment.
Tobias Burnus [Thu, 19 Sep 2019 13:57:08 +0000 (19 15:57 +0200)]
Reduce testsuite fails
gcc/testsuite/
2019-09-19 Tobias Burnus <tobias@codesourcery.com>
* gfortran.dg/goacc/classify-kernels-unparallelized.f95: Add
one dg-message for additional -fopt-info-optimized-omp output.
* gfortran.dg/goacc/classify-kernels.f95: Likewise.
* gfortran.dg/goacc/kernels-decompose-1.f95: Change 'note' to
'optimized' in dg-message.
Tobias Burnus [Wed, 18 Sep 2019 11:45:34 +0000 (18 13:45 +0200)]
libgomp - fix dg-warning line numbers
libgomp/
* testsuite/libgomp.oacc-c-c++-common/parallel-dims.c: Fix dg-warning
line numbers.
* testsuite/libgomp.oacc-c-c++-common/serial-dims.c: Likewise.
Tobias Burnus [Wed, 18 Sep 2019 08:27:39 +0000 (18 10:27 +0200)]
Use PRId64 if available
libgomp/
2019-09-18 Tobias Burnus <tobias@codesourcery.com>
* linux/gomp_print.c (gomp_print_integer): Use PRId64 if available,
otherwise cast for %ld.
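
The PRId64 fallback described in this entry can be sketched as follows. This is a minimal illustration, not the libgomp code: format_int64 is a hypothetical helper, and the fallback branch assumes that long is wide enough for the values being printed.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: format VALUE as decimal text into BUF.
   Returns the number of characters snprintf would have written.  */
int
format_int64 (char *buf, size_t len, int64_t value)
{
#ifdef PRId64
  /* Preferred path: the format macro matches int64_t exactly.  */
  return snprintf (buf, len, "%" PRId64, value);
#else
  /* Fallback: cast explicitly so the argument matches %ld.  */
  return snprintf (buf, len, "%ld", (long) value);
#endif
}
```

Either branch keeps the format specifier and the argument type in agreement, which is what avoids the format warning on targets where int64_t is not long.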
Tobias Burnus [Wed, 18 Sep 2019 06:44:20 +0000 (18 08:44 +0200)]
Silence compiler warnings
gcc/
2019-09-17 Tobias Burnus <tobias@codesourcery.com>
* config/gcn/gcn.c (gcn_expand_scalar_to_vector_address,
gcn_md_reorg): Remove unused statement.
(gcn_emutls_var_init): Add missing return - after sorry abort.
* config/gcn/gcn.md (movdi_symbol_save_scc): Fix condition.
* config/gcn/mkoffload.c (process_obj): Remove unused variables.
* gimplify.c (gomp_oacc_needs_data_present): Likewise.
(gimplify_adjust_omp_clauses): Fix condition by adding ().
* omp-low.c (process_oacc_gangprivate_1): Comment unused
parameter name to silence unused warning.
* omp-sese.c (omp_sese_number, omp_sese_pseudo): Remove
superfluous ().
(oacc_do_neutering): Use signed int to avoid a warning.
* tree-ssa-structalias.c (find_func_aliases_for_builtin_call,
find_func_clobbers): Use unsigned to silence warning.
gcc/fortran/
2019-09-17 Tobias Burnus <tobias@codesourcery.com>
* trans-expr.c (gfc_auto_dereference_var): Use passed loc argument.
Julian Brown [Wed, 11 Sep 2019 20:22:03 +0000 (11 13:22 -0700)]
[og9] Fix OpenACC "ephemeral" asynchronous host-to-device copies
libgomp/
* libgomp-plugin.h (GOMP_OFFLOAD_openacc_async_host2dev): Update
prototype.
* libgomp.h (gomp_copy_host2dev): Update prototype.
* oacc-host.c (host_openacc_async_host2dev): Add ephemeral parameter.
* oacc-mem.c (memcpy_tofrom_device): Update call to gomp_copy_host2dev.
(update_dev_host): Likewise.
* oacc-parallel.c (GOACC_enter_exit_data): Call async versions of
acc_attach/acc_detach/acc_detach_finalize functions.
* plugin/plugin-gcn.c (wait_for_queue_nonfull): Don't lock/unlock
aq->mutex here.
(queue_push_launch): Lock aq->mutex before calling
wait_for_queue_nonfull.
(queue_push_callback): Likewise.
(queue_push_asyncwait): Likewise.
(queue_push_placeholder): Likewise.
(GOMP_OFFLOAD_openacc_async_host2dev): Add ephemeral parameter. Copy
source data to temporary space immediately if true, and pass to
queue_push_copy.
(goacc_device_copy_async): Remove.
(gomp_copy_host2dev): Add ephemeral parameter. Update function comment.
Call async host2dev plugin hook directly.
(gomp_copy_dev2host): Call async dev2host plugin hook directly.
(gomp_map_vars_existing, gomp_map_pointer, gomp_attach_pointer,
gomp_detach_pointer): Update calls to gomp_copy_host2dev.
(gomp_map_vars_internal): Don't use coalescing buffer for asynchronous
copies. Update calls to gomp_copy_host2dev.
(gomp_update): Update calls to gomp_copy_host2dev.
* testsuite/libgomp.oacc-c-c++-common/deep-copy-10.c (main): Fix
async-safety issue. Increase number of iterations.
* testsuite/libgomp.oacc-fortran/lib-16-2.f90: Fix async-safety issue.
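
The "ephemeral" behaviour described in this entry (snapshot the source buffer at enqueue time when the caller may reuse it before the asynchronous copy runs) can be sketched roughly as below. All names here (struct pending_copy, enqueue_host2dev, run_pending_copy) are hypothetical and stand in for the plugin's queue machinery; a plain memcpy stands in for the device transfer.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical record of one queued host-to-device copy.  */
struct pending_copy
{
  void *dst;
  const void *src;   /* Either the caller's buffer or src_copy.  */
  void *src_copy;    /* Owned snapshot, or NULL if none was taken.  */
  size_t len;
};

/* Record a copy.  If EPHEMERAL, snapshot the source immediately so the
   caller may modify or free SRC before the copy actually runs.  */
bool
enqueue_host2dev (struct pending_copy *pc, void *dst, const void *src,
                  size_t len, bool ephemeral)
{
  pc->dst = dst;
  pc->src = src;
  pc->src_copy = NULL;
  pc->len = len;
  if (ephemeral && len > 0)
    {
      pc->src_copy = malloc (len);
      if (pc->src_copy == NULL)
        return false;
      memcpy (pc->src_copy, src, len);
      /* Copy from the snapshot, not from the caller's buffer.  */
      pc->src = pc->src_copy;
    }
  return true;
}

/* Perform the deferred copy and release the snapshot, if any.  */
void
run_pending_copy (struct pending_copy *pc)
{
  memcpy (pc->dst, pc->src, pc->len);
  free (pc->src_copy);
  pc->src_copy = NULL;
}
```

With ephemeral set, a later write to the caller's buffer does not affect what reaches the destination. Without it, the deferred copy sees whatever the buffer holds when the copy finally runs.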
Julian Brown [Wed, 11 Sep 2019 03:34:45 +0000 (10 20:34 -0700)]
[og9] OpenACC profiling-interface fixes for asynchronous operations
libgomp/
* oacc-host.c (host_openacc_async_queue_callback): Invoke callback
function immediately.
* oacc-parallel.c (struct async_prof_callback_info, async_prof_dispatch,
queue_async_prof_dispatch): New.
(GOACC_parallel_keyed): Call queue_async_prof_dispatch for asynchronous
profile-event dispatches.
(GOACC_enter_exit_data): Likewise.
(GOACC_update): Likewise.
* testsuite/libgomp.oacc-c-c++-common/acc_prof-init-1.c
(cb_compute_construct_start): Remove/fix TODO.
* testsuite/libgomp.oacc-c-c++-common/acc_prof-parallel-1.c
(cb_exit_data_start): Tweak expected state values.
(cb_exit_data_end): Likewise.
(cb_compute_construct_start): Remove/fix TODO.
(cb_compute_construct_end): Don't do adjustments for
acc_ev_enqueue_launch_start/acc_ev_enqueue_launch_end callbacks.
(cb_compute_construct_end): Tweak expected state values.
(cb_enqueue_launch_start, cb_enqueue_launch_end): Don't expect
launch-enqueue operations to happen synchronously with respect to
profiling events on async streams.
(main): Tweak expected state values.
* testsuite/libgomp.oacc-c-c++-common/lib-94.c (main): Reorder
operations for async-safety.
Julian Brown [Mon, 16 Sep 2019 20:02:31 +0000 (16 13:02 -0700)]
[og9] Fix uninitialised read in gomp_map_vars_internal
libgomp/
* target.c (gomp_map_vars_internal): Remove read of uninitialised
data.
Julian Brown [Fri, 13 Sep 2019 01:03:17 +0000 (12 18:03 -0700)]
[og9] Update expected messages, errors and warnings for "kernels" tests
gcc/testsuite/
* c-c++-common/goacc/classify-kernels-unparallelized.c: Update expected
message/warning/error output.
* c-c++-common/goacc/classify-kernels.c: Likewise.
* c-c++-common/goacc/kernels-decompose.c: Likewise.
* c-c++-common/goacc/note-parallelism-1-kernels-loop-auto.c: Likewise.
* c-c++-common/goacc/note-parallelism-combined-kernels-loop-auto.c:
Likewise.
* c-c++-common/goacc/note-parallelism-kernels-loop-auto.c: Likewise.
* c-c++-common/goacc/routine-1.c: Likewise.
* c-c++-common/goacc/routine-4-extern.c: Likewise.
Julian Brown [Wed, 11 Sep 2019 15:31:38 +0000 (11 08:31 -0700)]
[og9] A couple of GCN-specific test fixes
libgomp/
* testsuite/libgomp.oacc-c-c++-common/async_queue-1.c: Only run
NVidia-specific test on NVidia hardware.
* testsuite/libgomp.oacc-c-c++-common/asyncwait-nop-1.c (main):
Initialise for acc_device_gcn if testing on AMD GCN.
* testsuite/libgomp.oacc-c-c++-common/function-not-offloaded.c: Support
AMD GCN.
* testsuite/libgomp.oacc-c-c++-common/loop-dim-default.c (check): Skip
vector dimension test for AMD GCN.
Tobias Burnus [Thu, 12 Sep 2019 16:07:53 +0000 (12 18:07 +0200)]
libgomp plugin-gcn - init string
libgomp/
2019-09-13 Tobias Burnus <tobias@codesourcery.com>
* plugin/plugin-gcn.c (hsa_warn, hsa_fatal, hsa_error): Ensure
string is initialized.
Julian Brown [Sun, 8 Sep 2019 23:15:16 +0000 (8 16:15 -0700)]
[og9] Clean up dead/write-only fields in GCN libgomp plugin
gcc/
* config/gcn/mkoffload.c (process_asm): Remove omp_data_size,
gridified_kernel_p, kernel_dependencies_count, kernel_dependencies
from emitted hsa_kernel_description struct array.
libgomp/
* plugin/plugin-gcn.c (GOMP_hsa_kernel_dispatch): Remove
omp_data_memory, kernel_dispatch_count, debug, omp_level,
children_dispatches and omp_num_threads fields.
(hsa_kernel_description): Remove omp_data_size, gridified_kernel_p,
kernel_dependencies_count, kernel_dependencies fields to match
mkoffload output.
(kernel_info): Remove omp_data_size, dependencies, dependencies_count,
max_omp_data_size and gridified_kernel_p fields.
(init_basic_kernel_info): Don't copy newly-deleted fields.
(create_single_kernel_dispatch): Remove omp_data_size parameter.
Remove write-only initialization of deleted GOMP_hsa_kernel_dispatch
fields.
(release_kernel_dispatch): Update debug output. Don't free deleted
omp_data_memory field.
(init_single_kernel): Remove max_omp_data_size parameter. Remove deleted
fields from debug output.
(print_kernel_dispatch): Don't print deleted fields.
(create_kernel_dispatch): Remove omp_data_size parameter.
(init_kernel): Update calls to init_single_kernel and
create_kernel_dispatch.
Julian Brown [Sun, 8 Sep 2019 23:04:54 +0000 (8 16:04 -0700)]
[og9] Improve async serialize implementation for AMD GCN libgomp plugin
libgomp/
* plugin/plugin-gcn.c (struct placeholder, struct asyncwait_info,
enum entry_type): New.
(queue_entry): Use entry_type enum for tag. Add asyncwait and
placeholder event type fields.
(wait_for_queue_nonfull): New function.
(queue_push_launch): Use above function instead of raising a fatal
error on queue-full condition. Use KERNEL_LAUNCH instead of hardwired
0.
(queue_push_callback): Use wait_for_queue_nonfull instead of open-coded
wait sequence. Use CALLBACK instead of hardwired 1.
(queue_push_asyncwait, queue_push_placeholder): New.
(execute_queue_entry): Implement ASYNC_WAIT and ASYNC_PLACEHOLDER event
types.
(GOMP_OFFLOAD_openacc_async_serialize): Use queue_push_placeholder and
queue_push_asyncwait instead of host-synchronized wait_queue calls.
* testsuite/libgomp.oacc-c-c++-common/data-2-lib.c (main): Add missing
asynchronous waits.
* testsuite/libgomp.oacc-c-c++-common/data-2.c (main): Likewise.
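
Waiting on a queue-full condition instead of raising a fatal error can be sketched with a mutex and a condition variable. The sketch below is an assumption-laden illustration, not the plugin's code: struct sketch_queue, QUEUE_CAPACITY and the sketch_* functions are hypothetical, and entries are plain integers rather than tagged queue entries.

```c
#include <pthread.h>
#include <stdbool.h>

#define QUEUE_CAPACITY 4   /* Hypothetical size, chosen for illustration.  */

struct sketch_queue
{
  pthread_mutex_t mutex;
  pthread_cond_t not_full;
  int entries[QUEUE_CAPACITY];
  int first;
  int count;
};

#define SKETCH_QUEUE_INIT \
  { PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, { 0 }, 0, 0 }

/* Block until there is room in the queue.  The caller must already
   hold q->mutex; pthread_cond_wait releases it while sleeping.  */
static void
sketch_wait_nonfull (struct sketch_queue *q)
{
  while (q->count == QUEUE_CAPACITY)
    pthread_cond_wait (&q->not_full, &q->mutex);
}

/* Producer side: take the lock, wait for room, then append.  */
void
sketch_push (struct sketch_queue *q, int entry)
{
  pthread_mutex_lock (&q->mutex);
  sketch_wait_nonfull (q);
  q->entries[(q->first + q->count) % QUEUE_CAPACITY] = entry;
  q->count++;
  pthread_mutex_unlock (&q->mutex);
}

/* Consumer side: remove the oldest entry and wake one waiting producer.  */
bool
sketch_pop (struct sketch_queue *q, int *entry)
{
  bool ok = false;
  pthread_mutex_lock (&q->mutex);
  if (q->count > 0)
    {
      *entry = q->entries[q->first];
      q->first = (q->first + 1) % QUEUE_CAPACITY;
      q->count--;
      ok = true;
      pthread_cond_signal (&q->not_full);
    }
  pthread_mutex_unlock (&q->mutex);
  return ok;
}
```

Keeping the lock acquisition in the push function rather than in the wait helper means the fullness check and the append happen under a single critical section.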
Julian Brown [Tue, 10 Sep 2019 15:33:48 +0000 (10 08:33 -0700)]
[og9] Fix src_copy mismerge in GOMP_OFFLOAD_openacc_async_host2dev
libgomp/
* plugin/plugin-gcn.c (GOMP_OFFLOAD_openacc_async_host2dev): Enqueue
copy from src_copy not src.
Kwok Cheung Yeung [Tue, 30 Jul 2019 14:10:53 +0000 (30 07:10 -0700)]
Fix memory leak in libgomp when using OpenMP
2019-09-10 Kwok Cheung Yeung <kcy@codesourcery.com>
libgomp/
* config/gcn/team.c (gomp_gcn_exit_kernel): Free GCN thread list.
Andrew Stubbs [Thu, 25 Jul 2019 10:26:45 +0000 (25 11:26 +0100)]
Detect number of GPU compute units.
2019-09-10 Andrew Stubbs <ams@codesourcery.com>
libgomp/
* plugin/plugin-gcn.c (HSA_AMD_AGENT_INFO_COMPUTE_UNIT_COUNT): Define.
(dump_hsa_agent_info): Dump compute unit count.
(get_cu_count): New function.
(parse_target_attributes): Use get_cu_count for default gdims.
(gcn_exec): Likewise.
Andrew Stubbs [Fri, 19 Jul 2019 16:06:50 +0000 (19 17:06 +0100)]
Use GFX9 granulated sgprs count correctly.
2019-09-10 Andrew Stubbs <ams@codesourcery.com>
gcc/
* config/gcn/gcn.c (gcn_hsa_declare_function_name): Calculate
granulated_sgprs according to architecture.
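
As a rough worked example of an architecture-dependent granulated SGPR count: the sketch below assumes the encodings given in public AMD GPU documentation, where pre-GFX9 parts count registers in blocks of 8 and GFX9 counts them in blocks of 16 with the field value doubled. The enum and function names are hypothetical, and this is not the code in gcn.c.

```c
#include <assert.h>

/* Hypothetical ISA selector for the sketch.  */
enum sketch_isa { SKETCH_GFX8, SKETCH_GFX9 };

/* Convert a raw SGPR count into the granulated field value.
   Assumed encodings:
     GFX8: ceil (sgprs / 8) - 1
     GFX9: 2 * (ceil (sgprs / 16) - 1)  */
int
sketch_granulated_sgprs (enum sketch_isa isa, int sgprs)
{
  assert (sgprs > 0);
  if (isa == SKETCH_GFX9)
    return 2 * ((sgprs + 15) / 16 - 1);
  return (sgprs + 7) / 8 - 1;
}
```

Under these assumptions, 24 SGPRs give a field value of 2 on both encodings, while 9 SGPRs give 1 on GFX8 but 0 on GFX9, so applying one formula to both architectures would misreport the allocation.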
Andrew Stubbs [Fri, 19 Jul 2019 14:45:07 +0000 (19 15:45 +0100)]
Fix relocations with multiple devices.
2019-09-10 Andrew Stubbs <ams@codesourcery.com>
libgomp/
* plugin/plugin-gcn.c (obstack_chunk_alloc): Delete.
(obstack_chunk_free): Delete.
(obstack.h): Remove include.
(create_and_finalize_hsa_program): Remove all unmodified_sections_os
and use sections directly from the issue.
Use "or 0x80" instead of SHT_NOTE to hide relocations, and then
simply recognise that ourselves.
Andrew Stubbs [Fri, 19 Jul 2019 11:00:53 +0000 (19 12:00 +0100)]
Move offload data into GPU memory.
2019-09-09 Andrew Stubbs <ams@codesourcery.com>
gcc/
* config/gcn/gcn-run.c (heap_region): New global variable.
(struct hsa_runtime_fn_info): Add hsa_memory_assign_agent_fn.
(init_hsa_runtime_functions): Initialize hsa_memory_assign_agent.
(get_kernarg_region): Move contents to ...
(get_memory_region): ... here.
(get_heap_region): New function.
(init_device): Initialize the heap_region.
(device_malloc): Add region parameter.
(struct kernargs): Move heap ...
(heap): ... to global scope.
(main): Allocate heap separate to kernargs.
libgomp/
* plugin/plugin-gcn.c (struct hsa_runtime_fn_info): Add
hsa_memory_assign_agent_fn.
(struct agent_info): Add data_region.
(init_hsa_runtime_functions): Initialize hsa_memory_assign_agent.
(get_kernarg_memory_region): Move contents to new function ...
(get_memory_region): ... here.
(get_data_memory_region): New function.
(GOMP_OFFLOAD_get_property): Use data_region, not kernarg_region.
(GOMP_OFFLOAD_init_device): Initialize data_region.
(create_and_finalize_hsa_program): Use data_region, not
kernarg_region, and assign heap to device agent.
(GOMP_OFFLOAD_alloc_by_agent): Likewise.
(image_address_p): Delete function.
(struct copy_data): Remove use_hsa_memory_copy.
(copy_data): Always use hsa_memory_copy.
(queue_push_copy): Remove use_hsa_memory_copy.
(GOMP_OFFLOAD_dev2host): Always use hsa_memory_copy.
(GOMP_OFFLOAD_host2dev): Likewise.
(GOMP_OFFLOAD_dev2dev): Likewise.
(gcn_exec): Use hsa_memory_copy.
(GOMP_OFFLOAD_openacc_async_host2dev): Always use hsa_memory_copy.
(GOMP_OFFLOAD_openacc_async_dev2host): Likewise.
ams [Thu, 6 Jun 2019 15:11:59 +0000 (6 15:11 +0000)]
Add -march=gfx906 for AMD GCN.
2019-09-06 Andrew Stubbs <ams@codesourcery.com>
Backport from mainline:
2019-06-06 Andrew Stubbs <ams@codesourcery.com>
gcc/
* config.gcc (amdgcn-*-*): Allow --with-arch=gfx906.
* config/gcn/gcn.opt (gpu_type): Add gfx906.
* config/gcn/t-gcn-hsa (MULTILIB_OPTIONS): Add gfx906 multilib.
(MULTILIB_DIRNAMES): Rename gcn5 to gfx900.
Add gfx906.
2019-06-07 Andrew Stubbs <ams@codesourcery.com>
gcc/
* doc/invoke.texi (AMD GCN Options): Add gfx906.
Julian Brown [Fri, 29 Jun 2018 19:16:11 +0000 (29 12:16 -0700)]
[og9] OpenACC profiling support for AMD GCN
2019-09-06 Julian Brown <julian@codesourcery.com>
libgomp/
* plugin/plugin-gcn.c (GOMP_OFFLOAD_alloc_by_agent,
GOMP_OFFLOAD_free, gcn_exec): Add profiling support.
* testsuite/libgomp.oacc-c-c++-common/acc_prof-init-1.c: Add GCN
support.
* testsuite/libgomp.oacc-c-c++-common/acc_prof-kernels-1.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/acc_prof-parallel-1.c: Likewise.
Julian Brown [Fri, 6 Sep 2019 15:46:20 +0000 (6 08:46 -0700)]
[og9] Use more appropriate var in localize_reductions call
gcc/
* gimplify.c (gimplify_omp_for): Use for_stmt in call to
localize_reductions.
Julian Brown [Fri, 6 Sep 2019 00:16:19 +0000 (5 17:16 -0700)]
[og9] Add omp_pause_resource{,_all} for AMD GCN
libgomp/
* config/gcn/target.c (omp_pause_resource, omp_pause_resource_all): New
functions, plus ialiases.
ams [Fri, 5 Jul 2019 16:00:46 +0000 (5 16:00 +0000)]
Tweak error message for mapped parameters.
2019-09-06 Andrew Stubbs <ams@codesourcery.com>
Backport from mainline:
2019-07-05 Andrew Stubbs <ams@codesourcery.com>
gcc/fortran/
* openmp.c (resolve_omp_clauses): Add custom error messages for
parameters in map clauses.
Julian Brown [Fri, 6 Sep 2019 11:53:17 +0000 (6 04:53 -0700)]
[og9] Remove duplicate SESE code in NVPTX backend
gcc/
* config/nvptx/nvptx.c (omp-sese.h): Include.
(bb_pair_t, bb_pair_vec_t, pseudo_node_t, bracket, bracket_vec_t,
bb_sese, bb_sese::~bb_sese, bb_sese::append, bb_sese::remove,
BB_SET_SESE, BB_GET_SESE, nvptx_sese_number, nvptx_sese_pseudo,
nvptx_sese_color, nvptx_find_sese): Remove.
(nvptx_neuter_pars): Call omp_find_sese instead of nvptx_find_sese.
* omp-sese.c (omp-sese.h): Include.
(struct parallel): Rename to...
(struct parallel_g): This.
(parallel::parallel, parallel::~parallel): Rename to...
(parallel_g::parallel_g, parallel_g::~parallel_g): These.
(omp_sese_dump_pars, omp_sese_find_par, omp_sese_discover_pars,
populate_single_mode_bitmaps, find_ssa_names_to_propagate,
find_partitioned_var_uses, find_local_vars_to_propagate,
neuter_worker_single): Update for parallel_g name change.
(bb_pair_t, bb_pair_vec_t): Remove.
(omp_find_sese): Make global.
* omp-sese.h (bb_pair_t, bb_pair_vec_t): New.
(omp_find_sese): Add prototype.
Julian Brown [Fri, 6 Sep 2019 11:42:16 +0000 (6 04:42 -0700)]
[og9] Fix tree check failure with reduction localization
gcc/
* gimplify.c (gimplify_omp_workshare): Use OMP_CLAUSES, OMP_BODY
instead of OMP_TARGET_CLAUSES, OMP_TARGET_BODY.
Andrew Stubbs [Thu, 5 Sep 2019 14:43:19 +0000 (5 15:43 +0100)]
Backport expcnt patches.
2019-09-05 Andrew Stubbs <ams@codesourcery.com>
Backport from mainline:
2019-07-31 Andrew Stubbs <ams@codesourcery.com>
gcc/
* config/gcn/gcn-valu.md
(scatter<mode>_insn_1offset<exec_scatter>): Remove s_waitcnt.
(scatter<mode>_insn_1offset_ds<exec_scatter>): Likewise.
(scatter<mode>_insn_2offsets<exec_scatter>): Likewise.
* config/gcn/gcn.c (gcn_md_reorg): Add delayeduse and reads to
struct ilist. Add nops for delayeduse insns.
* config/gcn/gcn.md (delayeduse): New attribute.
(*movbi): Remove s_waitcnt from stores.
(*mov<mode>_insn): Likewise.
(*movti_insn): Likewise. Add delayeduse attribute.
(sync_compare_and_swap<mode>_insn): Add delayeduse attribute.
(atomic_store<mode>): Remove or adjust s_waitcnt.
2019-09-05 Andrew Stubbs <ams@codesourcery.com>
gcc/
* config/gcn/gcn.md (*movti_insn): Set delayeduse for global_store.
(sync_compare_and_swap<mode>_insn): Likewise.
Julian Brown [Tue, 3 Sep 2019 15:57:39 +0000 (3 08:57 -0700)]
[og9] Enable worker partitioning for AMD GCN
gcc/
* config/gcn/gcn.c (gcn_goacc_validate_dims): Remove
no-flag_worker-partitioning assertion.
(TARGET_GOACC_WORKER_PARTITIONING): Define target hook to true.
* config/gcn/gcn.opt (flag_worker_partitioning): Change default to 1.
libgomp/
* plugin/plugin-gcn.c (gcn_exec): Change default number of workers to
16.
Julian Brown [Tue, 3 Sep 2019 15:54:28 +0000 (3 08:54 -0700)]
[og9] Reference reduction localization
gcc/
* gimplify.c (privatize_reduction): New struct.
(localize_reductions_r, localize_reductions): New functions.
(gimplify_omp_for): Call localize_reductions.
(gimplify_omp_workshare): Likewise.
* omp-low.c (lower_oacc_reductions): Handle localized reductions.
Create fewer temp vars.
* tree-core.h (omp_clause_code): Add OMP_CLAUSE_REDUCTION_PRIVATE_DECL
documentation.
* tree.c (omp_clause_num_ops): Bump number of ops for
OMP_CLAUSE_REDUCTION to 6.
(walk_tree_1): Adjust accordingly.
* tree.h (OMP_CLAUSE_REDUCTION_PRIVATE_DECL): Add macro.
Julian Brown [Tue, 3 Sep 2019 20:37:50 +0000 (3 13:37 -0700)]
[og9] Fix up tests for oaccdevlow pass splitting
gcc/testsuite/
* c-c++-common/goacc/classify-kernels-unparallelized.c,
c-c++-common/goacc/classify-kernels.c,
c-c++-common/goacc/classify-parallel.c,
c-c++-common/goacc/classify-routine.c,
gfortran.dg/goacc/classify-kernels-unparallelized.f95,
gfortran.dg/goacc/classify-kernels.f95,
gfortran.dg/goacc/classify-parallel.f95,
gfortran.dg/goacc/classify-routine.f95: Scan oaccloops dump instead of
oaccdevlow pass.
Julian Brown [Wed, 4 Sep 2019 23:33:02 +0000 (4 16:33 -0700)]
[og9] AMD GCN adjustments for middle-end worker partitioning
gcc/
* config/gcn/gcn-protos.h (gcn_goacc_adjust_propagation_record): Rename
prototype to...
(gcn_goacc_create_propagation_record): This.
* config/gcn/gcn-tree.c (gcn_goacc_adjust_propagation_record): Rename
function to...
(gcn_goacc_create_propagation_record): This. Adjust comment.
* config/gcn/gcn.c (gcn_init_builtins): Override decls for
BUILT_IN_GOACC_SINGLE_START, BUILT_IN_GOACC_SINGLE_COPY_START,
BUILT_IN_GOACC_SINGLE_COPY_END and BUILT_IN_GOACC_BARRIER.
(gcn_fork_join): Remove inaccurate comment.
(TARGET_GOACC_ADJUST_PROPAGATION_RECORD): Rename to...
(TARGET_GOACC_CREATE_PROPAGATION_RECORD): This.
Julian Brown [Wed, 11 Oct 2017 15:07:18 +0000 (11 08:07 -0700)]
[og9] OpenACC middle-end worker-partitioning support
gcc/
* Makefile.in (OBJS): Add omp-sese.o.
* omp-builtins.def (BUILT_IN_GOACC_BARRIER, BUILT_IN_GOACC_SINGLE_START,
BUILT_IN_GOACC_SINGLE_COPY_START, BUILT_IN_GOACC_SINGLE_COPY_END): New
builtins.
* omp-offload.c (omp-sese.h): Include header.
(oacc_loop_xform_head_tail): Call update_stmt for modified builtin
calls.
(oacc_loop_process): Likewise.
(default_goacc_create_propagation_record): New default implementation
for TARGET_GOACC_CREATE_PROPAGATION_RECORD hook.
(execute_oacc_loop_designation): New. Split out of oacc_device_lower.
(execute_oacc_gimple_workers): New. Likewise.
(execute_oacc_device_lower): Recreate dims array.
(pass_data_oacc_loop_designation, pass_data_oacc_gimple_workers): New.
(pass_oacc_loop_designation, pass_oacc_gimple_workers): New.
(make_pass_oacc_loop_designation, make_pass_oacc_gimple_workers): New.
* omp-offload.h (oacc_fn_attrib_level): Add prototype.
* omp-sese.c: New file.
* omp-sese.h: New file.
* passes.def (pass_oacc_loop_designation, pass_oacc_gimple_workers):
Add passes.
* target.def (worker_partitioning, create_propagation_record): Add
target hooks.
* targhooks.h (default_goacc_create_propagation_record): Add prototype.
* tree-pass.h (make_pass_oacc_loop_designation,
make_pass_oacc_gimple_workers): Add prototypes.
* doc/tm.texi.in (TARGET_GOACC_WORKER_PARTITIONING,
TARGET_GOACC_CREATE_PROPAGATION_RECORD): Add documentation hooks.
* doc/tm.texi: Regenerate.
Julian Brown [Tue, 3 Sep 2019 23:35:10 +0000 (3 16:35 -0700)]
[og9] Target-dependent gang-private variable decl rewriting
gcc/
* omp-offload.c (convert.h): Include.
(struct addr_expr_rewrite_info): Add struct.
(rewrite_addr_expr): New function.
(is_sync_builtin_call): New function.
(execute_oacc_device_lower): Support rewriting gang-private variables
using target hook, and fix up addr_expr nodes afterwards.
* target.def (adjust_gangprivate_decl): New target hook.
* doc/tm.texi.in (TARGET_GOACC_ADJUST_GANGPRIVATE_DECL): Document new
target hook.
* doc/tm.texi: Regenerate.
Julian Brown [Tue, 3 Sep 2019 14:57:05 +0000 (3 07:57 -0700)]
[og9] Fix libgomp.oacc-fortran/lib-13.f90 async bug
libgomp/
* testsuite/libgomp.oacc-fortran/lib-13.f90: End data region after
wait API calls.
Julian Brown [Tue, 13 Aug 2019 20:13:30 +0000 (13 13:13 -0700)]
[og9] Wait on queue-full condition in AMD GCN libgomp offloading plugin
libgomp/
* plugin/plugin-gcn.c (queue_push_callback): Wait on queue-full
condition.
Julian Brown [Tue, 13 Aug 2019 16:05:38 +0000 (13 09:05 -0700)]
[og9] Use temporary buffers for async host2dev copies
libgomp/
* plugin/plugin-gcn.c (struct copy_data): Add using_src_copy field.
(copy_data): Free temporary buffer if using.
(queue_push_copy): Add using_src_copy parameter.
(GOMP_OFFLOAD_dev2dev, GOMP_OFFLOAD_async_dev2host): Update calls to
queue_push_copy.
(GOMP_OFFLOAD_async_host2dev): Likewise. Allocate temporary buffer and
copy source data to it immediately.
* target.c (gomp_copy_host2dev): Update function comment.
(copy_host2dev_immediate): Remove.
(gomp_map_pointer, gomp_map_vars_internal): Replace calls to
copy_host2dev_immediate with calls to gomp_copy_host2dev.
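The motivation for the temporary buffer in the entry above can be sketched in plain C. If an asynchronous copy only records the source pointer, it reads whatever the host buffer holds when the queue is eventually drained; snapshotting the source at enqueue time preserves the values the caller meant to send. This is a minimal host-only sketch, not the plugin's code: the `pending_copy` record and the `push_copy`, `drain` and `demo` functions are hypothetical names chosen for illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical pending-copy record; not the plugin's real struct copy_data.  */
struct pending_copy {
  void *dst;
  const void *src;
  size_t len;
  int using_src_copy;   /* Nonzero if src is a private snapshot to free.  */
};

#define MAX_PENDING 8
static struct pending_copy queue[MAX_PENDING];
static int n_pending;

/* Enqueue a copy.  With snapshot != 0 the source bytes are duplicated
   immediately, so the caller may reuse its buffer right away.
   Returns 0 on success, -1 if the queue is full or allocation fails.  */
int push_copy(void *dst, const void *src, size_t len, int snapshot)
{
  if (n_pending == MAX_PENDING)
    return -1;
  struct pending_copy *c = &queue[n_pending];
  c->dst = dst;
  c->len = len;
  c->using_src_copy = snapshot;
  if (snapshot) {
    void *tmp = malloc(len);
    if (!tmp)
      return -1;
    memcpy(tmp, src, len);
    c->src = tmp;
  } else {
    c->src = src;
  }
  n_pending++;
  return 0;
}

/* Perform all queued copies, freeing any snapshots afterwards.  */
void drain(void)
{
  for (int i = 0; i < n_pending; i++) {
    memcpy(queue[i].dst, queue[i].src, queue[i].len);
    if (queue[i].using_src_copy)
      free((void *) queue[i].src);
  }
  n_pending = 0;
}

/* Returns the value that reaches the destination when the host
   overwrites its buffer between enqueue and drain.  */
int demo(int snapshot)
{
  int host = 1, dev = 0;
  push_copy(&dev, &host, sizeof host, snapshot);
  host = 2;             /* Host buffer reused before the copy has run.  */
  drain();
  return dev;
}
```

Calling `demo(1)` delivers the value that was current at enqueue time, while `demo(0)` delivers the later overwrite, which is the hazard the snapshot avoids.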
Julian Brown [Fri, 9 Aug 2019 20:01:33 +0000 (9 13:01 -0700)]
[og9] Wait at end of OpenACC asynchronous kernels regions
gcc/
* omp-oacc-kernels.c (add_wait): New function, split out of...
(add_async_clauses_and_wait): ...here. Call new outlined function.
(decompose_kernels_region_body): Add wait at the end of
explicitly-asynchronous kernels regions.
Julian Brown [Mon, 5 Aug 2019 22:05:58 +0000 (5 15:05 -0700)]
[og9] Use a single worker for OpenACC on AMD GCN
gcc/
* config/gcn/gcn.c (gcn_goacc_validate_dims): Ensure
flag_worker_partitioning is not set.
(TARGET_GOACC_WORKER_PARTITIONING): Remove target hook definition.
* config/gcn/gcn.opt (macc-experimental-workers): Default to off.
libgomp/
* plugin/plugin-gcn.c (gcn_exec): Use 1 for the default number of
workers.
Julian Brown [Wed, 7 Aug 2019 13:40:29 +0000 (7 06:40 -0700)]
[og9] Fix configury for AMD GCN testing
libgomp/
* plugin/configfrag.ac (amdgcn): Set tgt_plugin.
* testsuite/lib/libgomp.exp (offload_target_to_openacc_device_type):
Add AMD GCN support.
(check_effective_target_openacc_amdgcn_accel_selected): Test
offload_target instead of offload_target_openacc.
* testsuite/libgomp.oacc-c++/c++.exp (amdgcn*): Rename stanza to...
(gcn): ...this. Don't set tagopt redundantly here.
* testsuite/libgomp.oacc-c/c.exp (amdgcn*, gcn): Likewise.
* testsuite/libgomp.oacc-fortran/fortran.exp (amdgcn*, gcn): Likewise.
* configure: Regenerated.
Julian Brown [Mon, 5 Aug 2019 22:05:35 +0000 (5 15:05 -0700)]
[og9] Add missing exec_params libgomp plugin entry points
libgomp/
* plugin/plugin-gcn.c (GOMP_OFFLOAD_openacc_exec_params,
GOMP_OFFLOAD_openacc_async_exec_params): New functions.
Julian Brown [Wed, 31 Jul 2019 12:38:42 +0000 (31 05:38 -0700)]
[og9] Update parallel-dims.c and serial-dims.c warning line numbering.
* testsuite/libgomp.oacc-c-c++-common/parallel-dims.c: Use relative
line numbers for warning.
* testsuite/libgomp.oacc-c-c++-common/serial-dims.c: Likewise.
Julian Brown [Mon, 29 Jul 2019 22:05:35 +0000 (29 15:05 -0700)]
[og9] NVPTX GOMP_OFFLOAD_openacc_async_construct arg fix and gomp_print_* support
libgomp/
* config/nvptx/gomp_print.c (gomp_print_string, gomp_print_integer,
gomp_print_double): New.
* plugin/plugin-nvptx.c (GOMP_OFFLOAD_openacc_async_construct): Add
dummy device parameter.
Julian Brown [Fri, 26 Jul 2019 20:51:48 +0000 (26 13:51 -0700)]
[og9] Make OpenACC function-parameter explosion optional
* configure.ac (amdgcn*-*-*): Add target-libffi to noconfigdirs for AMD
GCN.
* configure: Regenerated.
gcc/
* builtin-types.def (BT_FN_VOID_INT_INT_OMPFN_SIZE_PTR_PTR_PTR_VAR):
Remove.
* config/i386/i386.c (ix86_goacc_explode_args): New.
(TARGET_GOACC_EXPLODE_ARGS): Define, using above function.
* doc/tm.texi: Regenerated.
* doc/tm.texi.in: Add TARGET_GOACC_EXPLODE_ARGS hook.
* fortran/types.def (BT_FN_VOID_INT_INT_OMPFN_SIZE_PTR_PTR_PTR_VAR):
Remove.
* omp-builtins.def (GOACC_parallel_keyed_v2): Remove.
* omp-expand.c (expand_omp_target): Use explode_args target hook.
Use GOMP_LAUNCH_ARGS_EXPLODED launch tag.
* omp-low.c (build_receiver_ref, build_sender_ref,
create_omp_child_function, scan_omp_target, lower_omp_target): Use
explode_args target hook.
* target.def (explode_args): New target hook.
* tree-ssa-structalias.c (target.h): Include.
(find_func_aliases_for_builtin_call): Conditionalise disabling of pass
for OpenACC parallel regions based on explode_args target hook. Remove
'params' from BUILT_IN_GOACC_PARALLEL arguments.
(find_func_clobbers): Likewise.
(ipa_pta_execute): Update for removed 'params' argument.
include/
* gomp-constants.h (GOMP_LAUNCH_ARGS_EXPLODED): Define.
libgomp/
* libgomp.map (GOMP_2.0.GOMP_4_BRANCH): Remove GOACC_parallel_keyed_v2.
* libgomp_g.h (GOACC_parallel_keyed_v2): Remove prototype.
* oacc-parallel.c (GOACC_parallel_keyed_internal): Rename to...
(GOACC_parallel_keyed): ...this. Handle GOMP_LAUNCH_ARGS_EXPLODED
launch tag. Remove previous wrapper functions.
(GOACC_parallel_keyed_v2): Remove.
Julian Brown [Fri, 12 Jul 2019 21:40:34 +0000 (12 14:40 -0700)]
[og9] AMD GCN offloading support
gcc/
* config.gcc (amdgcn-*-*): Add default option for gfx906.
* config/gcn/mkoffload.c: New.
* config/gcn/offload.h: New.
libgcc/
* Makefile.in: Allow disabling of emutls.
* config/gcn/gomp_print.c: New.
* config/gcn/reduction.c: New.
* config/gcn/t-amdgcn (LIB2ADD): Add gomp_print.c and reduction.c.
Disable emutls.c.
* config/gcn/t-gcn-hsa: New.
libgomp/
* Makefile.am (libgomp_la_SOURCES): Add gomp_print.c.
* Makefile.in: Regenerate.
* affinity-fmt.c: Rename calls to gomp_write_string from
gomp_print_string.
* config.h.in (PLUGIN_GCN): Add #undef.
* config/nvptx/libgomp-plugin.c: Rename to...
* config/accel/libgomp-plugin.c: ...this.
* config/nvptx/lock.c: Rename to...
* config/accel/lock.c: ...this.
* config/nvptx/mutex.c: Rename to...
* config/accel/mutex.c: ...this.
* config/nvptx/mutex.h: Rename to...
* config/accel/mutex.h: ...this.
* config/nvptx/oacc-async.c: Rename to...
* config/accel/oacc-async.c: ...this.
* config/nvptx/oacc-cuda.c: Rename to...
* config/accel/oacc-cuda.c: ...this.
* config/nvptx/oacc-host.c: Rename to...
* config/accel/oacc-host.c: ...this.
* config/nvptx/oacc-init.c: Rename to...
* config/accel/oacc-init.c: ...this.
* config/nvptx/oacc-mem.c: Rename to...
* config/accel/oacc-mem.c: ...this.
* config/nvptx/oacc-plugin.c: Rename to...
* config/accel/oacc-plugin.c: ...this.
* config/nvptx/omp-lock.h: Rename to...
* config/accel/omp-lock.h: ...this.
* config/nvptx/openacc.f90: Rename to...
* config/accel/openacc.f90: ...this. Add acc_device_hsa and
acc_device_gcn.
* config/nvptx/pool.h: Rename to...
* config/accel/pool.h: ...this.
* config/nvptx/proc.c: Rename to...
* config/accel/proc.c: ...this. Add omp_get_num_procs alias.
* config/nvptx/ptrlock.c: Rename to...
* config/accel/ptrlock.c: ...this.
* config/nvptx/ptrlock.h: Rename to...
* config/accel/ptrlock.h: ...this.
* config/nvptx/sem.c: Rename to...
* config/accel/sem.c: ...this.
* config/nvptx/sem.h: Rename to...
* config/accel/sem.h: ...this.
* config/nvptx/thread-stacksize.h: Rename to...
* config/accel/thread-stacksize.h: ...this.
* config/gcn/affinity-fmt.c: New.
* config/gcn/bar.c: New.
* config/gcn/bar.h: New.
* config/gcn/doacross.h: New.
* config/gcn/gomp_print.c: New.
* config/gcn/icv-device.c: New.
* config/gcn/simple-bar.h: New.
* config/gcn/target.c: New.
* config/gcn/task.c: New.
* config/gcn/team.c: New.
* config/gcn/time.c: New.
* config/linux/gomp_print.c: New.
* configure.ac (amdgcn*-*-*): Disable pthreads.
* configure: Regenerated.
* configure.tgt (nvptx*-*-*): Add 'accel' config_path.
(amdgcn*-*-*): Set config_path.
* fortran.c (omp_display_affinity_): Rename calls to gomp_write_string
from gomp_print_string.
* libgomp-plugin.h (enum offload_target_type): Add
OFFLOAD_TARGET_TYPE_GCN.
(GOMP_OFFLOAD_openacc_async_construct): Change parameter type to int.
* libgomp.h (gcn_thrs, set_gcn_thrs, gomp_thread): Add for __AMDGCN__.
(gomp_print_string): Rename to...
(gomp_write_string): ...this.
* libgomp.map (GOMP_4.5): Add gomp_print_string, gomp_print_integer,
gomp_print_double.
* oacc-async.c (lookup_goacc_asyncqueue): Pass target_id to async queue
construct function.
* oacc-host.c (host_openacc_async_construct): Add dummy device
parameter.
* oacc-init.c (name_of_acc_device_t): Add acc_device_gcn.
* oacc-int.h (goacc_thread): Add dummy implementation for __AMDGCN__.
* oacc-parallel.c (GOACC_enter_exit_data): Support acc_async_noval and
zero-length array sections.
* omp.h.in (gomp_print_string, gomp_print_integer, gomp_print_double):
Add prototypes.
* omp_lib.f90.in (gomp_print_string, gomp_print_integer,
gomp_print_double): Add interfaces.
* openacc.f90 (openacc_kinds): Add acc_device_gcn. Bump
acc_device_current code.
* openacc.h (acc_device_t): Add acc_device_gcn, bump acc_device_current
code.
* openacc_lib.h (acc_device_hsa, acc_device_gcn): Add.
* plugin/Makefrag.am (PLUGIN_GCN): Support building GCN plugin.
* plugin/configfrag.am (PLUGIN_GCN, PLUGIN_GCN_CPPFLAGS,
PLUGIN_GCN_LDFLAGS, PLUGIN_GCN_LIBS): Add. Add support for GCN plugin.
* plugin/plugin-gcn.c: New.
* target.c (stdio.h): Include unconditionally.
(gomp_copy_host2dev): Add function comment.
(copy_host2dev_immediate): New function.
(gomp_map_pointer, gomp_map_vars_internal): Use
copy_host2dev_immediate where appropriate.
(offload_target_to_plugin_name): Support gcn.
* team.c (gomp_free_pool_helper): Support gcn.
* testsuite/Makefile.in: Regenerated.
* testsuite/lib/libgomp.exp
(check_effective_target_openacc_amdgcn_accel_present): New.
(check_effective_target_openacc_amdgcn_accel_selected): New.
* testsuite/libgomp.c/c.exp (generate_tests, test_lists,
generated_tests): New.
(tests): Add generated tests.
* testsuite/libgomp.c/for-1.h: New.
* testsuite/libgomp.c/for-2.h: New.
* testsuite/libgomp.c/for-3.h: New.
* testsuite/libgomp.c/for-3.list: New.
* testsuite/libgomp.c/for-5.c: New.
* testsuite/libgomp.c/for-5.list: New.
* testsuite/libgomp.c/for-6.c: New.
* testsuite/libgomp.c/for-6.list: New.
* testsuite/libgomp.c/target-print-1.c: New.
* testsuite/libgomp.fortran/target-print-1.f90: New.
* testsuite/libgomp.oacc-c++/c++.exp (amdgcn*): Add support for AMD GCN.
* testsuite/libgomp.oacc-c-c++-common/atomic_capture-2.c: Adjust for
portability.
* testsuite/libgomp.oacc-c-c++-common/loop-auto-1.c: Skip unsuitable
test for AMD GCN.
* testsuite/libgomp.oacc-c-c++-common/loop-gwv-1.c: Adjust for
portability.
* testsuite/libgomp.oacc-c-c++-common/loop-v-1.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/loop-w-1.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/loop-wv-1.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/loop-red-gwv-1.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/loop-red-v-1.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/loop-red-v-2.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/loop-red-w-1.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/loop-red-w-2.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/loop-red-wv-1.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/routine-gwv-1.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/routine-v-1.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/routine-w-1.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/routine-wv-1.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/routine-wv-2.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/parallel-dims.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/serial-dims.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/private-variables-2.c: New.
* testsuite/libgomp.oacc-c-c++-common/tile-1.c: Skip for AMD GCN.
* testsuite/libgomp.oacc-c/c.exp (amdgcn*): Add support for AMD GCN.
* testsuite/libgomp.oacc-c/offload-target-1.c: Add AMD GCN support.
* testsuite/libgomp.oacc-c/print-1.c: New.
* testsuite/libgomp.oacc-fortran/fortran.exp (amdgcn*): Add AMD GCN
support.
* testsuite/libgomp.oacc-fortran/atomic_capture-1.f90: Adjust for
portability.
* testsuite/libgomp.oacc-fortran/collapse-1.f90: Likewise.
* testsuite/libgomp.oacc-fortran/collapse-2.f90: Likewise.
* testsuite/libgomp.oacc-fortran/error_stop-1.f: Support AMD GCN.
* testsuite/libgomp.oacc-fortran/error_stop-2.f: Support AMD GCN.
* testsuite/libgomp.oacc-fortran/error_stop-3.f: Support AMD GCN.
* testsuite/libgomp.oacc-fortran/print-1.f90: New.
Julian Brown [Tue, 23 Jul 2019 17:20:23 +0000 (23 10:20 -0700)]
[og9] Enable full GFortran library for AMD GCN
2019-06-25 Kwok Cheung Yeung <kcy@codesourcery.com>
Andrew Stubbs <ams@codesourcery.com>
Backport from mainline:
libgfortran/
* configure: Regenerate.
* configure.ac (LIBGFOR_MINIMAL): Do not use on AMD GCN.
Julian Brown [Tue, 23 Jul 2019 18:03:13 +0000 (23 11:03 -0700)]
[og9] Stub implementation of unwinding for AMD GCN
2019-06-25 Andrew Stubbs <ams@codesourcery.com>
Backport from mainline:
libgcc/
* config/gcn/t-amdgcn (LIB2ADD): Add unwind-gcn.c.
* config/gcn/unwind-gcn.c: New file.
Julian Brown [Tue, 23 Jul 2019 18:00:51 +0000 (23 11:00 -0700)]
[og9] Create GCN-specific gthreads
2019-06-25 Kwok Cheung Yeung <kcy@codesourcery.com>
Andrew Stubbs <ams@codesourcery.com>
Backport from mainline:
gcc/
* config.gcc (thread_file): Set to gcn for AMD GCN.
* config/gcn/gcn.c (gcn_emutls_var_init): New function.
(TARGET_EMUTLS_VAR_INIT): New hook.
config/
* gthr.m4 (GCC_AC_THREAD_HEADER): Add case for gcn.
libgcc/
* configure: Regenerate.
* config/gcn/gthr-gcn.h: New.
Julian Brown [Tue, 23 Jul 2019 16:39:22 +0000 (23 09:39 -0700)]
[og9] Add support for constructors and destructors on GCN
2019-05-22 Kwok Cheung Yeung <kcy@codesourcery.com>
Andrew Stubbs <ams@codesourcery.com>
Backport from mainline:
* config.gcc (gcc_cv_initfini_array): Set for AMD GCN.
* config/gcn/gcn-run.c (init_array_kernel, fini_array_kernel): New.
(kernel): Rename to...
(main_kernel): ... this.
(load_image): Load _init_array and _fini_array kernels.
(run): Add argument for kernel to run.
(main): Run init_array_kernel before main_kernel, and
fini_array_kernel after.
* config/gcn/gcn.c (gcn_handle_amdgpu_hsa_kernel_attribute): Allow
amdgpu_hsa_kernel attribute on functions.
(gcn_disable_constructors): Delete.
(TARGET_ASM_CONSTRUCTOR, TARGET_ASM_DESTRUCTOR): Delete.
* config/gcn/crt0.c (size_t): Define.
(_init_array, _fini_array): New.
(__preinit_array_start, __preinit_array_end,
__init_array_start, __init_array_end,
__fini_array_start, __fini_array_end): Declare weak references.
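For context, the source-level feature that this entry enables can be illustrated with the GCC constructor and destructor function attributes. According to the entry, on GCN the gcn-run launcher runs an init_array kernel before the main kernel and a fini_array kernel after it; on a hosted system the C runtime plays that role. The sketch below is generic C for GCC-compatible compilers, not GCN-specific code, and the function names are made up for illustration.

```c
#include <stdio.h>

static int initialized;

/* GCC extension: registered in .init_array and run before main.  */
__attribute__((constructor))
static void on_load(void)
{
  initialized = 1;
}

/* GCC extension: registered in .fini_array and run after main returns
   or exit is called.  */
__attribute__((destructor))
static void on_unload(void)
{
  puts("destructor ran");
}

/* Returns 1 once the constructor above has executed.  */
int constructor_ran(void)
{
  return initialized;
}
```

Any code reached from main can rely on `constructor_ran()` returning 1, since the constructor has completed by the time main starts.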
Kwok Cheung Yeung [Fri, 21 Jun 2019 17:40:38 +0000 (21 10:40 -0700)]
Add changes to profiling interface from OG8 branch
This bundles up the parts of the profiling code from the OG8 branch that were
not included in the upstream patch.
libgomp/
* testsuite/libgomp.oacc-c-c++-common/acc_prof-kernels-1.c: Update.
libgomp/
* oacc-init.c (get_property_any): Add profiling code.
libgomp/
* Makefile.am (libgomp_la_SOURCES): Add
oacc-profiling-acc_register_library.c.
* Makefile.in: Regenerate.
* libgomp.texi: Remove paragraph about acc_register_library.
* oacc-parallel.c (GOACC_parallel_keyed_internal): Set device_api for
profiling.
* oacc-profiling-acc_register_library.c: New file.
* oacc-profiling.c (goacc_profiling_initialize): Call
acc_register_library. Avoid duplicate registration.
(acc_register_library): Remove.
* config/nvptx/oacc-profiling-acc_register_library.c:
New empty file.
* config/nvptx/oacc-profiling.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/acc_prof-dispatch-1.c: Remove
call to acc_register_library.
* testsuite/libgomp.oacc-c-c++-common/acc_prof-init-1.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/acc_prof-kernels-1.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/acc_prof-parallel-1.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/acc_prof-valid_bytes-1.c:
Likewise.
* testsuite/libgomp.oacc-c-c++-common/acc_prof-version-1.c: Likewise.
tschwinge [Fri, 17 May 2019 19:13:36 +0000 (17 19:13 +0000)]
OpenACC Profiling Interface (incomplete)
libgomp/
* acc_prof.h: New file.
* oacc-profiling.c: Likewise.
* Makefile.am (nodist_libsubinclude_HEADERS, libgomp_la_SOURCES):
Add these, respectively.
* Makefile.in: Regenerate.
* env.c (initialize_env): Call goacc_profiling_initialize.
* oacc-plugin.c (GOMP_PLUGIN_goacc_thread)
(GOMP_PLUGIN_goacc_profiling_dispatch): New functions.
* oacc-plugin.h (GOMP_PLUGIN_goacc_thread)
(GOMP_PLUGIN_goacc_profiling_dispatch): Declare.
* libgomp.map (OACC_2.5.1): Add acc_prof_lookup,
acc_prof_register, acc_prof_unregister, and acc_register_library.
(GOMP_PLUGIN_1.3): Add GOMP_PLUGIN_goacc_profiling_dispatch, and
GOMP_PLUGIN_goacc_thread.
* oacc-int.h (struct goacc_thread): Add prof_info, api_info,
prof_callbacks_enabled members.
(goacc_prof_enabled, goacc_profiling_initialize)
(_goacc_profiling_dispatch_p, _goacc_profiling_setup_p)
(goacc_profiling_dispatch): Declare.
(GOACC_PROF_ENABLED, GOACC_PROFILING_DISPATCH_P)
(GOACC_PROFILING_SETUP_P): Define.
* oacc-async.c (acc_async_test, acc_async_test_all, acc_wait)
(acc_wait_async, acc_wait_all, acc_wait_all_async): Update for
OpenACC Profiling Interface.
* oacc-cuda.c (acc_get_current_cuda_device)
(acc_get_current_cuda_context, acc_get_cuda_stream)
(acc_set_cuda_stream): Likewise.
* oacc-init.c (acc_init_1, goacc_attach_host_thread_to_device)
(acc_init, acc_set_device_type, acc_get_device_type)
(acc_get_device_num, goacc_lazy_initialize): Likewise.
* oacc-mem.c (acc_malloc, acc_free, memcpy_tofrom_device)
(acc_deviceptr, acc_hostptr, acc_is_present, acc_map_data)
(acc_unmap_data, present_create_copy, delete_copyout)
(update_dev_host): Likewise.
* oacc-parallel.c (GOACC_parallel_keyed, GOACC_data_start)
(GOACC_data_end, GOACC_enter_exit_data, GOACC_update, GOACC_wait):
Likewise.
* plugin/plugin-nvptx.c (nvptx_exec, nvptx_alloc, nvptx_free)
(GOMP_OFFLOAD_openacc_exec, GOMP_OFFLOAD_openacc_async_exec):
Likewise.
* libgomp.texi: Update.
* testsuite/libgomp.oacc-c-c++-common/acc_prof-dispatch-1.c: New
file.
* testsuite/libgomp.oacc-c-c++-common/acc_prof-init-1.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/acc_prof-kernels-1.c:
Likewise.
* testsuite/libgomp.oacc-c-c++-common/acc_prof-parallel-1.c:
Likewise.
* testsuite/libgomp.oacc-c-c++-common/acc_prof-valid_bytes-1.c:
Likewise.
* testsuite/libgomp.oacc-c-c++-common/acc_prof-version-1.c:
Likewise.
Chung-Lin Tang [Tue, 16 Jul 2019 14:56:41 +0000 (16 07:56 -0700)]
Commit of https://gcc.gnu.org/ml/gcc-patches/2019-07/msg00336.html
2019-07-04 Chung-Lin Tang <cltang@codesourcery.com>
libatomic/
PR other/79543
* acinclude.m4 (LIBAT_CHECK_LINKER_FEATURES): Fix GNU ld --version
scanning to conform to the GNU Coding Standards.
* configure: Regenerate.
libffi/
PR other/79543
* acinclude.m4 (LIBAT_CHECK_LINKER_FEATURES): Fix GNU ld --version
scanning to conform to the GNU Coding Standards.
* configure: Regenerate.
libgomp/
PR other/79543
* acinclude.m4 (LIBGOMP_CHECK_LINKER_FEATURES): Fix GNU ld --version
scanning to conform to the GNU Coding Standards.
* configure: Regenerate.
libitm/
PR other/79543
* acinclude.m4 (LIBITM_CHECK_LINKER_FEATURES): Fix GNU ld --version
scanning to conform to the GNU Coding Standards.
* configure: Regenerate.
libstdc++-v3/
PR other/79543
* acinclude.m4 (GLIBCXX_CHECK_LINKER_FEATURES): Fix GNU ld --version
scanning to conform to the GNU Coding Standards.
* configure: Regenerate.
Cesar Philippidis [Sun, 7 Jul 2019 18:25:51 +0000 (7 11:25 -0700)]
Allow the accelerator to have more offloaded functions than the host
libgomp/
* target.c (gomp_load_image_to_device): Allow the accelerator to
possess more offloaded functions than the host.
Julian Brown [Fri, 5 Jul 2019 01:14:41 +0000 (4 18:14 -0700)]
Assumed-size arrays with non-lexical data mappings
gcc/
* gimplify.c (gimplify_adjust_omp_clauses_1): Raise error for
assumed-size arrays in map clauses for Fortran/OpenMP.
* omp-low.c (lower_omp_target): Set the size of assumed-size Fortran
arrays to one to allow use of data already mapped on the offload device.
gcc/fortran/
* trans-openmp.c (gfc_omp_finish_clause): Change clauses mapping
assumed-size arrays to use the GOMP_MAP_FORCE_PRESENT map type.
Julian Brown [Wed, 20 Feb 2019 13:21:15 +0000 (20 05:21 -0800)]
Support Fortran 2003 class pointers in OpenACC
gcc/
* gimplify.c (insert_struct_comp_map): Handle GOMP_MAP_ATTACH_DETACH.
(gimplify_scan_omp_clauses): Separate out handling of OACC_ENTER_DATA
and OACC_EXIT_DATA. Remove GOMP_MAP_POINTER and GOMP_MAP_TO_PSET
mappings, apart from those following GOMP_MAP_DECLARE_{,DE}ALLOCATE.
Handle GOMP_MAP_ATTACH_DETACH.
* tree-pretty-print.c (dump_omp_clause): Support GOMP_MAP_ATTACH_DETACH.
Print "bias" not "len" for attach/detach clause types.
include/
* gomp-constants.h (gomp_map_kind): Add GOMP_MAP_ATTACH_DETACH.
gcc/c/
* c-typeck.c (handle_omp_array_sections): Use GOMP_MAP_ATTACH_DETACH
for OpenACC attach/detach operations.
gcc/cp/
* semantics.c (handle_omp_array_sections): Likewise.
(finish_omp_clauses): Handle GOMP_MAP_ATTACH_DETACH.
gcc/fortran/
* openmp.c (resolve_oacc_data_clauses): Allow polymorphic allocatable
variables.
* trans-expr.c (gfc_conv_component_ref,
conv_parent_component_references): Make global.
(gfc_auto_dereference_var): New function, broken out of...
(gfc_conv_variable): ...here. Call outlined function instead.
* trans-openmp.c (gfc_trans_omp_array_section): New function, broken out
of...
(gfc_trans_omp_clauses): ...here. Separate out OpenACC derived
type/polymorphic class pointer handling. Call above outlined function.
* trans.h (gfc_conv_component_ref, conv_parent_component_references,
gfc_auto_dereference_var): Add prototypes.
gcc/testsuite/
* c-c++-common/goacc/mdc-1.c: Update clause matching patterns.
libgomp/
* oacc-parallel.c (GOACC_enter_exit_data): Fix optional arguments for
changes to clause stripping in enter data/exit data directives.
* testsuite/libgomp.oacc-fortran/class-ptr-param.f95: New test.
* testsuite/libgomp.oacc-fortran/classtypes-1.f95: New test.
* testsuite/libgomp.oacc-fortran/classtypes-2.f95: New test.
* testsuite/libgomp.oacc-fortran/derivedtype-1.f95: New test.
* testsuite/libgomp.oacc-fortran/derivedtype-2.f95: New test.
* testsuite/libgomp.oacc-fortran/multidim-slice.f95: New test.
jakub [Mon, 8 Jul 2019 22:08:27 +0000 (8 22:08 +0000)]
Fix ICE in cp_omp_mappable_type_1
2019-07-09 Andrew Stubbs <ams@codesourcery.com>
Backport from mainline
2019-07-08 Jakub Jelinek <jakub@redhat.com>
PR c++/91110
* decl2.c (cp_omp_mappable_type_1): Don't emit any note for
error_mark_node type.
* g++.dg/gomp/pr91110.C: New test.
ams [Thu, 4 Jul 2019 11:43:47 +0000 (4 11:43 +0000)]
Improve OpenMP map diagnostics.
2019-07-04 Andrew Stubbs <ams@codesourcery.com>
Backport from mainline:
2019-07-04 Andrew Stubbs <ams@codesourcery.com>
gcc/cp/
* cp-tree.h (cp_omp_emit_unmappable_type_notes): New prototype.
* decl.c (cp_finish_decl): Call cp_omp_emit_unmappable_type_notes.
* decl2.c (cp_omp_mappable_type): Move contents to ...
(cp_omp_mappable_type_1): ... here and add note output.
(cp_omp_emit_unmappable_type_notes): New function.
* semantics.c (finish_omp_clauses): Call
cp_omp_emit_unmappable_type_notes in four places.
gcc/testsuite/
* g++.dg/gomp/unmappable-1.C: New file.
Kwok Cheung Yeung [Tue, 11 Jun 2019 17:24:44 +0000 (11 10:24 -0700)]
Merge commit 'gcc-9_1_0-release^' into openacc-gcc-9-branch
Merge changes from between the 'gcc-9-branch' branch point and the
'gcc-9_1_0-release' tag.
Julian Brown [Tue, 28 May 2019 15:42:10 +0000 (28 08:42 -0700)]
Apply gangprivate attribute to innermost decl
...and fix parallelism-level calculation when applying the attribute.
gcc/
* omp-low.c (mark_oacc_gangprivate): Add CTX parameter. Use to look up
correct decl to add attribute to.
(lower_omp_for): Move "oacc gangprivate" processing from here...
(process_oacc_gangprivate_1): ...to here. New function.
(lower_omp_target): Update call to mark_oacc_gangprivate.
(execute_lower_omp): Call process_oacc_gangprivate_1 for each OMP
context.
libgomp/
* testsuite/libgomp.oacc-fortran/gangprivate-attrib-2.f90: New test.
Kwok Cheung Yeung [Fri, 31 May 2019 19:25:03 +0000 (31 12:25 -0700)]
Fix expected messages in goacc tests
The expected messages in the OpenACC kernel-related tests should be prefixed
with 'optimized:' rather than 'note:'.
2019-05-31 Kwok Cheung Yeung <kcy@codesourcery.com>
gcc/testsuite/
* c-c++-common/goacc/kernels-decompose-1.c: Change 'note:' to
'optimized:'. Fix typo.
* c-c++-common/goacc/note-parallelism-1-kernels-conditional-loop-independent_seq.c:
Change 'note:' to 'optimized:'.
* c-c++-common/goacc/note-parallelism-1-kernels-loop-auto.c: Likewise.
* c-c++-common/goacc/note-parallelism-1-kernels-loop-independent_seq.c:
Likewise.
* c-c++-common/goacc/note-parallelism-1-kernels-loops.c: Likewise.
* c-c++-common/goacc/note-parallelism-1-kernels-straight-line.c:
Likewise.
* c-c++-common/goacc/note-parallelism-combined-kernels-loop-auto.c:
Likewise.
* c-c++-common/goacc/note-parallelism-combined-kernels-loop-independent_seq.c:
Likewise.
* c-c++-common/goacc/note-parallelism-kernels-conditional-loop-independent_seq.c:
Likewise.
* c-c++-common/goacc/note-parallelism-kernels-loop-auto.c: Likewise.
* c-c++-common/goacc/note-parallelism-kernels-loop-independent_seq.c:
Likewise.
* c-c++-common/goacc/note-parallelism-kernels-loops.c: Likewise.
Andrew Jenner [Wed, 30 Jan 2019 17:38:46 +0000 (30 09:38 -0800)]
Link libquadmath in Fortran libgomp tests
When invoking gcc to compile Fortran code, fortran.exp currently adds the
options -lgfortran -foffload=-lgfortran to the gcc command line. libgfortran
statically links to libquadmath, and the gfortran driver invokes the linker
with -lquadmath as well as -lgfortran, so fortran.exp should do so too.
libgomp/
* testsuite/libgomp.fortran/fortran.exp (lang_link_flags): Add
-lquadmath.
* testsuite/libgomp.oacc-fortran/fortran.exp (lang_link_flags): Add
-lquadmath.
Kwok Cheung Yeung [Thu, 30 May 2019 18:58:05 +0000 (30 11:58 -0700)]
Fix missing gstdint.h error
libgomp/
* libgomp_g.h: Include stdint.h instead of gstdint.h.
Kwok Cheung Yeung [Thu, 30 May 2019 18:57:00 +0000 (30 11:57 -0700)]
Fix for firstprivate-int.f90 test failures
Do not propagate the range when converting from a reference to an integral
type.
gcc/
* tree-vrp.c (extract_range_from_unary_expr): Set a varying range
when a reference is converted to an integral type.
Julian Brown [Tue, 21 May 2019 00:27:38 +0000 (20 17:27 -0700)]
Fix lexically-nested data mappings for no_alloc or optional arguments
gcc/
* gimplify.c (gimplify_adjust_omp_clauses_1): Support implied no_alloc
and optional arguments based on mappings in enclosing data regions.
Julian Brown [Mon, 20 May 2019 23:31:41 +0000 (20 16:31 -0700)]
Fix warning syntax and typos in two libgomp tests
libgomp/
* testsuite/libgomp.oacc-c-c++-common/kernels-decompose-1.c: Expect
"optimized:" not "note:" in warnings.
* testsuite/libgomp.oacc-c-c++-common/serial-dims.c: Fix typos in
warnings.
Julian Brown [Sun, 19 May 2019 17:42:20 +0000 (19 10:42 -0700)]
Fix references declared in lexically-enclosing OpenACC data region
gcc/fortran/
* trans-openmp.c (gfc_omp_finish_clause): Guard addition of clauses for
pointers with DECL_P.
gcc/
* gimplify.c (oacc_array_mapping_info): Add REF field.
(gimplify_scan_omp_clauses): Initialise above field for data blocks
passed by reference.
(gomp_oacc_needs_data_present): Handle references.
(gimplify_adjust_omp_clauses_1): Handle references and optional
arguments for variables declared in lexically-enclosing OpenACC data
region.
Julian Brown [Thu, 16 May 2019 12:47:16 +0000 (16 05:47 -0700)]
Add kernels for-index reuse testcase.
libgomp/
* testsuite/libgomp.oacc-c-c++-common/kernels-for-index-reuse-1.c: New
test.
Julian Brown [Thu, 16 May 2019 12:46:34 +0000 (16 05:46 -0700)]
Initialise KEY and OFFSET fields when if_present test fails.
libgomp/
* target.c (gomp_map_vars_async): Initialise KEY and OFFSET fields in
not-present case.
Julian Brown [Thu, 16 May 2019 12:45:35 +0000 (16 05:45 -0700)]
Avoid introducing 'create' mapping clauses for loop index variables in kernels regions
gcc/
* omp-oacc-kernels.c (find_omp_for_index_vars_1,
find_omp_for_index_vars): New functions.
(maybe_build_inner_data_region): Add IDX_VARS argument. Don't add
CREATE mapping clauses for loop index variables. Set TREE_ADDRESSABLE
flag on newly-mapped declarations as a side effect.
(decompose_kernels_region_body): Call find_omp_for_index_vars. Don't
create PRESENT clause for loop index variables. Pass index variable
set to maybe_build_inner_data_region.
Julian Brown [Wed, 9 Jan 2019 11:41:04 +0000 (9 03:41 -0800)]
Update OpenACC version to 2.6
gcc/c-family/
* c-cppbuiltin.c (c_cpp_builtins): Update _OPENACC define to 201711.
gcc/doc/
* invoke.texi: Update mention of OpenACC version to 2.6.
gcc/fortran/
* cpp.c (cpp_define_builtins): Update _OPENACC define to 201711.
* gfortran.texi: Update mentions of OpenACC version to 2.6.
* intrinsic.texi: Likewise.
gcc/testsuite/
* c-c++-common/cpp/openacc-define-3.c: Update expected value for
_OPENACC define.
* gfortran.dg/openacc-define-3.f90: Likewise.
libgomp/
* libgomp.texi: Update mentions of OpenACC version to 2.6. Update
section numbers to match version 2.6 of the spec.
* openacc.f90 (openacc_version): Update to 201711.
* openacc_lib.h (openacc_version): Update to 201711.
* testsuite/libgomp.oacc-fortran/openacc_version-1.f: Update expected
openacc_version to 201711.
* testsuite/libgomp.oacc-fortran/openacc_version-2.f90: Likewise.
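User code can observe the version bump in the entry above through the `_OPENACC` preprocessor macro, which the entry sets to 201711. The following is a small sketch of such a check; the helper names are hypothetical, and the macro is only defined when the compiler is run with OpenACC enabled (for GCC, with -fopenacc).

```c
/* Returns the value of _OPENACC, or 0 when OpenACC is not enabled.  */
long openacc_macro_value(void)
{
#ifdef _OPENACC
  return _OPENACC;
#else
  return 0L;
#endif
}

/* Returns 1 if the compiler advertises at least OpenACC 2.6 (201711).  */
int has_openacc_2_6(void)
{
  return openacc_macro_value() >= 201711L;
}
```

Without OpenACC enabled, `openacc_macro_value()` returns 0 and `has_openacc_2_6()` returns 0, so the check degrades safely on compilers that do not define the macro.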
Kwok Cheung Yeung [Fri, 3 May 2019 13:14:35 +0000 (3 06:14 -0700)]
Fix ICE when optional arguments are used in OpenACC directives
2019-05-03 Kwok Cheung Yeung <kcy@codesourcery.com>
gcc/
* gimplify.c (gomp_oacc_needs_data_present): Return NULL if decl is a
Fortran optional argument.
Thomas Schwinge [Tue, 8 Jan 2019 14:21:35 +0000 (8 15:21 +0100)]
Add OpenACC 2.6 `acc_get_property' support: restore Intel MIC offloading
The "OpenACC 2.6 `acc_get_property' support" changes regressed the relevant
libgomp OpenMP execution test cases to no longer consider Intel MIC offloading
because of:
libgomp: while loading libgomp-plugin-intelmic.so.1: [...]/libgomp-plugin-intelmic.so.1: undefined symbol: GOMP_OFFLOAD_get_property
liboffloadmic/
* plugin/libgomp-plugin-intelmic.cpp (GOMP_OFFLOAD_get_property):
New function.
Thomas Schwinge [Fri, 1 Feb 2019 17:12:05 +0000 (1 18:12 +0100)]
Adjust parallelism of loops in gang-single parts of OpenACC kernels regions: "struct adjust_nested_loop_clauses_wi_info"
The current code apparently is too freaky at least for GCC 4.6:
[...]/gcc/omp-oacc-kernels.c: In function 'tree_node* transform_kernels_loop_clauses(gimple*, tree, tree, tree, tree)':
[...]/gcc/omp-oacc-kernels.c:584:10: error: expected identifier before numeric constant
[...]/gcc/omp-oacc-kernels.c: In lambda function:
[...]/gcc/omp-oacc-kernels.c:584:25: error: expected '{' before '=' token
[...]/gcc/omp-oacc-kernels.c: In function 'tree_node* transform_kernels_loop_clauses(gimple*, tree, tree, tree, tree)':
[...]/gcc/omp-oacc-kernels.c:584:25: warning: lambda expressions only available with -std=c++0x or -std=gnu++0x [enabled by default]
[...]/gcc/omp-oacc-kernels.c:584:28: error: no match for 'operator=' in '{} = & loop_gang_clause'
[...]
gcc/
* omp-oacc-kernels.c (struct adjust_nested_loop_clauses_wi_info): New.
(adjust_nested_loop_clauses, transform_kernels_loop_clauses): Use it.
Thomas Schwinge [Wed, 23 Jan 2019 10:40:08 +0000 (23 02:40 -0800)]
Make new OpenACC kernels conversion the default; adjust and add tests
gcc/c-family/
* c.opt (fopenacc-kernels): Default to "split".
gcc/fortran/
* lang.opt (fopenacc-kernels): Default to "split".
gcc/
* doc/invoke.texi (-fopenacc-kernels): Update.
gcc/testsuite/
* c-c++-common/goacc/note-parallelism-1-kernels-conditional-loop-independent_seq.c:
New file.
* c-c++-common/goacc/note-parallelism-1-kernels-loop-auto.c:
Likewise.
* c-c++-common/goacc/note-parallelism-1-kernels-loop-independent_seq.c:
Likewise.
* c-c++-common/goacc/note-parallelism-1-kernels-loops.c: Likewise.
* c-c++-common/goacc/note-parallelism-1-kernels-straight-line.c:
Likewise.
* c-c++-common/goacc/note-parallelism-combined-kernels-loop-auto.c:
Likewise.
* c-c++-common/goacc/note-parallelism-combined-kernels-loop-independent_seq.c:
Likewise.
* c-c++-common/goacc/note-parallelism-kernels-conditional-loop-independent_seq.c:
Likewise.
* c-c++-common/goacc/note-parallelism-kernels-loop-auto.c:
Likewise.
* c-c++-common/goacc/note-parallelism-kernels-loop-independent_seq.c:
Likewise.
* c-c++-common/goacc/note-parallelism-kernels-loops.c: Likewise.
* c-c++-common/goacc/classify-kernels-unparallelized.c: Update.
* c-c++-common/goacc/classify-kernels.c: Likewise.
* c-c++-common/goacc/classify-parallel.c: Likewise.
* c-c++-common/goacc/classify-routine.c: Likewise.
* c-c++-common/goacc/dtype-1.c: Likewise.
* c-c++-common/goacc/if-clause-2.c: Likewise.
* c-c++-common/goacc/kernels-conversion.c: Likewise.
* c-c++-common/goacc/kernels-decompose-1.c: Likewise.
* c-c++-common/goacc/loop-2-kernels.c: Likewise.
* c-c++-common/goacc/note-parallelism.c: Likewise.
* c-c++-common/goacc/routine-1.c: Likewise.
* c-c++-common/goacc/uninit-dim-clause.c: Likewise.
* gfortran.dg/goacc/dtype-1.f95: Likewise.
* gfortran.dg/goacc/kernels-conversion.f95: Likewise.
* gfortran.dg/goacc/kernels-decompose-1.f95: Likewise.
* gfortran.dg/goacc/kernels-tree.f95: Likewise.
libgomp/
* testsuite/libgomp.oacc-c-c++-common/acc_prof-kernels-1.c:
Update.
* testsuite/libgomp.oacc-c-c++-common/avoid-offloading-1.c:
Likewise.
* testsuite/libgomp.oacc-c-c++-common/avoid-offloading-2.c:
Likewise.
* testsuite/libgomp.oacc-c-c++-common/avoid-offloading-3.c:
Likewise.
* testsuite/libgomp.oacc-c-c++-common/kernels-decompose-1.c:
Likewise.
* testsuite/libgomp.oacc-fortran/avoid-offloading-1.f: Likewise.
* testsuite/libgomp.oacc-fortran/avoid-offloading-2.f: Likewise.
* testsuite/libgomp.oacc-fortran/avoid-offloading-3.f: Likewise.
* testsuite/libgomp.oacc-fortran/initialize_kernels_loops.f90:
Likewise.
Thomas Schwinge [Thu, 24 Jan 2019 16:40:03 +0000 (24 08:40 -0800)]
New OpenACC kernels region decompose algorithm
Previously, OpenACC kernels region bodies were decomposed into a sequence of
alternating gang-single and gang-parallel "parallel" regions. The new
algorithm in this patch introduces a third possibility: Loops that look like
they might benefit from the parloops pass are converted into old "kernels"
regions, exposing them to the parloops pass later on. This has the benefit
that loops that cannot be parallelized are not offloaded to the GPU.
gcc/
* omp-oacc-kernels.c (adjust_region_code_walk_stmt_fn)
(adjust_region_code): New functions.
(make_loops_gang_single): Update.
(make_gang_single_region): Rename to...
(make_region_seq): ... this, and update.
(make_gang_parallel_loop_region): Rename to...
(make_region_loop_nest): ... this, and update.
(is_unconditional_oacc_for_loop): Remove stmt parameter and check.
(decompose_kernels_region_body): Update.
gcc/testsuite/
* c-c++-common/goacc/kernels-conversion.c: Adjust test.
* gfortran.dg/goacc/kernels-conversion.f95: Likewise.
* c-c++-common/goacc/kernels-decompose-1.c: New file.
* gfortran.dg/goacc/kernels-decompose-1.f95: Likewise.
libgomp/
* testsuite/libgomp.oacc-c-c++-common/kernels-decompose-1.c: New
file.
Gergö Barany [Mon, 21 Jan 2019 20:50:14 +0000 (21 12:50 -0800)]
Launch kernels asynchronously in OpenACC kernels regions
Kernels regions are decomposed into one or more smaller regions that are to
be executed in sequence. With this patch, all of these regions are launched
asynchronously, and a wait directive is added after them. This means that
the host only waits once for the kernels to complete, not once per kernel.
If the original kernels region was marked async, that asynchronous behavior
is preserved, and no wait is added.
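At source level, the effect is roughly comparable to the hand-written OpenACC
below. This is only an analogy, not compiler output: the function name
fill_then_double and the use of async queue 1 are assumptions made for this
sketch. Built without -fopenacc, the pragmas are ignored and the loops run in
order on the host.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example: a[i] = 2 * i, computed in two offloaded steps.  */
void
fill_then_double (int *a, size_t n)
{
  assert (a != NULL);

#pragma acc data copyout(a[0:n])
  {
    /* Both regions use the same queue, so they still execute in order,
       but the host does not block after each launch.  */
#pragma acc parallel loop async(1) present(a[0:n])
    for (size_t i = 0; i < n; i++)
      a[i] = (int) i;

#pragma acc parallel loop async(1) present(a[0:n])
    for (size_t i = 0; i < n; i++)
      a[i] *= 2;

    /* One synchronization point for the whole sequence.  */
#pragma acc wait(1)
  }
}
```

The single wait at the end is what replaces the per-region synchronization.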
gcc/
* omp-oacc-kernels.c (add_async_clauses_and_wait): New function...
(decompose_kernels_region_body): ... called from here.
gcc/testsuite/
* c-c++-common/goacc/kernels-conversion.c: Test automatically generated
async clauses.
* gfortran.dg/goacc/kernels-conversion.f95: Likewise.
Gergö Barany [Thu, 24 Jan 2019 06:11:11 +0000 (23 22:11 -0800)]
Adjust parallelism of loops in gang-single parts of OpenACC kernels regions
Loops in gang-single parts of kernels regions cannot be executed in
gang-redundant mode. If the user specified gang clauses on such loops, emit
an error and remove these clauses. Adjust automatic partitioning to exclude
gang partitioning in gang-single regions.
gcc/
* omp-oacc-kernels.c (add_parent_or_loop_num_clause): New function.
(adjust_nested_loop_clauses): Likewise.
(transform_kernels_loop_clauses, make_gang_parallel_loop_region):
Add worker and vector clause parameters, emit error on illegal
nesting.
(visit_loops_in_gang_single_region): Emit warning on conditionally
executed code with a gang clause.
(make_loops_gang_single): New function.
(decompose_kernels_region_body): Separate out gang/worker/vector clauses
for separate handling; add call to make_loops_gang_single.
* omp-offload.c (oacc_loop_auto_partitions): Add and propagate
is_oacc_gang_single parameter.
(oacc_loop_partition): Likewise.
(execute_oacc_device_lower): Adjust call to oacc_loop_partition.
Gergö Barany [Wed, 23 Jan 2019 22:32:57 +0000 (23 14:32 -0800)]
Handle conditional execution of loops in OpenACC kernels regions
Any OpenACC loop controlled by an if statement or a non-OpenACC loop must be
executed in a gang-single region. Detecting such loops is not trivial as
OpenACC kernels expansion is done on GIMPLE but before computation of the
control flow graph. This patch adds an auxiliary analysis for determining
whether a statement is inside a conditionally executed region (relative to
the kernels region's entry).
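The method names find_rep and union_reps in the ChangeLog entry suggest a
union-find (disjoint-set) structure, although the implementation itself is not
shown in this log. Under that assumption, the idea can be sketched as follows.
Statements are numbered, statement 0 stands for the region entry, and a
statement counts as unconditional if it ends up in the same set as the entry.
MAX_STMTS, struct regions, regions_init and is_unconditional are hypothetical
names introduced only for this sketch.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_STMTS 64	/* Hypothetical bound for this sketch.  */

struct regions
{
  size_t rep[MAX_STMTS];	/* rep[i] == i marks a set representative.  */
  size_t count;
};

void
regions_init (struct regions *r, size_t count)
{
  assert (count > 0 && count <= MAX_STMTS);
  r->count = count;
  for (size_t i = 0; i < count; i++)
    r->rep[i] = i;
}

size_t
find_rep (struct regions *r, size_t stmt)
{
  assert (stmt < r->count);
  while (r->rep[stmt] != stmt)
    {
      /* Path halving: point at the grandparent while walking up.  */
      r->rep[stmt] = r->rep[r->rep[stmt]];
      stmt = r->rep[stmt];
    }
  return stmt;
}

void
union_reps (struct regions *r, size_t a, size_t b)
{
  size_t ra = find_rep (r, a);
  size_t rb = find_rep (r, b);

  /* Keep the smaller index as representative, so statement 0 stays the
     representative of the entry's set.  */
  if (ra < rb)
    r->rep[rb] = ra;
  else
    r->rep[ra] = rb;
}

/* Nonzero if STMT is in the same set as the region entry (statement 0).  */
int
is_unconditional (struct regions *r, size_t stmt)
{
  return find_rep (r, stmt) == find_rep (r, 0);
}
```

In such a scheme, statements executed in straight-line order would be merged
with the entry, while the bodies of if statements and non-OpenACC loops would
be kept in separate sets.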
gcc/
* omp-oacc-kernels.c (control_flow_regions): New class.
(control_flow_regions::control_flow_regions): New constructor.
(control_flow_regions::is_unconditional_oacc_for_loop): New method.
(control_flow_regions::find_rep): Likewise.
(control_flow_regions::union_reps): Likewise.
(control_flow_regions::compute_regions): Likewise.
(decompose_kernels_region_body): Use test for conditional execution.
gcc/testsuite/
* c-c++-common/goacc/kernels-conversion.c: Add test for conditionally
executed code.
* gfortran.dg/goacc/kernels-conversion.f95: Likewise.
Gergö Barany [Mon, 21 Jan 2019 15:16:06 +0000 (21 07:16 -0800)]
Turn OpenACC kernels regions into a sequence of parallel regions
This patch decomposes each OpenACC kernels region into a sequence of
parallel regions. Each OpenACC loop nest turns into its own region; any code
between such loop nests is gathered up into a region as well. The loop
regions can be distributed across gangs if the original kernels region had a
num_gangs clause, while the other regions are executed in "gang-single"
mode. The implied default "auto" clause on kernels loops is made explicit
unless there is a conflicting clause.
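As an illustration of the kind of input this targets, the sketch below shows a
kernels region with two OpenACC loop nests and straight-line code between
them. It is not taken from the patch or its test cases; the function name
scale_and_sum is made up, and the comments only restate how the description
above would be expected to split the region. Built without -fopenacc, the
pragmas are ignored and the code runs sequentially on the host.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example: doubles every element of A, then returns the sum
   of the doubled elements plus one per element.  */
long
scale_and_sum (int *a, size_t n)
{
  long sum = 0;
  int bias = 0;

  assert (a != NULL);

#pragma acc kernels copy(a[0:n]) copy(sum)
  {
    /* First loop nest: expected to become its own region.  */
#pragma acc loop
    for (size_t i = 0; i < n; i++)
      a[i] = 2 * a[i];

    /* Code between loop nests: expected to be gathered into a
       "gang-single" region.  */
    bias = 1;

    /* Second loop nest: again its own region.  */
#pragma acc loop reduction(+:sum)
    for (size_t i = 0; i < n; i++)
      sum += a[i] + bias;
  }
  return sum;
}
```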
gcc/
* omp-oacc-kernels.c (top_level_omp_for_in_stmt): New function.
(make_gang_single_region): Likewise.
(transform_kernels_loop_clauses, make_gang_parallel_loop_region):
Likewise.
(flatten_binds): Likewise.
(make_data_region_try_statement): Likewise.
(maybe_build_inner_data_region): Likewise.
(decompose_kernels_region_body): Likewise.
(transform_kernels_region): Delegate to decompose_kernels_region_body
and make_data_region_try_statement.
gcc/testsuite/
* c-c++-common/goacc/kernels-conversion.c: Test for a gang-single
region.
* gfortran.dg/goacc/kernels-conversion.f95: Likewise.
Gergö Barany [Mon, 21 Jan 2019 13:28:20 +0000 (21 05:28 -0800)]
Separate OpenACC kernels regions in data and parallel parts
This is the first in a series of patches that completely rework the handling
of the OpenACC "kernels" directive. In the future, kernels regions will be
transformed into data regions containing a sequence of serial and parallel
offloaded regions. This first patch sets up a new pass that is responsible
for this transformation, and in a first step constructs the new data region
containing a parallel region with the original kernels region's body.
gcc/
* Makefile.in: Add...
* omp-oacc-kernels.c: ... this new file for the kernels conversion
pass.
* flag-types.h (enum openacc_kernels): Add "split" style. Adjust
all users.
* doc/invoke.texi (-fopenacc-kernels): Update.
* passes.def: Add pass_convert_oacc_kernels to pipeline.
* tree-pass.h (make_pass_convert_oacc_kernels): Add declaration.
gcc/testsuite/
* c-c++-common/goacc/kernels-conversion.c: New test.
* gfortran.dg/goacc/kernels-conversion.f95: Likewise.
* c-c++-common/goacc/if-clause-2.c: Update.
* gfortran.dg/goacc/kernels-tree.f95: Likewise.
Thomas Schwinge [Wed, 23 Jan 2019 14:56:52 +0000 (23 06:56 -0800)]
Add OpenACC target kinds for decomposed kernels regions
This patch is in preparation for changes that will cut up OpenACC kernels
regions into individual parts. For the new sub-regions that will be
generated, this adds the following new kinds of OpenACC regions for internal
use:
- GF_OMP_TARGET_KIND_OACC_PARALLEL_KERNELS_PARALLELIZED for parts of kernels
regions to be executed in gang-redundant mode
- GF_OMP_TARGET_KIND_OACC_PARALLEL_KERNELS_GANG_SINGLE for parts of kernels
regions to be executed in gang-single mode
- GF_OMP_TARGET_KIND_OACC_DATA_KERNELS for data regions generated around the
body of a kernels region
gcc/
* gimple.h (enum gf_mask): Add new target kinds
GF_OMP_TARGET_KIND_OACC_PARALLEL_KERNELS_PARALLELIZED,
GF_OMP_TARGET_KIND_OACC_PARALLEL_KERNELS_GANG_SINGLE, and
GF_OMP_TARGET_KIND_OACC_DATA_KERNELS.
(is_gimple_omp_oacc): Handle new target kinds.
(is_gimple_omp_offloaded): Likewise.
* gimple-pretty-print.c (dump_gimple_omp_target): Likewise.
* omp-expand.c (expand_omp_target): Likewise.
(build_omp_regions_1): Likewise.
(omp_make_gimple_edges): Likewise.
* omp-low.c (is_oacc_parallel_or_serial): Likewise.
(was_originally_oacc_kernels): New function.
(scan_omp_for): Update check for illegal nesting.
(check_omp_nesting_restrictions): Handle new target kinds.
(lower_oacc_reductions): Likewise.
(lower_omp_target): Likewise.
* omp-offload.c (execute_oacc_device_lower): Likewise.
Thomas Schwinge [Wed, 30 Jan 2019 09:32:10 +0000 (30 10:32 +0100)]
Use "-fopenacc-kernels=parloops" to document "parloops" test cases
gcc/
* flag-types.h (enum openacc_kernels): New type.
gcc/c-family/
* c.opt (fopenacc-kernels): New flag.
gcc/fortran/
* lang.opt (fopenacc-kernels): New flag.
gcc/testsuite/
* c-c++-common/goacc/kernels-1.c: Add
"-fopenacc-kernels=parloops".
* c-c++-common/goacc/kernels-acc-loop-reduction.c: Likewise.
* c-c++-common/goacc/kernels-acc-loop-smaller-equal.c: Likewise.
* c-c++-common/goacc/kernels-alias-2.c: Likewise.
* c-c++-common/goacc/kernels-alias-3.c: Likewise.
* c-c++-common/goacc/kernels-alias-4.c: Likewise.
* c-c++-common/goacc/kernels-alias-5.c: Likewise.
* c-c++-common/goacc/kernels-alias-6.c: Likewise.
* c-c++-common/goacc/kernels-alias-7.c: Likewise.
* c-c++-common/goacc/kernels-alias-8.c: Likewise.
* c-c++-common/goacc/kernels-alias-ipa-pta-2.c: Likewise.
* c-c++-common/goacc/kernels-alias-ipa-pta-3.c: Likewise.
* c-c++-common/goacc/kernels-alias-ipa-pta-4.c: Likewise.
* c-c++-common/goacc/kernels-alias-ipa-pta.c: Likewise.
* c-c++-common/goacc/kernels-alias.c: Likewise.
* c-c++-common/goacc/kernels-counter-var-redundant-load.c:
Likewise.
* c-c++-common/goacc/kernels-counter-vars-function-scope.c:
Likewise.
* c-c++-common/goacc/kernels-double-reduction-n.c: Likewise.
* c-c++-common/goacc/kernels-double-reduction.c: Likewise.
* c-c++-common/goacc/kernels-loop-2-acc-loop.c: Likewise.
* c-c++-common/goacc/kernels-loop-2.c: Likewise.
* c-c++-common/goacc/kernels-loop-3-acc-loop.c: Likewise.
* c-c++-common/goacc/kernels-loop-3.c: Likewise.
* c-c++-common/goacc/kernels-loop-acc-loop.c: Likewise.
* c-c++-common/goacc/kernels-loop-data-2.c: Likewise.
* c-c++-common/goacc/kernels-loop-data-enter-exit-2.c: Likewise.
* c-c++-common/goacc/kernels-loop-data-enter-exit.c: Likewise.
* c-c++-common/goacc/kernels-loop-data-update.c: Likewise.
* c-c++-common/goacc/kernels-loop-data.c: Likewise.
* c-c++-common/goacc/kernels-loop-g.c: Likewise.
* c-c++-common/goacc/kernels-loop-mod-not-zero.c: Likewise.
* c-c++-common/goacc/kernels-loop-n-acc-loop.c: Likewise.
* c-c++-common/goacc/kernels-loop-n.c: Likewise.
* c-c++-common/goacc/kernels-loop-nest.c: Likewise.
* c-c++-common/goacc/kernels-loop.c: Likewise.
* c-c++-common/goacc/kernels-one-counter-var.c: Likewise.
* c-c++-common/goacc/kernels-parallel-loop-data-enter-exit.c:
Likewise.
* c-c++-common/goacc/kernels-reduction.c: Likewise.
* gfortran.dg/goacc/kernels-alias-2.f95: Likewise.
* gfortran.dg/goacc/kernels-alias-3.f95: Likewise.
* gfortran.dg/goacc/kernels-alias-4.f95: Likewise.
* gfortran.dg/goacc/kernels-alias.f95: Likewise.
* gfortran.dg/goacc/kernels-loop-2.f95: Likewise.
* gfortran.dg/goacc/kernels-loop-data-2.f95: Likewise.
* gfortran.dg/goacc/kernels-loop-data-enter-exit-2.f95: Likewise.
* gfortran.dg/goacc/kernels-loop-data-enter-exit.f95: Likewise.
* gfortran.dg/goacc/kernels-loop-data-update.f95: Likewise.
* gfortran.dg/goacc/kernels-loop-data.f95: Likewise.
* gfortran.dg/goacc/kernels-loop-inner.f95: Likewise.
* gfortran.dg/goacc/kernels-loop-n.f95: Likewise.
* gfortran.dg/goacc/kernels-loop.f95: Likewise.
* gfortran.dg/goacc/kernels-loops-adjacent.f95: Likewise.
* gfortran.dg/goacc/kernels-parallel-loop-data-enter-exit.f95:
Likewise.
libgomp/
* testsuite/libgomp.oacc-c-c++-common/kernels-alias-ipa-pta-2.c:
Add "-fopenacc-kernels=parloops".
* testsuite/libgomp.oacc-c-c++-common/kernels-alias-ipa-pta-3.c:
Likewise.
* testsuite/libgomp.oacc-c-c++-common/kernels-alias-ipa-pta.c:
Likewise.
* testsuite/libgomp.oacc-c-c++-common/kernels-empty.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/kernels-loop-2.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/kernels-loop-3.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-2.c:
Likewise.
* testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-3.c:
Likewise.
* testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-4.c:
Likewise.
* testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-5.c:
Likewise.
* testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-6.c:
Likewise.
* testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq.c:
Likewise.
* testsuite/libgomp.oacc-c-c++-common/kernels-loop-collapse.c:
Likewise.
* testsuite/libgomp.oacc-c-c++-common/kernels-loop-data-2.c:
Likewise.
* testsuite/libgomp.oacc-c-c++-common/kernels-loop-data-enter-exit-2.c:
Likewise.
* testsuite/libgomp.oacc-c-c++-common/kernels-loop-data-enter-exit.c:
Likewise.
* testsuite/libgomp.oacc-c-c++-common/kernels-loop-data-update.c:
Likewise.
* testsuite/libgomp.oacc-c-c++-common/kernels-loop-data.c:
Likewise.
* testsuite/libgomp.oacc-c-c++-common/kernels-loop-g.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/kernels-loop-mod-not-zero.c:
Likewise.
* testsuite/libgomp.oacc-c-c++-common/kernels-loop-n.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/kernels-loop-nest.c:
Likewise.
* testsuite/libgomp.oacc-c-c++-common/kernels-loop.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/kernels-parallel-loop-data-enter-exit.c:
Likewise.
* testsuite/libgomp.oacc-c-c++-common/kernels-reduction-1.c:
Likewise.
* testsuite/libgomp.oacc-c-c++-common/kernels-reduction.c:
Likewise.
* testsuite/libgomp.oacc-fortran/kernels-loop-2.f95: Likewise.
* testsuite/libgomp.oacc-fortran/kernels-loop-data-2.f95:
Likewise.
* testsuite/libgomp.oacc-fortran/kernels-loop-data-enter-exit-2.f95:
Likewise.
* testsuite/libgomp.oacc-fortran/kernels-loop-data-enter-exit.f95:
Likewise.
* testsuite/libgomp.oacc-fortran/kernels-loop-data-update.f95:
Likewise.
* testsuite/libgomp.oacc-fortran/kernels-loop-data.f95: Likewise.
* testsuite/libgomp.oacc-fortran/kernels-loop.f95: Likewise.
* testsuite/libgomp.oacc-fortran/kernels-parallel-loop-data-enter-exit.f95:
Likewise.
* testsuite/libgomp.oacc-fortran/kernels-reduction-1.f90:
Likewise.
Maciej W. Rozycki [Thu, 20 Dec 2018 14:10:17 +0000 (20 14:10 +0000)]
Add OpenACC 2.6 `acc_get_property' support
Add generic support for the OpenACC 2.6 `acc_get_property' and
`acc_get_property_string' routines, as well as full handlers for the
host and the NVPTX offload targets and a minimal handler for the HSA
offload target.
Include test cases for both C/C++ and Fortran support, both producing:
OpenACC vendor: GNU
OpenACC name: GOMP
OpenACC driver: 1.0
with the host driver and output like:
OpenACC vendor: Nvidia
OpenACC total memory: 12651462656
OpenACC free memory: 12202737664
OpenACC name: TITAN V
OpenACC driver: 9.1
with the NVPTX driver.
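A minimal sketch of how such output could be produced is shown below. It is
not the test case added by this commit. The helper name
print_device_properties is made up, and the code assumes the prototypes
acc_get_property (int, acc_device_t, acc_device_property_t) and
acc_get_property_string with the same parameters, as declared in openacc.h.
The OpenACC calls are guarded by _OPENACC so that the file also builds
without -fopenacc, in which case nothing is printed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#ifdef _OPENACC
#include <openacc.h>
#endif

/* Print some properties of device DEVNUM of the current device type.
   Returns the number of lines printed, which is 0 when built without
   OpenACC support.  */
int
print_device_properties (int devnum)
{
  int lines = 0;

  assert (devnum >= 0);
#ifdef _OPENACC
  acc_device_t type = acc_get_device_type ();
  const char *vendor
    = acc_get_property_string (devnum, type, acc_property_vendor);
  const char *name
    = acc_get_property_string (devnum, type, acc_property_name);
  size_t total = acc_get_property (devnum, type, acc_property_memory);

  /* Be defensive: a device may not provide every string property.  */
  if (vendor != NULL)
    lines += printf ("OpenACC vendor: %s\n", vendor) > 0;
  if (name != NULL)
    lines += printf ("OpenACC name: %s\n", name) > 0;
  lines += printf ("OpenACC total memory: %zu\n", total) > 0;
#endif
  return lines;
}
```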
include/
* gomp-constants.h (GOMP_DEVICE_CURRENT): New macro.
(GOMP_DEVICE_PROPERTY_MEMORY, GOMP_DEVICE_PROPERTY_FREE_MEMORY)
(GOMP_DEVICE_PROPERTY_NAME, GOMP_DEVICE_PROPERTY_VENDOR)
(GOMP_DEVICE_PROPERTY_DRIVER): Likewise.
(GOMP_DEVICE_PROPERTY_STRING_MASK): Likewise.
libgomp/
* libgomp.h (gomp_device_descr): Add `get_property_func' member.
* libgomp-plugin.h (gomp_device_property_value): New union.
(gomp_device_property_value): New prototype.
* openacc.h (acc_device_t): Add `acc_device_current' enumeration
constant.
(acc_device_property_t): New enum.
(acc_get_property, acc_get_property_string): New prototypes.
* oacc-init.c (acc_get_device_type): Also assert on
`!acc_device_current' result.
(get_property_any, acc_get_property, acc_get_property_string):
New functions.
* openacc.f90 (openacc_kinds): From `iso_fortran_env' also
import `int64'. Add `acc_device_current' and
`acc_property_memory', `acc_property_free_memory',
`acc_property_name', `acc_property_vendor' and
`acc_property_driver' constants. Add `acc_device_property' data
type.
(openacc_internal): Add `acc_get_property' and
`acc_get_property_string' interfaces. Add `acc_get_property_h',
`acc_get_property_string_h', `acc_get_property_l' and
`acc_get_property_string_l'.
(openacc_c_string): New module.
* oacc-host.c (host_get_property): New function.
(host_dispatch): Wire it.
* target.c (gomp_load_plugin_for_device): Handle `get_property'.
* libgomp.map (OACC_2.6): Add `acc_get_property',
`acc_get_property_h_', `acc_get_property_string' and
`acc_get_property_string_h_' symbols.
* libgomp.texi (OpenACC Runtime Library Routines): Add
`acc_get_property'.
(acc_get_property): New node.
* plugin/plugin-hsa.c (GOMP_OFFLOAD_get_property): New function.
* plugin/plugin-nvptx.c (CUDA_CALLS): Add `cuDeviceGetName',
`cuDeviceTotalMem', `cuDriverGetVersion' and `cuMemGetInfo'
calls.
(GOMP_OFFLOAD_get_property): New function.
* testsuite/libgomp.oacc-c-c++-common/acc-get-property.c: New
test.
* testsuite/libgomp.oacc-fortran/acc-get-property.f: New test.
Maciej W. Rozycki [Thu, 20 Dec 2018 14:10:19 +0000 (20 14:10 +0000)]
Add OpenACC 2.6 `no_create' clause support
The clause makes any device code use the local memory address for each
of the variables specified unless the given variable is already present
on the current device.
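A small usage sketch, with made-up function names increment_all and
increment_on_device, might look as follows. The caller makes the array
present with an enclosing data region, so the no_create clause in the callee
resolves to that device copy. If the array were not present, no device memory
would be allocated, and on a device with separate memory the loop would then
not be able to access the data. Built without -fopenacc, the pragmas are
ignored.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: increments every element of A.  */
void
increment_all (int *a, size_t n)
{
  assert (a != NULL);

  /* Use the device copy of a[0:n] if it is already present; otherwise
     neither allocate nor copy anything.  */
#pragma acc parallel loop no_create(a[0:n])
  for (size_t i = 0; i < n; i++)
    a[i] += 1;
}

void
increment_on_device (int *a, size_t n)
{
  /* The enclosing region makes a[0:n] present for the callee.  */
#pragma acc data copy(a[0:n])
  {
    increment_all (a, n);
  }
}
```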
2018-12-19 Julian Brown <julian@codesourcery.com>
Maciej W. Rozycki <macro@codesourcery.com>
gcc/
* omp-low.c (lower_omp_target): Support GOMP_MAP_NO_ALLOC.
* tree-pretty-print.c (dump_omp_clause): Likewise.
gcc/c-family/
* c-pragma.h (pragma_omp_clause): Add
PRAGMA_OACC_CLAUSE_NO_CREATE.
gcc/c/
* c-parser.c (c_parser_omp_clause_name): Support no_create.
(c_parser_oacc_data_clause): Likewise.
(c_parser_oacc_all_clauses): Likewise.
(OACC_DATA_CLAUSE_MASK, OACC_KERNELS_CLAUSE_MASK)
(OACC_PARALLEL_CLAUSE_MASK, OACC_SERIAL_CLAUSE_MASK): Add
PRAGMA_OACC_CLAUSE_NO_CREATE.
* c-typeck.c (handle_omp_array_sections): Support
GOMP_MAP_NO_ALLOC.
gcc/cp/
* parser.c (cp_parser_omp_clause_name): Support no_create.
(cp_parser_oacc_data_clause): Likewise.
(cp_parser_oacc_all_clauses): Likewise.
(OACC_DATA_CLAUSE_MASK, OACC_KERNELS_CLAUSE_MASK)
(OACC_PARALLEL_CLAUSE_MASK, OACC_SERIAL_CLAUSE_MASK): Add
PRAGMA_OACC_CLAUSE_NO_CREATE.
* semantics.c (handle_omp_array_sections): Support no_create.
gcc/fortran/
* gfortran.h (gfc_omp_map_op): Add OMP_MAP_NO_ALLOC.
* openmp.c (omp_mask2): Add OMP_CLAUSE_NO_CREATE.
(gfc_match_omp_clauses): Support no_create.
(OACC_PARALLEL_CLAUSES, OACC_KERNELS_CLAUSES)
(OACC_SERIAL_CLAUSES, OACC_DATA_CLAUSES): Add
OMP_CLAUSE_NO_CREATE.
* trans-openmp.c (gfc_trans_omp_clauses_1): Support
OMP_MAP_NO_ALLOC.
include/
* gomp-constants.h (gomp_map_kind): Support GOMP_MAP_NO_ALLOC.
libgomp/
* target.c (gomp_map_vars_async): Support GOMP_MAP_NO_ALLOC.
* testsuite/libgomp.oacc-c-c++-common/nocreate-1.c: New test.
* testsuite/libgomp.oacc-c-c++-common/nocreate-2.c: New test.
* testsuite/libgomp.oacc-c-c++-common/nocreate-3.c: New test.
* testsuite/libgomp.oacc-c-c++-common/nocreate-4.c: New test.
* testsuite/libgomp.oacc-fortran/nocreate-1.f90: New test.
* testsuite/libgomp.oacc-fortran/nocreate-2.f90: New test.
Maciej W. Rozycki [Thu, 20 Dec 2018 14:10:18 +0000 (20 14:10 +0000)]
Add OpenACC 2.6 `serial' construct support
The `serial' construct is equivalent to a `parallel' construct with
clauses `num_gangs(1) num_workers(1) vector_length(1)' implied.
Naturally these clauses are therefore not supported with the `serial'
construct. All the remaining clauses accepted with `parallel' are also
accepted with `serial'.
Consequently, the implementation is straightforward: `serial' is handled
exactly like `parallel', except that the dimensions are hardcoded rather than
taken from the relevant clauses, in `expand_omp_target'.
Separate codes are used to denote the `serial' construct throughout the
middle end, even though the mapping of `serial' to an equivalent
`parallel' construct could have been done in the individual language
frontends, saving a lot of mechanical changes and avoiding middle-end
code expansion. This is so that any reporting, such as in warning or
error messages and in diagnostic dumps, uses `serial' rather than
`parallel', thereby avoiding user confusion.
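The stated equivalence can be illustrated with two functions that are
expected to behave identically. The function names sum_serial and
sum_parallel_1 are made up for this sketch. Built without -fopenacc, the
pragmas are ignored and both run on the host.

```c
#include <assert.h>
#include <stddef.h>

/* Sum computed in a `serial' region.  */
long
sum_serial (const int *a, size_t n)
{
  long sum = 0;

  assert (a != NULL);
#pragma acc serial copyin(a[0:n]) copy(sum)
  {
    for (size_t i = 0; i < n; i++)
      sum += a[i];
  }
  return sum;
}

/* The same sum in a `parallel' region with all dimensions set to 1,
   which is what `serial' implies.  */
long
sum_parallel_1 (const int *a, size_t n)
{
  long sum = 0;

  assert (a != NULL);
#pragma acc parallel num_gangs(1) num_workers(1) vector_length(1) \
  copyin(a[0:n]) copy(sum)
  {
    for (size_t i = 0; i < n; i++)
      sum += a[i];
  }
  return sum;
}
```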
gcc/
* gimple.h (gf_mask): Add GF_OMP_TARGET_KIND_OACC_SERIAL
enumeration constant.
(is_gimple_omp_oacc): Handle GF_OMP_TARGET_KIND_OACC_SERIAL.
(is_gimple_omp_offloaded): Likewise.
* gimplify.c (omp_region_type): Add ORT_ACC_SERIAL enumeration
constant. Adjust the value of ORT_NONE accordingly.
(is_gimple_stmt): Handle OACC_SERIAL.
(oacc_default_clause): Handle ORT_ACC_SERIAL.
(gomp_needs_data_present): Likewise.
(gimplify_adjust_omp_clauses): Likewise.
(gimplify_omp_workshare): Handle OACC_SERIAL.
(gimplify_expr): Likewise.
* omp-expand.c (expand_omp_target): Handle
GF_OMP_TARGET_KIND_OACC_SERIAL.
(build_omp_regions_1, omp_make_gimple_edges): Likewise.
* omp-low.c (is_oacc_parallel): Rename function to...
(is_oacc_parallel_or_serial): ... this. Handle
GF_OMP_TARGET_KIND_OACC_SERIAL.
(build_receiver_ref): Adjust accordingly.
(build_sender_ref): Likewise.
(scan_sharing_clauses): Likewise.
(create_omp_child_function): Likewise.
(scan_omp_for): Likewise.
(scan_omp_target): Likewise.
(lower_oacc_head_mark): Likewise.
(convert_from_firstprivate_int): Likewise.
(lower_omp_target): Likewise.
(check_omp_nesting_restrictions): Handle
GF_OMP_TARGET_KIND_OACC_SERIAL.
(lower_oacc_reductions): Likewise.
(lower_omp_target): Likewise.
* tree-pretty-print.c (dump_generic_node): Handle OACC_SERIAL.
* tree.def (OACC_SERIAL): New tree code.
* doc/generic.texi (OpenACC): Document OACC_SERIAL.
gcc/c-family/
* c-pragma.h (pragma_kind): Add PRAGMA_OACC_SERIAL enumeration
constant.
* c-pragma.c (oacc_pragmas): Add "serial" entry.
gcc/c/
* c-parser.c (OACC_SERIAL_CLAUSE_MASK): New macro.
(OACC_SERIAL_CLAUSE_DEVICE_TYPE_MASK): Likewise.
(c_parser_oacc_kernels_parallel): Rename function to...
(c_parser_oacc_compute): ... this. Handle PRAGMA_OACC_SERIAL.
(c_parser_omp_construct): Update accordingly.
gcc/cp/
* constexpr.c (potential_constant_expression_1): Handle
OACC_SERIAL.
* parser.c (OACC_SERIAL_CLAUSE_MASK): New macro.
(OACC_SERIAL_CLAUSE_DEVICE_TYPE_MASK): Likewise.
(cp_parser_oacc_kernels_parallel): Rename function to...
(cp_parser_oacc_compute): ... this. Handle PRAGMA_OACC_SERIAL.
(cp_parser_omp_construct): Update accordingly.
(cp_parser_pragma): Handle PRAGMA_OACC_SERIAL. Fix alphabetic
order.
* pt.c (tsubst_expr): Handle OACC_SERIAL.
gcc/fortran/
* gfortran.h (gfc_statement): Add ST_OACC_SERIAL_LOOP,
ST_OACC_END_SERIAL_LOOP, ST_OACC_SERIAL and ST_OACC_END_SERIAL
enumeration constants.
(gfc_exec_op): Add EXEC_OACC_SERIAL_LOOP and EXEC_OACC_SERIAL
enumeration constants.
* match.h (gfc_match_oacc_serial): New prototype.
(gfc_match_oacc_serial_loop): Likewise.
* dump-parse-tree.c (show_omp_node, show_code_node): Handle
EXEC_OACC_SERIAL_LOOP and EXEC_OACC_SERIAL.
* match.c (match_exit_cycle): Handle EXEC_OACC_SERIAL_LOOP.
* openmp.c (OACC_SERIAL_CLAUSES): New macro.
(OACC_SERIAL_CLAUSE_DEVICE_TYPE_MASK): Likewise.
(gfc_match_oacc_serial_loop): New function.
(gfc_match_oacc_serial): Likewise.
(oacc_is_loop): Handle EXEC_OACC_SERIAL_LOOP.
(resolve_omp_clauses): Handle EXEC_OACC_SERIAL.
(oacc_is_serial): New function.
(oacc_code_to_statement): Handle EXEC_OACC_SERIAL and
EXEC_OACC_SERIAL_LOOP.
(gfc_resolve_oacc_directive): Likewise.
* parse.c (decode_oacc_directive) <'s'>: Add case for "serial"
and "serial loop".
(next_statement): Handle ST_OACC_SERIAL_LOOP and ST_OACC_SERIAL.
(gfc_ascii_statement): Likewise. Handle ST_OACC_END_SERIAL_LOOP
and ST_OACC_END_SERIAL.
(parse_oacc_structured_block): Handle ST_OACC_SERIAL.
(parse_oacc_loop): Handle ST_OACC_SERIAL_LOOP and
ST_OACC_END_SERIAL_LOOP.
(parse_executable): Handle ST_OACC_SERIAL_LOOP and
ST_OACC_SERIAL.
(is_oacc): Handle EXEC_OACC_SERIAL_LOOP and EXEC_OACC_SERIAL.
* resolve.c (gfc_resolve_blocks, gfc_resolve_code): Likewise.
* st.c (gfc_free_statement): Likewise.
* trans-openmp.c (gfc_trans_oacc_construct): Handle
EXEC_OACC_SERIAL.
(gfc_trans_oacc_combined_directive): Handle
EXEC_OACC_SERIAL_LOOP.
(gfc_trans_oacc_directive): Handle EXEC_OACC_SERIAL_LOOP and
EXEC_OACC_SERIAL.
* trans.c (trans_code): Likewise.
gcc/testsuite/
* c-c++-common/goacc/serial-dims.c: New test.
libgomp/
* testsuite/libgomp.oacc-c-c++-common/serial-dims.c: New test.
Cesar Philippidis [Thu, 21 Dec 2017 21:40:34 +0000 (21 13:40 -0800)]
Use functional parameters for data mappings in OpenACC child functions
* Makefile.def: Make libgomp depend on libffi.
* configure.ac: Likewise.
* Makefile.in: Regenerate.
* configure: Regenerate.
gcc/fortran/
* types.def (BF_FN_VOID_INT_INT_OMPFN_SIZE_PTR_PTR_PTR_VAR):
Define.
gcc/
* builtin-types.def (BF_FN_VOID_INT_INT_OMPFN_SIZE_PTR_PTR_PTR_VAR):
Define.
* config/nvptx/nvptx.c (nvptx_expand_cmp_swap): Handle PARM_DECLs.
* omp-builtins.def (BUILT_IN_GOACC_PARALLEL): Call
GOACC_parallel_keyed_v2.
* omp-expand.c (expand_omp_target): Update call to
BUILT_IN_GOACC_PARALLEL.
* omp-low.c (struct omp_context): Add parm_map member.
(lookup_parm): New function.
(build_receiver_ref): Lookup parm_map decls.
(install_parm_decl): New function.
(install_var_field): Install parm_map decl for OpenACC parallel region
data clauses.
(delete_omp_context): Clean parm_map.
(scan_sharing_clauses): Install subarray variable mapping into parm_map.
(create_omp_child_function): Defer creation of child function for
OpenACC parallel regions.
(scan_omp_target): Likewise.
(append_decl_arg): New function.
(lower_omp_target): Create a child offloaded function using one
parameter per data mapping for OpenACC parallel regions.
* tree-ssa-structalias.c (find_func_aliases_for_builtin_call):
Ignore OpenACC parallel regions.
(find_func_clobbers): Likewise.
(ipa_pta_execute): Likewise.
libgomp/
* Makefile.am: Add libffi build dependency.
* configure.ac: Likewise.
* Makefile.in: Regenerate.
* config.h.in: Regenerate.
* configure: Regenerate.
* libgomp-plugin.h: Define GOMP_OFFLOAD_openacc_exec_params and
GOMP_OFFLOAD_openacc_async_exec_params.
* libgomp.h (acc_dispatch_t): Use them here.
* libgomp.map (GOACC_parallel_keyed_v2): Declare.
* libgomp_g.h (GOACC_parallel_keyed_v2): Likewise.
* oacc-host.c (host_openacc_exec_params): New function.
(host_openacc_async_exec_params): Likewise.
* oacc-parallel.c (goacc_call_host_fn): Likewise.
(GOACC_parallel_keyed_internal): Likewise.
(GOACC_parallel_keyed): Wrapper for GOACC_parallel_keyed_internal.
(GOACC_parallel_keyed_v2): Likewise.
* plugin/plugin-nvptx.c (nvptx_exec): Replace CUDeviceptr dp parameter
with void **kargs.
(openacc_exec_internal): New function.
(GOMP_OFFLOAD_openacc_exec_params): New function.
(GOMP_OFFLOAD_openacc_exec): Update to call openacc_exec_internal.
(openacc_async_exec_internal): New function.
(GOMP_OFFLOAD_openacc_async_exec_params): New function.
(GOMP_OFFLOAD_openacc_async_exec): Update call to
openacc_async_exec_internal.
* target.c (gomp_load_plugin_for_device): Handle
openacc_exec_params and openacc_async_exec_params.
* testsuite/Makefile.in: Regenerate.
* testsuite/libgomp.oacc-c-c++-common/combined-directives-1.c:
Xfail on offloaded targets.
* Makefile.def: Bootstrap module libffi. Add libffi dependency
to all-target-libgomp.
* Makefile.in: Regenerate.
* configure.ac: Add libffi to bootstrap_target_libs when libgomp
is bootstrapped.
* configure: Regenerate.
gcc/
* omp-low.c (install_parm_decl): Don't extract identifiers from
artificial decls.
gcc/testsuite/
* c-c++-common/goacc/large_array.c: New test.
(cherry picked from openacc-gcc-7-branch commit
b4dd21b9a1f9f499c613b55225cad689b7928a7f, commit
9ba1d875dcb9412cccdd49138a3525e7adab3e76, commit
762cf3c7890fab15a69494a6480455cd99621d7d, and commit
6585af7290fd79f6cb834a39c2bbf7e1934808b1)
Gergö Barany [Fri, 21 Dec 2018 09:12:44 +0000 (21 01:12 -0800)]
Add OpenACC 2.6 if and if_present clauses on host_data construct: GOACC_FLAG_HOST_DATA_IF_PRESENT
gcc/c/
* c-parser.c (OACC_HOST_DATA_CLAUSE_MASK): Add PRAGMA_OACC_CLAUSE_IF
and PRAGMA_OACC_CLAUSE_IF_PRESENT.
gcc/cp/
* parser.c (OACC_HOST_DATA_CLAUSE_MASK): Likewise.
gcc/fortran/
* openmp.c (OACC_HOST_DATA_CLAUSES): Add OMP_CLAUSE_IF and
OMP_CLAUSE_IF_PRESENT.
gcc/
* omp-expand.c (expand_omp_target): Handle if_present flag on
OpenACC host_data construct.
gcc/testsuite/
* c-c++-common/goacc/host_data-1.c: Add tests of if and if_present
clauses on host_data.
* gfortran.dg/goacc/host_data-tree.f95: Likewise.
include/
* gomp-constants.h (GOACC_FLAG_HOST_DATA_IF_PRESENT): New constant.
libgomp/
* libgomp.h (enum gomp_map_vars_kind): Add
GOMP_MAP_VARS_OPENACC_IF_PRESENT.
* oacc-parallel.c (GOACC_data_start): Handle
GOACC_FLAG_HOST_DATA_IF_PRESENT flag.
* target.c (gomp_map_vars_async): Handle
GOMP_MAP_VARS_OPENACC_IF_PRESENT mapping kind.
* testsuite/libgomp.oacc-c-c++-common/host_data-6.c: New test.
Reviewed-by: Thomas Schwinge <thomas@codesourcery.com>
Gergö Barany [Thu, 20 Dec 2018 14:07:34 +0000 (20 15:07 +0100)]
Report errors on missing OpenACC reduction clauses in nested reductions
..., as suggested by OpenACC 2.6, 2.9.11. "reduction clause".
In gcc/testsuite/c-c++-common/goacc/reduction-6.c, we remove the erroneous
reductions on variable b; adding a reduction clause to make it compile cleanly
would make it a duplicate of the test for variable c.
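For illustration, a correctly nested reduction looks like the sketch below:
the reduction clause on sum is repeated on every loop level that takes part
in it. The function name matrix_sum is made up and the example is not one of
the test cases listed here. Leaving out the clause on the inner loop is the
situation the new diagnostic is meant to catch. Built without -fopenacc, the
pragmas are ignored.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example: sum of all elements of a ROWS x COLS matrix
   stored in row-major order.  */
long
matrix_sum (const int *m, size_t rows, size_t cols)
{
  long sum = 0;

  assert (m != NULL);
#pragma acc parallel loop gang reduction(+:sum) copyin(m[0:rows * cols])
  for (size_t i = 0; i < rows; i++)
    {
      /* The inner loop repeats the reduction clause of the outer loop.  */
#pragma acc loop vector reduction(+:sum)
      for (size_t j = 0; j < cols; j++)
	sum += m[i * cols + j];
    }
  return sum;
}
```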
gcc/
* omp-low.c (struct omp_context): New fields
local_reduction_clauses, outer_reduction_clauses.
(new_omp_context): Initialize these.
(scan_sharing_clauses): Record reduction clauses on OpenACC
constructs.
(scan_omp_for): Check reduction clauses for incorrect nesting.
gcc/testsuite/
* c-c++-common/goacc/nested-reductions-fail.c: New test.
* c-c++-common/goacc/nested-reductions.c: New test.
* c-c++-common/goacc/reduction-6.c: Adjust.
libgomp/
* testsuite/libgomp.oacc-c-c++-common/par-loop-comb-reduction-1.c:
Add missing reduction clauses.
* testsuite/libgomp.oacc-c-c++-common/par-loop-comb-reduction-2.c:
Likewise.
* testsuite/libgomp.oacc-c-c++-common/par-loop-comb-reduction-3.c:
Likewise.
* testsuite/libgomp.oacc-c-c++-common/par-loop-comb-reduction-4.c:
Likewise.
Maciej W. Rozycki [Thu, 20 Dec 2018 14:10:16 +0000 (20 14:10 +0000)]
Disable libstdc++ dependency for libffi
Disable AC_PROG_CXX and consequently a libstdc++ dependency for libffi,
introduced with upstream libffi commit 7d698125b1f0 ("Use the proper C++
compiler to run C++ tests"). This is only needed for the libffi test
suite, which we don't have to support in the GCC tree, as libffi is
maintained as a separate project. The dependency causes a build failure
with the `powerpc64le-linux-gnu' target due to a circular dependency:
make[1]: Circular configure-target-libffi <- maybe-all-target-libstdc++-v3 dependency dropped.
make[1]: *** [configure-target-libffi] Error 1
make: *** [all] Error 2
due to a libgomp dependency for libstdc++ and then a libffi dependency
for libgomp, introduced with commit 998eb38b265d ("Use functional
parameters for data mappings in OpenACC child functions").
/
* Makefile.def (lang_env_dependencies): Disable `cxx' dependency
for `libffi'.
* Makefile.in: Regenerate.
libffi/
* configure.ac: Disable AC_PROG_CXX.
* configure: Regenerate.
* Makefile.in: Regenerate.
* include/Makefile.in: Regenerate.
* man/Makefile.in: Regenerate.
* testsuite/Makefile.in: Regenerate.
Kwok Cheung Yeung [Thu, 31 Jan 2019 16:00:16 +0000 (31 08:00 -0800)]
Allow optional arguments to be used in the use_device OpenACC clause
Optional arguments should be treated as references rather than pointers
in the lowering. However, for non-present arguments, this would result
in a null dereference, so conditionals need to be added to detect and
handle this.
gcc/
* omp-low.c (lower_omp_target): For use_device clauses, generate
conditional statements to treat Fortran optional arguments like
references if non-null, or propagate null arguments into offloaded
code otherwise.
Reviewed-by: Julian Brown <julian@codesourcery.com>