1 \input texinfo @c -*-texinfo-*-
4 @setfilename libgomp.info
10 Copyright @copyright{} 2006-2015 Free Software Foundation, Inc.
12 Permission is granted to copy, distribute and/or modify this document
13 under the terms of the GNU Free Documentation License, Version 1.3 or
14 any later version published by the Free Software Foundation; with the
15 Invariant Sections being ``Funding Free Software'', the Front-Cover
16 texts being (a) (see below), and with the Back-Cover Texts being (b)
17 (see below). A copy of the license is included in the section entitled
18 ``GNU Free Documentation License''.
20 (a) The FSF's Front-Cover Text is:
24 (b) The FSF's Back-Cover Text is:
26 You have freedom to copy and modify this GNU Manual, like GNU
27 software. Copies published by the Free Software Foundation raise
28 funds for GNU development.
32 @dircategory GNU Libraries
34 * libgomp: (libgomp). GNU Offloading and Multi Processing Runtime Library.
37 This manual documents libgomp, the GNU Offloading and Multi Processing
38 Runtime library. This is the GNU implementation of the OpenMP and
39 OpenACC APIs for parallel and accelerator programming in C/C++ and
40 Fortran.
42 Published by the Free Software Foundation
43 51 Franklin Street, Fifth Floor
44 Boston, MA 02110-1301 USA
50 @setchapternewpage odd
53 @title GNU Offloading and Multi Processing Runtime Library
54 @subtitle The GNU OpenMP and OpenACC Implementation
56 @vskip 0pt plus 1filll
57 @comment For the @value{version-GCC} Version*
59 Published by the Free Software Foundation @*
60 51 Franklin Street, Fifth Floor@*
61 Boston, MA 02110-1301, USA@*
75 This manual documents the usage of libgomp, the GNU Offloading and
76 Multi Processing Runtime Library. This includes the GNU
77 implementation of the @uref{http://www.openmp.org, OpenMP} Application
78 Programming Interface (API) for multi-platform shared-memory parallel
79 programming in C/C++ and Fortran, and the GNU implementation of the
80 @uref{http://www.openacc.org/, OpenACC} Application Programming
81 Interface (API) for offloading of code to accelerator devices in C/C++
82 and Fortran.
84 Originally, libgomp implemented the GNU OpenMP Runtime Library. Based
85 on this, support for OpenACC and offloading (both OpenACC and OpenMP
86 4's target construct) was added later, and the library's name
87 changed to GNU Offloading and Multi Processing Runtime Library.
92 @comment When you add a new menu item, please keep the right hand
93 @comment aligned to the same column. Do not use tabs. This provides
94 @comment better formatting.
97 * Enabling OpenACC::                   How to enable OpenACC for your
98                                        calculations.
99 * OpenACC Runtime Library Routines:: The OpenACC runtime application
100 programming interface.
101 * OpenACC Environment Variables:: Influencing OpenACC runtime behavior with
102 environment variables.
103 * OpenACC Library Interoperability:: OpenACC library interoperability with the
104 NVIDIA CUBLAS library.
105 * Enabling OpenMP::                    How to enable OpenMP for your
106                                        calculations.
107 * OpenMP Runtime Library Routines: Runtime Library Routines.
108                                        The OpenMP runtime application programming
109                                        interface.
110 * OpenMP Environment Variables: Environment Variables.
111 Influencing OpenMP runtime behavior with
112 environment variables.
113 * The libgomp ABI:: Notes on the external libgomp ABI.
114 * Reporting Bugs:: How to report bugs in the GNU Offloading
115 and Multi Processing Runtime Library.
116 * Copying:: GNU general public license says how you
117 can copy and share libgomp.
118 * GNU Free Documentation License:: How you can copy and share this manual.
119 * Funding::                            How to help assure continued work for free
120                                        software.
121 * Library Index:: Index of this documentation.
126 @c ---------------------------------------------------------------------
128 @c ---------------------------------------------------------------------
130 @node Enabling OpenACC
131 @chapter Enabling OpenACC
133 To activate the OpenACC extensions for C/C++ and Fortran, the compile-time
134 flag @command{-fopenacc} must be specified. This enables the OpenACC directive
135 @code{#pragma acc} in C/C++ and @code{!$acc} directives in free form,
136 @code{c$acc}, @code{*$acc} and @code{!$acc} directives in fixed form,
137 @code{!$} conditional compilation sentinels in free form and @code{c$},
138 @code{*$} and @code{!$} sentinels in fixed form, for Fortran. The flag also
139 arranges for automatic linking of the OpenACC runtime library
140 (@ref{OpenACC Runtime Library Routines}).
142 A complete description of all OpenACC directives accepted may be found in
143 the @uref{http://www.openacc.org/, OpenACC Application Programming
144 Interface} manual, version 2.0.
146 Note that this is an experimental feature, incomplete, and subject to
147 change in future versions of GCC. See
148 @uref{https://gcc.gnu.org/wiki/OpenACC} for more information.
152 @c ---------------------------------------------------------------------
153 @c OpenACC Runtime Library Routines
154 @c ---------------------------------------------------------------------
156 @node OpenACC Runtime Library Routines
157 @chapter OpenACC Runtime Library Routines
159 The runtime routines described here are defined by section 3 of the OpenACC
160 specification in version 2.0.
161 They have C linkage, and do not throw exceptions.
162 Generally, they are available only for the host, with the exception of
163 @code{acc_on_device}, which is available for both the host and the
164 acceleration device.
167 * acc_get_num_devices:: Get number of devices for the given device type
168 * acc_set_device_type::
169 * acc_get_device_type::
170 * acc_set_device_num::
171 * acc_get_device_num::
174 * acc_on_device:: Whether executing on a particular device
178 * acc_present_or_copyin::
180 * acc_present_or_create::
183 * acc_update_device::
190 * acc_memcpy_to_device::
191 * acc_memcpy_from_device::
193 API routines for target platforms.
195 * acc_get_current_cuda_device::
196 * acc_get_current_cuda_context::
197 * acc_get_cuda_stream::
198 * acc_set_cuda_stream::
203 @node acc_get_num_devices
204 @section @code{acc_get_num_devices} -- Get number of devices for given device type
206 @item @emph{Description}
207 This routine returns a value indicating the
208 number of devices available for the given device type. It determines
209 the number of devices in a @emph{passive} manner. In other words, it
210 does not alter the state within the runtime environment aside from
211 possibly initializing an uninitialized device. This aspect allows
212 the routine to be called without concern for altering the interaction
213 with an attached accelerator device.
215 @item @emph{Reference}:
216 @uref{http://www.openacc.org/, OpenACC specification v2.0}, section
222 @node acc_set_device_type
223 @section @code{acc_set_device_type}
225 @item @emph{Reference}:
226 @uref{http://www.openacc.org/, OpenACC specification v2.0}, section
232 @node acc_get_device_type
233 @section @code{acc_get_device_type}
235 @item @emph{Reference}:
236 @uref{http://www.openacc.org/, OpenACC specification v2.0}, section
242 @node acc_set_device_num
243 @section @code{acc_set_device_num}
245 @item @emph{Reference}:
246 @uref{http://www.openacc.org/, OpenACC specification v2.0}, section
252 @node acc_get_device_num
253 @section @code{acc_get_device_num}
255 @item @emph{Reference}:
256 @uref{http://www.openacc.org/, OpenACC specification v2.0}, section
263 @section @code{acc_init}
265 @item @emph{Reference}:
266 @uref{http://www.openacc.org/, OpenACC specification v2.0}, section
273 @section @code{acc_shutdown}
275 @item @emph{Reference}:
276 @uref{http://www.openacc.org/, OpenACC specification v2.0}, section
283 @section @code{acc_on_device} -- Whether executing on a particular device
285 @item @emph{Description}:
286 This routine tells the program whether it is executing on a particular
287 device. Based on the argument passed, GCC tries to evaluate this to a
288 constant at compile time, but library functions are also provided, for
289 both the host and the acceleration device.
291 @item @emph{Reference}:
292 @uref{http://www.openacc.org/, OpenACC specification v2.0}, section
299 @section @code{acc_malloc}
301 @item @emph{Reference}:
302 @uref{http://www.openacc.org/, OpenACC specification v2.0}, section
309 @section @code{acc_free}
311 @item @emph{Reference}:
312 @uref{http://www.openacc.org/, OpenACC specification v2.0}, section
319 @section @code{acc_copyin}
321 @item @emph{Reference}:
322 @uref{http://www.openacc.org/, OpenACC specification v2.0}, section
328 @node acc_present_or_copyin
329 @section @code{acc_present_or_copyin}
331 @item @emph{Reference}:
332 @uref{http://www.openacc.org/, OpenACC specification v2.0}, section
339 @section @code{acc_create}
341 @item @emph{Reference}:
342 @uref{http://www.openacc.org/, OpenACC specification v2.0}, section
348 @node acc_present_or_create
349 @section @code{acc_present_or_create}
351 @item @emph{Reference}:
352 @uref{http://www.openacc.org/, OpenACC specification v2.0}, section
359 @section @code{acc_copyout}
361 @item @emph{Reference}:
362 @uref{http://www.openacc.org/, OpenACC specification v2.0}, section
369 @section @code{acc_delete}
371 @item @emph{Reference}:
372 @uref{http://www.openacc.org/, OpenACC specification v2.0}, section
378 @node acc_update_device
379 @section @code{acc_update_device}
381 @item @emph{Reference}:
382 @uref{http://www.openacc.org/, OpenACC specification v2.0}, section
388 @node acc_update_self
389 @section @code{acc_update_self}
391 @item @emph{Reference}:
392 @uref{http://www.openacc.org/, OpenACC specification v2.0}, section
399 @section @code{acc_map_data}
401 @item @emph{Reference}:
402 @uref{http://www.openacc.org/, OpenACC specification v2.0}, section
409 @section @code{acc_unmap_data}
411 @item @emph{Reference}:
412 @uref{http://www.openacc.org/, OpenACC specification v2.0}, section
419 @section @code{acc_deviceptr}
421 @item @emph{Reference}:
422 @uref{http://www.openacc.org/, OpenACC specification v2.0}, section
429 @section @code{acc_hostptr}
431 @item @emph{Reference}:
432 @uref{http://www.openacc.org/, OpenACC specification v2.0}, section
439 @section @code{acc_is_present}
441 @item @emph{Reference}:
442 @uref{http://www.openacc.org/, OpenACC specification v2.0}, section
448 @node acc_memcpy_to_device
449 @section @code{acc_memcpy_to_device}
451 @item @emph{Reference}:
452 @uref{http://www.openacc.org/, OpenACC specification v2.0}, section
458 @node acc_memcpy_from_device
459 @section @code{acc_memcpy_from_device}
461 @item @emph{Reference}:
462 @uref{http://www.openacc.org/, OpenACC specification v2.0}, section
468 @node acc_get_current_cuda_device
469 @section @code{acc_get_current_cuda_device}
471 @item @emph{Reference}:
472 @uref{http://www.openacc.org/, OpenACC specification v2.0}, section
478 @node acc_get_current_cuda_context
479 @section @code{acc_get_current_cuda_context}
481 @item @emph{Reference}:
482 @uref{http://www.openacc.org/, OpenACC specification v2.0}, section
488 @node acc_get_cuda_stream
489 @section @code{acc_get_cuda_stream}
491 @item @emph{Reference}:
492 @uref{http://www.openacc.org/, OpenACC specification v2.0}, section
498 @node acc_set_cuda_stream
499 @section @code{acc_set_cuda_stream}
501 @item @emph{Reference}:
502 @uref{http://www.openacc.org/, OpenACC specification v2.0}, section
508 @c ---------------------------------------------------------------------
509 @c OpenACC Environment Variables
510 @c ---------------------------------------------------------------------
512 @node OpenACC Environment Variables
513 @chapter OpenACC Environment Variables
515 The variables @env{ACC_DEVICE_TYPE} and @env{ACC_DEVICE_NUM}
516 are defined by section 4 of the OpenACC specification in version 2.0.
517 The variable @env{GCC_ACC_NOTIFY} is used for diagnostic purposes.
527 @node ACC_DEVICE_TYPE
528 @section @code{ACC_DEVICE_TYPE}
530 @item @emph{Reference}:
531 @uref{http://www.openacc.org/, OpenACC specification v2.0}, section
538 @section @code{ACC_DEVICE_NUM}
540 @item @emph{Reference}:
541 @uref{http://www.openacc.org/, OpenACC specification v2.0}, section
548 @section @code{GCC_ACC_NOTIFY}
550 @item @emph{Description}:
551 Print debug information pertaining to the accelerator.
555 @c ---------------------------------------------------------------------
556 @c OpenACC Library Interoperability
557 @c ---------------------------------------------------------------------
559 @node OpenACC Library Interoperability
560 @chapter OpenACC Library Interoperability
562 @section Introduction
564 As the OpenACC library is built using the CUDA Driver API, the question
565 arises of what impact using the OpenACC library has on a program that
566 uses the Runtime library, or a library based on the Runtime library, e.g.,
567 CUBLAS@footnote{See section 2.26, "Interactions with the CUDA Driver API" in
568 "CUDA Runtime API", Version 5.5, July 2013 and section 2.27, "VDPAU
569 Interoperability", in "CUDA Driver API", TRM-06703-001, Version 5.5,
570 July 2013, for additional information on library interoperability.}.
571 This chapter will describe the use cases and what changes are
572 required in order to use both the OpenACC library and the CUBLAS and Runtime
573 libraries within a program.
575 @section First invocation: NVIDIA CUBLAS library API
577 In this first use case (see below), a function in the CUBLAS library is called
578 prior to any of the functions in the OpenACC library. More specifically, the
579 function @code{cublasCreate()}.
581 When invoked, the function will initialize the library and allocate the
582 hardware resources on the host and the device on behalf of the caller. Once
583 the initialization and allocation have completed, a handle is returned to the
584 caller. The OpenACC library also requires initialization and allocation of
585 hardware resources. Since the CUBLAS library has already allocated the
586 hardware resources for the device, all that is left to do is to initialize
587 the OpenACC library and acquire the hardware resources on the host.
589 Prior to calling the OpenACC function that will initialize the library and
590 allocate the host hardware resources, one needs to acquire the device number
591 that was allocated during the call to @code{cublasCreate()}.  Invoking the
592 runtime library function @code{cudaGetDevice()} accomplishes this.  Once
593 acquired, the device number is passed along with the device type as
594 parameters to the OpenACC library function @code{acc_set_device_num()}.
596 Once the call to @code{acc_set_device_num()} has completed, the OpenACC
597 library will be using the context that was created during the call to
598 @code{cublasCreate()}. In other words, both libraries will be sharing the
602 /* Create the handle */
603 s = cublasCreate(&h);
604 if (s != CUBLAS_STATUS_SUCCESS)
606 fprintf(stderr, "cublasCreate failed %d\n", s);
610 /* Get the device number */
611 e = cudaGetDevice(&dev);
612 if (e != cudaSuccess)
614 fprintf(stderr, "cudaGetDevice failed %d\n", e);
618 /* Initialize OpenACC library and use device 'dev' */
619 acc_set_device_num(dev, acc_device_nvidia);
624 @section First invocation: OpenACC library API
626 In this second use case (see below), a function in the OpenACC library is
627 called prior to any of the functions in the CUBLAS library. More specifically,
628 the function @code{acc_set_device_num()}.
630 In the use case presented here, the function @code{acc_set_device_num()}
631 is used to both initialize the OpenACC library and allocate the hardware
632 resources on the host and the device. In the call to the function, the
633 call parameters specify which device to use, i.e., 'dev', and what device
634 type to use, i.e., @code{acc_device_nvidia}. It should be noted that this
635 is but one method to initialize the OpenACC library and allocate the
636 appropriate hardware resources. Other methods are available through the
637 use of environment variables and these will be discussed in the next section.
639 Once the call to @code{acc_set_device_num()} has completed, other OpenACC
640 functions can be called as seen with multiple calls being made to
641 @code{acc_copyin()}. In addition, calls can be made to functions in the
642 CUBLAS library.  In this use case, a call to @code{cublasCreate()} is made
643 subsequent to the calls to @code{acc_copyin()}.
644 As seen in the previous use case, a call to @code{cublasCreate()} will
645 initialize the CUBLAS library and allocate the hardware resources on the
646 host and the device. However, since the device has already been allocated,
647 @code{cublasCreate()} will only initialize the CUBLAS library and allocate
648 the appropriate hardware resources on the host. The context that was created
649 as part of the OpenACC initialization will be shared with the CUBLAS library,
650 similarly to the first use case.
655 acc_set_device_num(dev, acc_device_nvidia);
657 /* Copy the first set to the device */
658 d_X = acc_copyin(&h_X[0], N * sizeof (float));
661 fprintf(stderr, "copyin error h_X\n");
665 /* Copy the second set to the device */
666 d_Y = acc_copyin(&h_Y1[0], N * sizeof (float));
669 fprintf(stderr, "copyin error h_Y1\n");
673 /* Create the handle */
674 s = cublasCreate(&h);
675 if (s != CUBLAS_STATUS_SUCCESS)
677 fprintf(stderr, "cublasCreate failed %d\n", s);
681 /* Perform saxpy using CUBLAS library function */
682 s = cublasSaxpy(h, N, &alpha, d_X, 1, d_Y, 1);
683 if (s != CUBLAS_STATUS_SUCCESS)
685 fprintf(stderr, "cublasSaxpy failed %d\n", s);
689 /* Copy the results from the device */
690 acc_memcpy_from_device(&h_Y1[0], d_Y, N * sizeof (float));
696 @section OpenACC library and environment variables
698 There are two environment variables associated with the OpenACC library that
699 may be used to control the device type and device number.
700 Namely, @env{ACC_DEVICE_TYPE} and @env{ACC_DEVICE_NUM}. In the second
701 use case, the device type and device number were specified using
702 @code{acc_set_device_num()}. However, @env{ACC_DEVICE_TYPE} and
703 @env{ACC_DEVICE_NUM} could have been defined and the call to
704 @code{acc_set_device_num()} would not be required. At the time of the
705 call to @code{acc_copyin()}, these two environment variables would be
706 sampled and their values used.
708 The use of the environment variables is only relevant when an OpenACC function
709 is called prior to a call to @code{cublasCreate()}. If @code{cublasCreate()}
710 is called prior to a call to an OpenACC function, then a call to
711 @code{acc_set_device_num()} must be made@footnote{More complete information
712 about @env{ACC_DEVICE_TYPE} and @env{ACC_DEVICE_NUM} can be found in
713 sections 4.1 and 4.2 of the ``The OpenACC
714 Application Programming Interface'', Version 2.0, June, 2013.}.
718 @c ---------------------------------------------------------------------
720 @c ---------------------------------------------------------------------
722 @node Enabling OpenMP
723 @chapter Enabling OpenMP
725 To activate the OpenMP extensions for C/C++ and Fortran, the compile-time
726 flag @command{-fopenmp} must be specified. This enables the OpenMP directive
727 @code{#pragma omp} in C/C++ and @code{!$omp} directives in free form,
728 @code{c$omp}, @code{*$omp} and @code{!$omp} directives in fixed form,
729 @code{!$} conditional compilation sentinels in free form and @code{c$},
730 @code{*$} and @code{!$} sentinels in fixed form, for Fortran. The flag also
731 arranges for automatic linking of the OpenMP runtime library
732 (@ref{Runtime Library Routines}).
734 A complete description of all OpenMP directives accepted may be found in
735 the @uref{http://www.openmp.org, OpenMP Application Program Interface} manual,
739 @c ---------------------------------------------------------------------
740 @c OpenMP Runtime Library Routines
741 @c ---------------------------------------------------------------------
743 @node Runtime Library Routines
744 @chapter OpenMP Runtime Library Routines
746 The runtime routines described here are defined by Section 3 of the OpenMP
747 specification in version 4.0.  The routines are structured in the following
748 three groups:
751 Control threads, processors and the parallel environment. They have C
752 linkage, and do not throw exceptions.
754 * omp_get_active_level:: Number of active parallel regions
755 * omp_get_ancestor_thread_num:: Ancestor thread ID
756 * omp_get_cancellation:: Whether cancellation support is enabled
757 * omp_get_default_device:: Get the default device for target regions
758 * omp_get_dynamic:: Dynamic teams setting
759 * omp_get_level:: Number of parallel regions
760 * omp_get_max_active_levels:: Maximum number of active regions
761 * omp_get_max_threads:: Maximum number of threads of parallel region
762 * omp_get_nested:: Nested parallel regions
763 * omp_get_num_devices:: Number of target devices
764 * omp_get_num_procs:: Number of processors online
765 * omp_get_num_teams:: Number of teams
766 * omp_get_num_threads:: Size of the active team
767 * omp_get_proc_bind::          Whether threads may be moved between CPUs
768 * omp_get_schedule:: Obtain the runtime scheduling method
769 * omp_get_team_num:: Get team number
770 * omp_get_team_size:: Number of threads in a team
771 * omp_get_thread_limit:: Maximum number of threads
772 * omp_get_thread_num:: Current thread ID
773 * omp_in_parallel:: Whether a parallel region is active
774 * omp_in_final:: Whether in final or included task region
775 * omp_is_initial_device:: Whether executing on the host device
776 * omp_set_default_device:: Set the default device for target regions
777 * omp_set_dynamic:: Enable/disable dynamic teams
778 * omp_set_max_active_levels:: Limits the number of active parallel regions
779 * omp_set_nested:: Enable/disable nested parallel regions
780 * omp_set_num_threads:: Set upper team size limit
781 * omp_set_schedule:: Set the runtime scheduling method
783 Initialize, set, test, unset and destroy simple and nested locks.
785 * omp_init_lock:: Initialize simple lock
786 * omp_set_lock:: Wait for and set simple lock
787 * omp_test_lock:: Test and set simple lock if available
788 * omp_unset_lock:: Unset simple lock
789 * omp_destroy_lock:: Destroy simple lock
790 * omp_init_nest_lock:: Initialize nested lock
791 * omp_set_nest_lock:: Wait for and set simple lock
792 * omp_test_nest_lock:: Test and set nested lock if available
793 * omp_unset_nest_lock:: Unset nested lock
794 * omp_destroy_nest_lock:: Destroy nested lock
796 Portable, thread-based, wall clock timer.
798 * omp_get_wtick:: Get timer precision.
799 * omp_get_wtime:: Elapsed wall clock time.
804 @node omp_get_active_level
805 @section @code{omp_get_active_level} -- Number of active parallel regions
807 @item @emph{Description}:
808 This function returns the nesting level of the active parallel blocks
809 that enclose the point of the call.
812 @multitable @columnfractions .20 .80
813 @item @emph{Prototype}: @tab @code{int omp_get_active_level(void);}
816 @item @emph{Fortran}:
817 @multitable @columnfractions .20 .80
818 @item @emph{Interface}: @tab @code{integer function omp_get_active_level()}
821 @item @emph{See also}:
822 @ref{omp_get_level}, @ref{omp_get_max_active_levels}, @ref{omp_set_max_active_levels}
824 @item @emph{Reference}:
825 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.2.20.
830 @node omp_get_ancestor_thread_num
831 @section @code{omp_get_ancestor_thread_num} -- Ancestor thread ID
833 @item @emph{Description}:
834 This function returns the thread identification number for the given
835 nesting level of the current thread. For values of @var{level} outside
836 the range zero to @code{omp_get_level}, -1 is returned; if @var{level} is
837 @code{omp_get_level}, the result is identical to @code{omp_get_thread_num}.
840 @multitable @columnfractions .20 .80
841 @item @emph{Prototype}: @tab @code{int omp_get_ancestor_thread_num(int level);}
844 @item @emph{Fortran}:
845 @multitable @columnfractions .20 .80
846 @item @emph{Interface}: @tab @code{integer function omp_get_ancestor_thread_num(level)}
847 @item @tab @code{integer level}
850 @item @emph{See also}:
851 @ref{omp_get_level}, @ref{omp_get_thread_num}, @ref{omp_get_team_size}
853 @item @emph{Reference}:
854 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.2.18.
859 @node omp_get_cancellation
860 @section @code{omp_get_cancellation} -- Whether cancellation support is enabled
862 @item @emph{Description}:
863 This function returns @code{true} if cancellation is activated, @code{false}
864 otherwise. Here, @code{true} and @code{false} represent their language-specific
865 counterparts. Unless @env{OMP_CANCELLATION} is set true, cancellations are
866 deactivated.
869 @multitable @columnfractions .20 .80
870 @item @emph{Prototype}: @tab @code{int omp_get_cancellation(void);}
873 @item @emph{Fortran}:
874 @multitable @columnfractions .20 .80
875 @item @emph{Interface}: @tab @code{logical function omp_get_cancellation()}
878 @item @emph{See also}:
879 @ref{OMP_CANCELLATION}
881 @item @emph{Reference}:
882 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.2.9.
887 @node omp_get_default_device
888 @section @code{omp_get_default_device} -- Get the default device for target regions
890 @item @emph{Description}:
891 Get the default device for target regions without a device clause.
894 @multitable @columnfractions .20 .80
895 @item @emph{Prototype}: @tab @code{int omp_get_default_device(void);}
898 @item @emph{Fortran}:
899 @multitable @columnfractions .20 .80
900 @item @emph{Interface}: @tab @code{integer function omp_get_default_device()}
903 @item @emph{See also}:
904 @ref{OMP_DEFAULT_DEVICE}, @ref{omp_set_default_device}
906 @item @emph{Reference}:
907 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.2.24.
912 @node omp_get_dynamic
913 @section @code{omp_get_dynamic} -- Dynamic teams setting
915 @item @emph{Description}:
916 This function returns @code{true} if enabled, @code{false} otherwise.
917 Here, @code{true} and @code{false} represent their language-specific
920 The dynamic team setting may be initialized at startup by the
921 @env{OMP_DYNAMIC} environment variable or at runtime using
922 @code{omp_set_dynamic}. If undefined, dynamic adjustment is
923 disabled by default.
926 @multitable @columnfractions .20 .80
927 @item @emph{Prototype}: @tab @code{int omp_get_dynamic(void);}
930 @item @emph{Fortran}:
931 @multitable @columnfractions .20 .80
932 @item @emph{Interface}: @tab @code{logical function omp_get_dynamic()}
935 @item @emph{See also}:
936 @ref{omp_set_dynamic}, @ref{OMP_DYNAMIC}
938 @item @emph{Reference}:
939 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.2.8.
945 @section @code{omp_get_level} -- Obtain the current nesting level
947 @item @emph{Description}:
948 This function returns the nesting level of the parallel blocks
949 that enclose the point of the call.
952 @multitable @columnfractions .20 .80
953 @item @emph{Prototype}: @tab @code{int omp_get_level(void);}
956 @item @emph{Fortran}:
957 @multitable @columnfractions .20 .80
958 @item @emph{Interface}: @tab @code{integer function omp_get_level()}
961 @item @emph{See also}:
962 @ref{omp_get_active_level}
964 @item @emph{Reference}:
965 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.2.17.
970 @node omp_get_max_active_levels
971 @section @code{omp_get_max_active_levels} -- Maximum number of active regions
973 @item @emph{Description}:
974 This function obtains the maximum allowed number of nested, active parallel regions.
977 @multitable @columnfractions .20 .80
978 @item @emph{Prototype}: @tab @code{int omp_get_max_active_levels(void);}
981 @item @emph{Fortran}:
982 @multitable @columnfractions .20 .80
983 @item @emph{Interface}: @tab @code{integer function omp_get_max_active_levels()}
986 @item @emph{See also}:
987 @ref{omp_set_max_active_levels}, @ref{omp_get_active_level}
989 @item @emph{Reference}:
990 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.2.16.
995 @node omp_get_max_threads
996 @section @code{omp_get_max_threads} -- Maximum number of threads of parallel region
998 @item @emph{Description}:
999 Return the maximum number of threads used for a parallel region
1000 that does not use the clause @code{num_threads}.
1003 @multitable @columnfractions .20 .80
1004 @item @emph{Prototype}: @tab @code{int omp_get_max_threads(void);}
1007 @item @emph{Fortran}:
1008 @multitable @columnfractions .20 .80
1009 @item @emph{Interface}: @tab @code{integer function omp_get_max_threads()}
1012 @item @emph{See also}:
1013 @ref{omp_set_num_threads}, @ref{omp_set_dynamic}, @ref{omp_get_thread_limit}
1015 @item @emph{Reference}:
1016 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.2.3.
1021 @node omp_get_nested
1022 @section @code{omp_get_nested} -- Nested parallel regions
1024 @item @emph{Description}:
1025 This function returns @code{true} if nested parallel regions are
1026 enabled, @code{false} otherwise. Here, @code{true} and @code{false}
1027 represent their language-specific counterparts.
1029 Nested parallel regions may be initialized at startup by the
1030 @env{OMP_NESTED} environment variable or at runtime using
1031 @code{omp_set_nested}. If undefined, nested parallel regions are
1032 disabled by default.
1035 @multitable @columnfractions .20 .80
1036 @item @emph{Prototype}: @tab @code{int omp_get_nested(void);}
1039 @item @emph{Fortran}:
1040 @multitable @columnfractions .20 .80
1041 @item @emph{Interface}: @tab @code{logical function omp_get_nested()}
1044 @item @emph{See also}:
1045 @ref{omp_set_nested}, @ref{OMP_NESTED}
1047 @item @emph{Reference}:
1048 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.2.11.
1053 @node omp_get_num_devices
1054 @section @code{omp_get_num_devices} -- Number of target devices
1056 @item @emph{Description}:
1057 Returns the number of target devices.
1060 @multitable @columnfractions .20 .80
1061 @item @emph{Prototype}: @tab @code{int omp_get_num_devices(void);}
1064 @item @emph{Fortran}:
1065 @multitable @columnfractions .20 .80
1066 @item @emph{Interface}: @tab @code{integer function omp_get_num_devices()}
1069 @item @emph{Reference}:
1070 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.2.25.
1075 @node omp_get_num_procs
1076 @section @code{omp_get_num_procs} -- Number of processors online
1078 @item @emph{Description}:
1079 Returns the number of processors online on the current device.
1082 @multitable @columnfractions .20 .80
1083 @item @emph{Prototype}: @tab @code{int omp_get_num_procs(void);}
1086 @item @emph{Fortran}:
1087 @multitable @columnfractions .20 .80
1088 @item @emph{Interface}: @tab @code{integer function omp_get_num_procs()}
1091 @item @emph{Reference}:
1092 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.2.5.
1097 @node omp_get_num_teams
1098 @section @code{omp_get_num_teams} -- Number of teams
1100 @item @emph{Description}:
1101 Returns the number of teams in the current teams region.
1104 @multitable @columnfractions .20 .80
1105 @item @emph{Prototype}: @tab @code{int omp_get_num_teams(void);}
1108 @item @emph{Fortran}:
1109 @multitable @columnfractions .20 .80
1110 @item @emph{Interface}: @tab @code{integer function omp_get_num_teams()}
1113 @item @emph{Reference}:
1114 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.2.26.
1119 @node omp_get_num_threads
1120 @section @code{omp_get_num_threads} -- Size of the active team
1122 @item @emph{Description}:
1123 Returns the number of threads in the current team. In a sequential section of
1124 the program @code{omp_get_num_threads} returns 1.
1126 The default team size may be initialized at startup by the
1127 @env{OMP_NUM_THREADS} environment variable. At runtime, the size
1128 of the current team may be set either by the @code{num_threads}
1129 clause or by @code{omp_set_num_threads}. If none of the above were
1130 used to define a specific value and @env{OMP_DYNAMIC} is disabled,
1131 one thread per CPU online is used.
1134 @multitable @columnfractions .20 .80
1135 @item @emph{Prototype}: @tab @code{int omp_get_num_threads(void);}
1138 @item @emph{Fortran}:
1139 @multitable @columnfractions .20 .80
1140 @item @emph{Interface}: @tab @code{integer function omp_get_num_threads()}
1143 @item @emph{See also}:
1144 @ref{omp_get_max_threads}, @ref{omp_set_num_threads}, @ref{OMP_NUM_THREADS}
1146 @item @emph{Reference}:
1147 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.2.2.
1152 @node omp_get_proc_bind
1153 @section @code{omp_get_proc_bind} -- Whether threads may be moved between CPUs
1155 @item @emph{Description}:
1156 This function returns the currently active thread affinity policy, which is
1157 set via @env{OMP_PROC_BIND}. Possible values are @code{omp_proc_bind_false},
1158 @code{omp_proc_bind_true}, @code{omp_proc_bind_master},
1159 @code{omp_proc_bind_close} and @code{omp_proc_bind_spread}.
1162 @multitable @columnfractions .20 .80
1163 @item @emph{Prototype}: @tab @code{omp_proc_bind_t omp_get_proc_bind(void);}
1166 @item @emph{Fortran}:
1167 @multitable @columnfractions .20 .80
1168 @item @emph{Interface}: @tab @code{integer(kind=omp_proc_bind_kind) function omp_get_proc_bind()}
1171 @item @emph{See also}:
1172 @ref{OMP_PROC_BIND}, @ref{OMP_PLACES}, @ref{GOMP_CPU_AFFINITY},
1174 @item @emph{Reference}:
1175 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.2.22.
1180 @node omp_get_schedule
1181 @section @code{omp_get_schedule} -- Obtain the runtime scheduling method
1183 @item @emph{Description}:
1184 Obtain the runtime scheduling method. The @var{kind} argument will be
1185 set to the value @code{omp_sched_static}, @code{omp_sched_dynamic},
1186 @code{omp_sched_guided} or @code{omp_sched_auto}. The second argument,
1187 @var{modifier}, is set to the chunk size.
1190 @multitable @columnfractions .20 .80
1191 @item @emph{Prototype}: @tab @code{void omp_get_schedule(omp_sched_t *kind, int *modifier);}
1194 @item @emph{Fortran}:
1195 @multitable @columnfractions .20 .80
1196 @item @emph{Interface}: @tab @code{subroutine omp_get_schedule(kind, modifier)}
1197 @item @tab @code{integer(kind=omp_sched_kind) kind}
1198 @item @tab @code{integer modifier}
1201 @item @emph{See also}:
1202 @ref{omp_set_schedule}, @ref{OMP_SCHEDULE}
1204 @item @emph{Reference}:
1205 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.2.13.
1210 @node omp_get_team_num
1211 @section @code{omp_get_team_num} -- Get team number
1213 @item @emph{Description}:
1214 Returns the team number of the calling thread.
1217 @multitable @columnfractions .20 .80
1218 @item @emph{Prototype}: @tab @code{int omp_get_team_num(void);}
1221 @item @emph{Fortran}:
1222 @multitable @columnfractions .20 .80
1223 @item @emph{Interface}: @tab @code{integer function omp_get_team_num()}
1226 @item @emph{Reference}:
1227 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.2.27.
1232 @node omp_get_team_size
1233 @section @code{omp_get_team_size} -- Number of threads in a team
1235 @item @emph{Description}:
1236 This function returns the number of threads in a thread team to which
1237 either the current thread or one of its ancestors belongs. For values of @var{level}
1238 outside the range zero to @code{omp_get_level}, -1 is returned; if @var{level} is zero,
1239 1 is returned, and if @var{level} equals @code{omp_get_level}, the result is identical
1240 to @code{omp_get_num_threads}.
1243 @multitable @columnfractions .20 .80
1244 @item @emph{Prototype}: @tab @code{int omp_get_team_size(int level);}
1247 @item @emph{Fortran}:
1248 @multitable @columnfractions .20 .80
1249 @item @emph{Interface}: @tab @code{integer function omp_get_team_size(level)}
1250 @item @tab @code{integer level}
1253 @item @emph{See also}:
1254 @ref{omp_get_num_threads}, @ref{omp_get_level}, @ref{omp_get_ancestor_thread_num}
1256 @item @emph{Reference}:
1257 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.2.19.
1262 @node omp_get_thread_limit
1263 @section @code{omp_get_thread_limit} -- Maximum number of threads
1265 @item @emph{Description}:
1266 Returns the maximum number of threads the program may use.
1269 @multitable @columnfractions .20 .80
1270 @item @emph{Prototype}: @tab @code{int omp_get_thread_limit(void);}
1273 @item @emph{Fortran}:
1274 @multitable @columnfractions .20 .80
1275 @item @emph{Interface}: @tab @code{integer function omp_get_thread_limit()}
1278 @item @emph{See also}:
1279 @ref{omp_get_max_threads}, @ref{OMP_THREAD_LIMIT}
1281 @item @emph{Reference}:
1282 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.2.14.
1287 @node omp_get_thread_num
1288 @section @code{omp_get_thread_num} -- Current thread ID
1290 @item @emph{Description}:
1291 Returns a unique thread identification number within the current team.
1292 In sequential parts of the program, @code{omp_get_thread_num}
1293 always returns 0. In parallel regions the return value varies
1294 from 0 to @code{omp_get_num_threads}-1 inclusive. The return
1295 value of the master thread of a team is always 0.
1298 @multitable @columnfractions .20 .80
1299 @item @emph{Prototype}: @tab @code{int omp_get_thread_num(void);}
1302 @item @emph{Fortran}:
1303 @multitable @columnfractions .20 .80
1304 @item @emph{Interface}: @tab @code{integer function omp_get_thread_num()}
1307 @item @emph{See also}:
1308 @ref{omp_get_num_threads}, @ref{omp_get_ancestor_thread_num}
1310 @item @emph{Reference}:
1311 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.2.4.
1316 @node omp_in_parallel
1317 @section @code{omp_in_parallel} -- Whether a parallel region is active
1319 @item @emph{Description}:
1320 This function returns @code{true} if currently running in parallel,
1321 @code{false} otherwise. Here, @code{true} and @code{false} represent
1322 their language-specific counterparts.
1325 @multitable @columnfractions .20 .80
1326 @item @emph{Prototype}: @tab @code{int omp_in_parallel(void);}
1329 @item @emph{Fortran}:
1330 @multitable @columnfractions .20 .80
1331 @item @emph{Interface}: @tab @code{logical function omp_in_parallel()}
1334 @item @emph{Reference}:
1335 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.2.6.
1340 @section @code{omp_in_final} -- Whether in final or included task region
1342 @item @emph{Description}:
1343 This function returns @code{true} if currently running in a final
1344 or included task region, @code{false} otherwise. Here, @code{true}
1345 and @code{false} represent their language-specific counterparts.
1348 @multitable @columnfractions .20 .80
1349 @item @emph{Prototype}: @tab @code{int omp_in_final(void);}
1352 @item @emph{Fortran}:
1353 @multitable @columnfractions .20 .80
1354 @item @emph{Interface}: @tab @code{logical function omp_in_final()}
1357 @item @emph{Reference}:
1358 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.2.21.
1363 @node omp_is_initial_device
1364 @section @code{omp_is_initial_device} -- Whether executing on the host device
1366 @item @emph{Description}:
1367 This function returns @code{true} if currently running on the host device,
1368 @code{false} otherwise. Here, @code{true} and @code{false} represent
1369 their language-specific counterparts.
1372 @multitable @columnfractions .20 .80
1373 @item @emph{Prototype}: @tab @code{int omp_is_initial_device(void);}
1376 @item @emph{Fortran}:
1377 @multitable @columnfractions .20 .80
1378 @item @emph{Interface}: @tab @code{logical function omp_is_initial_device()}
1381 @item @emph{Reference}:
1382 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.2.28.
1387 @node omp_set_default_device
1388 @section @code{omp_set_default_device} -- Set the default device for target regions
1390 @item @emph{Description}:
1391 Set the default device for target regions without device clause. The argument
1392 shall be a nonnegative device number.
1395 @multitable @columnfractions .20 .80
1396 @item @emph{Prototype}: @tab @code{void omp_set_default_device(int device_num);}
1399 @item @emph{Fortran}:
1400 @multitable @columnfractions .20 .80
1401 @item @emph{Interface}: @tab @code{subroutine omp_set_default_device(device_num)}
1402 @item @tab @code{integer device_num}
1405 @item @emph{See also}:
1406 @ref{OMP_DEFAULT_DEVICE}, @ref{omp_get_default_device}
1408 @item @emph{Reference}:
1409 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.2.23.
1414 @node omp_set_dynamic
1415 @section @code{omp_set_dynamic} -- Enable/disable dynamic teams
1417 @item @emph{Description}:
1418 Enable or disable the dynamic adjustment of the number of threads
1419 within a team. The function takes the language-specific equivalent
1420 of @code{true} and @code{false}, where @code{true} enables dynamic
1421 adjustment of team sizes and @code{false} disables it.
1424 @multitable @columnfractions .20 .80
1425 @item @emph{Prototype}: @tab @code{void omp_set_dynamic(int dynamic_threads);}
1428 @item @emph{Fortran}:
1429 @multitable @columnfractions .20 .80
1430 @item @emph{Interface}: @tab @code{subroutine omp_set_dynamic(dynamic_threads)}
1431 @item @tab @code{logical, intent(in) :: dynamic_threads}
1434 @item @emph{See also}:
1435 @ref{OMP_DYNAMIC}, @ref{omp_get_dynamic}
1437 @item @emph{Reference}:
1438 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.2.7.
1443 @node omp_set_max_active_levels
1444 @section @code{omp_set_max_active_levels} -- Limits the number of active parallel regions
1446 @item @emph{Description}:
1447 This function limits the maximum allowed number of nested, active parallel regions.
1451 @multitable @columnfractions .20 .80
1452 @item @emph{Prototype}: @tab @code{void omp_set_max_active_levels(int max_levels);}
1455 @item @emph{Fortran}:
1456 @multitable @columnfractions .20 .80
1457 @item @emph{Interface}: @tab @code{subroutine omp_set_max_active_levels(max_levels)}
1458 @item @tab @code{integer max_levels}
1461 @item @emph{See also}:
1462 @ref{omp_get_max_active_levels}, @ref{omp_get_active_level}
1464 @item @emph{Reference}:
1465 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.2.15.
1470 @node omp_set_nested
1471 @section @code{omp_set_nested} -- Enable/disable nested parallel regions
1473 @item @emph{Description}:
1474 Enable or disable nested parallel regions, i.e., whether team members
1475 are allowed to create new teams. The function takes the language-specific
1476 equivalent of @code{true} and @code{false}, where @code{true} enables
1477 nested parallel regions and @code{false} disables them.
1480 @multitable @columnfractions .20 .80
1481 @item @emph{Prototype}: @tab @code{void omp_set_nested(int nested);}
1484 @item @emph{Fortran}:
1485 @multitable @columnfractions .20 .80
1486 @item @emph{Interface}: @tab @code{subroutine omp_set_nested(nested)}
1487 @item @tab @code{logical, intent(in) :: nested}
1490 @item @emph{See also}:
1491 @ref{OMP_NESTED}, @ref{omp_get_nested}
1493 @item @emph{Reference}:
1494 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.2.10.
1499 @node omp_set_num_threads
1500 @section @code{omp_set_num_threads} -- Set upper team size limit
1502 @item @emph{Description}:
1503 Specifies the number of threads used by default in subsequent parallel
1504 sections, if those do not specify a @code{num_threads} clause. The
1505 argument of @code{omp_set_num_threads} shall be a positive integer.
1508 @multitable @columnfractions .20 .80
1509 @item @emph{Prototype}: @tab @code{void omp_set_num_threads(int num_threads);}
1512 @item @emph{Fortran}:
1513 @multitable @columnfractions .20 .80
1514 @item @emph{Interface}: @tab @code{subroutine omp_set_num_threads(num_threads)}
1515 @item @tab @code{integer, intent(in) :: num_threads}
1518 @item @emph{See also}:
1519 @ref{OMP_NUM_THREADS}, @ref{omp_get_num_threads}, @ref{omp_get_max_threads}
1521 @item @emph{Reference}:
1522 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.2.1.
1527 @node omp_set_schedule
1528 @section @code{omp_set_schedule} -- Set the runtime scheduling method
1530 @item @emph{Description}:
1531 Sets the runtime scheduling method. The @var{kind} argument can have the
1532 value @code{omp_sched_static}, @code{omp_sched_dynamic},
1533 @code{omp_sched_guided} or @code{omp_sched_auto}. Except for
1534 @code{omp_sched_auto}, the chunk size is set to the value of
1535 @var{modifier} if positive, or to the default value if zero or negative.
1536 For @code{omp_sched_auto} the @var{modifier} argument is ignored.
1539 @multitable @columnfractions .20 .80
1540 @item @emph{Prototype}: @tab @code{void omp_set_schedule(omp_sched_t kind, int modifier);}
1543 @item @emph{Fortran}:
1544 @multitable @columnfractions .20 .80
1545 @item @emph{Interface}: @tab @code{subroutine omp_set_schedule(kind, modifier)}
1546 @item @tab @code{integer(kind=omp_sched_kind) kind}
1547 @item @tab @code{integer modifier}
1550 @item @emph{See also}:
1551 @ref{omp_get_schedule}
1554 @item @emph{Reference}:
1555 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.2.12.
1561 @section @code{omp_init_lock} -- Initialize simple lock
1563 @item @emph{Description}:
1564 Initialize a simple lock. After initialization, the lock is in an unlocked state.
1568 @multitable @columnfractions .20 .80
1569 @item @emph{Prototype}: @tab @code{void omp_init_lock(omp_lock_t *lock);}
1572 @item @emph{Fortran}:
1573 @multitable @columnfractions .20 .80
1574 @item @emph{Interface}: @tab @code{subroutine omp_init_lock(svar)}
1575 @item @tab @code{integer(omp_lock_kind), intent(out) :: svar}
1578 @item @emph{See also}:
1579 @ref{omp_destroy_lock}
1581 @item @emph{Reference}:
1582 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.3.1.
1588 @section @code{omp_set_lock} -- Wait for and set simple lock
1590 @item @emph{Description}:
1591 Before setting a simple lock, the lock variable must be initialized by
1592 @code{omp_init_lock}. The calling thread is blocked until the lock
1593 is available. If the lock is already held by the current thread, a deadlock occurs.
1597 @multitable @columnfractions .20 .80
1598 @item @emph{Prototype}: @tab @code{void omp_set_lock(omp_lock_t *lock);}
1601 @item @emph{Fortran}:
1602 @multitable @columnfractions .20 .80
1603 @item @emph{Interface}: @tab @code{subroutine omp_set_lock(svar)}
1604 @item @tab @code{integer(omp_lock_kind), intent(inout) :: svar}
1607 @item @emph{See also}:
1608 @ref{omp_init_lock}, @ref{omp_test_lock}, @ref{omp_unset_lock}
1610 @item @emph{Reference}:
1611 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.3.3.
1617 @section @code{omp_test_lock} -- Test and set simple lock if available
1619 @item @emph{Description}:
1620 Before setting a simple lock, the lock variable must be initialized by
1621 @code{omp_init_lock}. Contrary to @code{omp_set_lock}, @code{omp_test_lock}
1622 does not block if the lock is not available. This function returns
1623 @code{true} upon success, @code{false} otherwise. Here, @code{true} and
1624 @code{false} represent their language-specific counterparts.
1627 @multitable @columnfractions .20 .80
1628 @item @emph{Prototype}: @tab @code{int omp_test_lock(omp_lock_t *lock);}
1631 @item @emph{Fortran}:
1632 @multitable @columnfractions .20 .80
1633 @item @emph{Interface}: @tab @code{logical function omp_test_lock(svar)}
1634 @item @tab @code{integer(omp_lock_kind), intent(inout) :: svar}
1637 @item @emph{See also}:
1638 @ref{omp_init_lock}, @ref{omp_set_lock}, @ref{omp_unset_lock}
1640 @item @emph{Reference}:
1641 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.3.5.
1646 @node omp_unset_lock
1647 @section @code{omp_unset_lock} -- Unset simple lock
1649 @item @emph{Description}:
1650 A simple lock about to be unset must have been locked by @code{omp_set_lock}
1651 or @code{omp_test_lock} before. In addition, the lock must be held by the
1652 thread calling @code{omp_unset_lock}. Then, the lock becomes unlocked. If one
1653 or more threads attempted to set the lock before, one of them is
1654 chosen to acquire it.
1657 @multitable @columnfractions .20 .80
1658 @item @emph{Prototype}: @tab @code{void omp_unset_lock(omp_lock_t *lock);}
1661 @item @emph{Fortran}:
1662 @multitable @columnfractions .20 .80
1663 @item @emph{Interface}: @tab @code{subroutine omp_unset_lock(svar)}
1664 @item @tab @code{integer(omp_lock_kind), intent(inout) :: svar}
1667 @item @emph{See also}:
1668 @ref{omp_set_lock}, @ref{omp_test_lock}
1670 @item @emph{Reference}:
1671 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.3.4.
1676 @node omp_destroy_lock
1677 @section @code{omp_destroy_lock} -- Destroy simple lock
1679 @item @emph{Description}:
1680 Destroy a simple lock. In order to be destroyed, a simple lock must be
1681 in the unlocked state.
1684 @multitable @columnfractions .20 .80
1685 @item @emph{Prototype}: @tab @code{void omp_destroy_lock(omp_lock_t *lock);}
1688 @item @emph{Fortran}:
1689 @multitable @columnfractions .20 .80
1690 @item @emph{Interface}: @tab @code{subroutine omp_destroy_lock(svar)}
1691 @item @tab @code{integer(omp_lock_kind), intent(inout) :: svar}
1694 @item @emph{See also}:
1697 @item @emph{Reference}:
1698 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.3.2.
1703 @node omp_init_nest_lock
1704 @section @code{omp_init_nest_lock} -- Initialize nested lock
1706 @item @emph{Description}:
1707 Initialize a nested lock. After initialization, the lock is in
1708 an unlocked state and the nesting count is set to zero.
1711 @multitable @columnfractions .20 .80
1712 @item @emph{Prototype}: @tab @code{void omp_init_nest_lock(omp_nest_lock_t *lock);}
1715 @item @emph{Fortran}:
1716 @multitable @columnfractions .20 .80
1717 @item @emph{Interface}: @tab @code{subroutine omp_init_nest_lock(nvar)}
1718 @item @tab @code{integer(omp_nest_lock_kind), intent(out) :: nvar}
1721 @item @emph{See also}:
1722 @ref{omp_destroy_nest_lock}
1724 @item @emph{Reference}:
1725 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.3.1.
1729 @node omp_set_nest_lock
1730 @section @code{omp_set_nest_lock} -- Wait for and set nested lock
1732 @item @emph{Description}:
1733 Before setting a nested lock, the lock variable must be initialized by
1734 @code{omp_init_nest_lock}. The calling thread is blocked until the lock
1735 is available. If the lock is already held by the current thread, the
1736 nesting count for the lock is incremented.
1739 @multitable @columnfractions .20 .80
1740 @item @emph{Prototype}: @tab @code{void omp_set_nest_lock(omp_nest_lock_t *lock);}
1743 @item @emph{Fortran}:
1744 @multitable @columnfractions .20 .80
1745 @item @emph{Interface}: @tab @code{subroutine omp_set_nest_lock(nvar)}
1746 @item @tab @code{integer(omp_nest_lock_kind), intent(inout) :: nvar}
1749 @item @emph{See also}:
1750 @ref{omp_init_nest_lock}, @ref{omp_unset_nest_lock}
1752 @item @emph{Reference}:
1753 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.3.3.
1758 @node omp_test_nest_lock
1759 @section @code{omp_test_nest_lock} -- Test and set nested lock if available
1761 @item @emph{Description}:
1762 Before setting a nested lock, the lock variable must be initialized by
1763 @code{omp_init_nest_lock}. Contrary to @code{omp_set_nest_lock},
1764 @code{omp_test_nest_lock} does not block if the lock is not available.
1765 If the lock is already held by the current thread, the new nesting count
1766 is returned. Otherwise, the return value equals zero.
1769 @multitable @columnfractions .20 .80
1770 @item @emph{Prototype}: @tab @code{int omp_test_nest_lock(omp_nest_lock_t *lock);}
1773 @item @emph{Fortran}:
1774 @multitable @columnfractions .20 .80
1775 @item @emph{Interface}: @tab @code{integer function omp_test_nest_lock(nvar)}
1776 @item @tab @code{integer(omp_nest_lock_kind), intent(inout) :: nvar}
1780 @item @emph{See also}:
1781 @ref{omp_init_nest_lock}, @ref{omp_set_nest_lock}, @ref{omp_unset_nest_lock}
1783 @item @emph{Reference}:
1784 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.3.5.
1789 @node omp_unset_nest_lock
1790 @section @code{omp_unset_nest_lock} -- Unset nested lock
1792 @item @emph{Description}:
1793 A nested lock about to be unset must have been locked by @code{omp_set_nest_lock}
1794 or @code{omp_test_nest_lock} before. In addition, the lock must be held by the
1795 thread calling @code{omp_unset_nest_lock}. If the nesting count drops to zero, the
1796 lock becomes unlocked. If one or more threads attempted to set the lock before,
1797 one of them is chosen to acquire it.
1800 @multitable @columnfractions .20 .80
1801 @item @emph{Prototype}: @tab @code{void omp_unset_nest_lock(omp_nest_lock_t *lock);}
1804 @item @emph{Fortran}:
1805 @multitable @columnfractions .20 .80
1806 @item @emph{Interface}: @tab @code{subroutine omp_unset_nest_lock(nvar)}
1807 @item @tab @code{integer(omp_nest_lock_kind), intent(inout) :: nvar}
1810 @item @emph{See also}:
1811 @ref{omp_set_nest_lock}
1813 @item @emph{Reference}:
1814 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.3.4.
1819 @node omp_destroy_nest_lock
1820 @section @code{omp_destroy_nest_lock} -- Destroy nested lock
1822 @item @emph{Description}:
1823 Destroy a nested lock. In order to be destroyed, a nested lock must be
1824 in the unlocked state and its nesting count must equal zero.
1827 @multitable @columnfractions .20 .80
1828 @item @emph{Prototype}: @tab @code{void omp_destroy_nest_lock(omp_nest_lock_t *lock);}
1831 @item @emph{Fortran}:
1832 @multitable @columnfractions .20 .80
1833 @item @emph{Interface}: @tab @code{subroutine omp_destroy_nest_lock(nvar)}
1834 @item @tab @code{integer(omp_nest_lock_kind), intent(inout) :: nvar}
1837 @item @emph{See also}:
1840 @item @emph{Reference}:
1841 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.3.2.
1847 @section @code{omp_get_wtick} -- Get timer precision
1849 @item @emph{Description}:
1850 Gets the timer precision, i.e., the number of seconds between two
1851 successive clock ticks.
1854 @multitable @columnfractions .20 .80
1855 @item @emph{Prototype}: @tab @code{double omp_get_wtick(void);}
1858 @item @emph{Fortran}:
1859 @multitable @columnfractions .20 .80
1860 @item @emph{Interface}: @tab @code{double precision function omp_get_wtick()}
1863 @item @emph{See also}:
1866 @item @emph{Reference}:
1867 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.4.2.
1873 @section @code{omp_get_wtime} -- Elapsed wall clock time
1875 @item @emph{Description}:
1876 Elapsed wall clock time in seconds. The time is measured per thread; no
1877 guarantee can be made that two distinct threads measure the same time.
1878 Time is measured from ``some time in the past'', which is an arbitrary time
1879 guaranteed not to change during the execution of the program.
1882 @multitable @columnfractions .20 .80
1883 @item @emph{Prototype}: @tab @code{double omp_get_wtime(void);}
1886 @item @emph{Fortran}:
1887 @multitable @columnfractions .20 .80
1888 @item @emph{Interface}: @tab @code{double precision function omp_get_wtime()}
1891 @item @emph{See also}:
1894 @item @emph{Reference}:
1895 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.4.1.
1900 @c ---------------------------------------------------------------------
1901 @c OpenMP Environment Variables
1902 @c ---------------------------------------------------------------------
1904 @node Environment Variables
1905 @chapter OpenMP Environment Variables
1907 The environment variables which begin with @env{OMP_} are defined by
1908 section 4 of the OpenMP specification in version 4.0, while those
1909 beginning with @env{GOMP_} are GNU extensions.
1912 * OMP_CANCELLATION:: Set whether cancellation is activated
1913 * OMP_DISPLAY_ENV:: Show OpenMP version and environment variables
1914 * OMP_DEFAULT_DEVICE:: Set the device used in target regions
1915 * OMP_DYNAMIC:: Dynamic adjustment of threads
1916 * OMP_MAX_ACTIVE_LEVELS:: Set the maximum number of nested parallel regions
1917 * OMP_NESTED:: Nested parallel regions
1918 * OMP_NUM_THREADS:: Specifies the number of threads to use
1919 * OMP_PROC_BIND:: Whether threads may be moved between CPUs
1920 * OMP_PLACES:: Specifies on which CPUs the threads should be placed
1921 * OMP_STACKSIZE:: Set default thread stack size
1922 * OMP_SCHEDULE:: How threads are scheduled
1923 * OMP_THREAD_LIMIT:: Set the maximum number of threads
1924 * OMP_WAIT_POLICY:: How waiting threads are handled
1925 * GOMP_CPU_AFFINITY:: Bind threads to specific CPUs
1926 * GOMP_DEBUG:: Enable debugging output
1927 * GOMP_STACKSIZE:: Set default thread stack size
1928 * GOMP_SPINCOUNT:: Set the busy-wait spin count
1932 @node OMP_CANCELLATION
1933 @section @env{OMP_CANCELLATION} -- Set whether cancellation is activated
1934 @cindex Environment Variable
1936 @item @emph{Description}:
1937 If set to @code{TRUE}, cancellation is activated. If set to @code{FALSE} or
1938 if unset, cancellation is disabled and the @code{cancel} construct is ignored.
1940 @item @emph{See also}:
1941 @ref{omp_get_cancellation}
1943 @item @emph{Reference}:
1944 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 4.11
1949 @node OMP_DISPLAY_ENV
1950 @section @env{OMP_DISPLAY_ENV} -- Show OpenMP version and environment variables
1951 @cindex Environment Variable
1953 @item @emph{Description}:
1954 If set to @code{TRUE}, the OpenMP version number and the values
1955 associated with the OpenMP environment variables are printed to @code{stderr}.
1956 If set to @code{VERBOSE}, it additionally shows the value of the environment
1957 variables which are GNU extensions. If undefined or set to @code{FALSE},
1958 this information will not be shown.
1961 @item @emph{Reference}:
1962 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 4.12
1967 @node OMP_DEFAULT_DEVICE
1968 @section @env{OMP_DEFAULT_DEVICE} -- Set the device used in target regions
1969 @cindex Environment Variable
1971 @item @emph{Description}:
1972 Set to choose the device which is used in a @code{target} region, unless the
1973 value is overridden by @code{omp_set_default_device} or by a @code{device}
1974 clause. The value shall be a nonnegative device number. If no device with
1975 the given device number exists, the code is executed on the host. If unset,
1976 device number 0 will be used.
1979 @item @emph{See also}:
1980 @ref{omp_get_default_device}, @ref{omp_set_default_device},
1982 @item @emph{Reference}:
1983 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 4.13
1989 @section @env{OMP_DYNAMIC} -- Dynamic adjustment of threads
1990 @cindex Environment Variable
1992 @item @emph{Description}:
1993 Enable or disable the dynamic adjustment of the number of threads
1994 within a team. The value of this environment variable shall be
1995 @code{TRUE} or @code{FALSE}. If undefined, dynamic adjustment is
1996 disabled by default.
1998 @item @emph{See also}:
1999 @ref{omp_set_dynamic}
2001 @item @emph{Reference}:
2002 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 4.3
2007 @node OMP_MAX_ACTIVE_LEVELS
2008 @section @env{OMP_MAX_ACTIVE_LEVELS} -- Set the maximum number of nested parallel regions
2009 @cindex Environment Variable
2011 @item @emph{Description}:
2012 Specifies the initial value for the maximum number of nested parallel
2013 regions. The value of this variable shall be a positive integer.
2014 If undefined, the number of active levels is unlimited.
2016 @item @emph{See also}:
2017 @ref{omp_set_max_active_levels}
2019 @item @emph{Reference}:
2020 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 4.9
2026 @section @env{OMP_NESTED} -- Nested parallel regions
2027 @cindex Environment Variable
2028 @cindex Implementation specific setting
2030 @item @emph{Description}:
2031 Enable or disable nested parallel regions, i.e., whether team members
2032 are allowed to create new teams. The value of this environment variable
2033 shall be @code{TRUE} or @code{FALSE}. If undefined, nested parallel
2034 regions are disabled by default.
2036 @item @emph{See also}:
2037 @ref{omp_set_nested}
2039 @item @emph{Reference}:
2040 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 4.6
2045 @node OMP_NUM_THREADS
2046 @section @env{OMP_NUM_THREADS} -- Specifies the number of threads to use
2047 @cindex Environment Variable
2048 @cindex Implementation specific setting
2050 @item @emph{Description}:
2051 Specifies the default number of threads to use in parallel regions. The
2052 value of this variable shall be a comma-separated list of positive integers;
2053 each value specifies the number of threads to use for the corresponding nesting
2054 level. If undefined, one thread per CPU is used.
2056 @item @emph{See also}:
2057 @ref{omp_set_num_threads}
2059 @item @emph{Reference}:
2060 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 4.2
2066 @section @env{OMP_PROC_BIND} -- Whether threads may be moved between CPUs
2067 @cindex Environment Variable
2069 @item @emph{Description}:
2070 Specifies whether threads may be moved between processors. If set to
2071 @code{TRUE}, OpenMP threads should not be moved; if set to @code{FALSE}
2072 they may be moved. Alternatively, a comma separated list with the
2073 values @code{MASTER}, @code{CLOSE} and @code{SPREAD} can be used to specify
2074 the thread affinity policy for the corresponding nesting level. With
2075 @code{MASTER} the worker threads are in the same place partition as the
2076 master thread. With @code{CLOSE} those are kept close to the master thread
2077 in contiguous place partitions. And with @code{SPREAD} a sparse distribution
2078 across the place partitions is used.
2080 When undefined, @env{OMP_PROC_BIND} defaults to @code{TRUE} when
2081 @env{OMP_PLACES} or @env{GOMP_CPU_AFFINITY} is set and @code{FALSE} otherwise.
2083 @item @emph{See also}:
2084 @ref{OMP_PLACES}, @ref{GOMP_CPU_AFFINITY}, @ref{omp_get_proc_bind}
2086 @item @emph{Reference}:
2087 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 4.4
2093 @section @env{OMP_PLACES} -- Specifies on which CPUs the theads should be placed
2094 @cindex Environment Variable
2096 @item @emph{Description}:
2097 The thread placement can be either specified using an abstract name or by an
2098 explicit list of the places. The abstract names @code{threads}, @code{cores}
2099 and @code{sockets} can be optionally followed by a positive number in
2100 parentheses, which denotes how many places shall be created. With
2101 @code{threads} each place corresponds to a single hardware thread; @code{cores}
2102 to a single core with the corresponding number of hardware threads; and with
2103 @code{sockets} the place corresponds to a single socket. The resulting
2104 placement can be shown by setting the @env{OMP_DISPLAY_ENV} environment variable to @code{VERBOSE}.
2107 Alternatively, the placement can be specified explicitly as a comma-separated
2108 list of places. A place is specified by a set of nonnegative numbers in curly
2109 braces, denoting the hardware threads. The hardware threads
2110 belonging to a place can either be specified as a comma-separated list of
2111 nonnegative thread numbers or using an interval. Multiple places can also be
2112 either specified by a comma-separated list of places or by an interval. To
2113 specify an interval, a colon followed by the count is placed after
2114 the hardware thread number or the place. Optionally, the length can be
2115 followed by a colon and the stride number -- otherwise a unit stride is
2116 assumed. For instance, the following three settings specify the same places list:
2117 @code{"@{0,1,2@}, @{3,4,5@}, @{6,7,8@}, @{9,10,11@}"};
2118 @code{"@{0:3@}, @{3:3@}, @{6:3@}, @{9:3@}"}; and @code{"@{0:3@}:4:3"}.
2120 If @env{OMP_PLACES} and @env{GOMP_CPU_AFFINITY} are unset and
2121 @env{OMP_PROC_BIND} is either unset or @code{false}, threads may be moved
2122 between CPUs following no placement policy.
2124 @item @emph{See also}:
2125 @ref{OMP_PROC_BIND}, @ref{GOMP_CPU_AFFINITY}, @ref{omp_get_proc_bind},
2126 @ref{OMP_DISPLAY_ENV}
2128 @item @emph{Reference}:
2129 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 4.5
2135 @section @env{OMP_STACKSIZE} -- Set default thread stack size
2136 @cindex Environment Variable
2138 @item @emph{Description}:
Set the default thread stack size in kilobytes, unless the number
is suffixed by @code{B}, @code{K}, @code{M} or @code{G}, in which
case the size is, respectively, in bytes, kilobytes, megabytes
or gigabytes. This is different from @code{pthread_attr_setstacksize},
which takes the number of bytes as an argument. If the stack size cannot
be set due to system constraints, an error is reported and the initial
stack size is left unchanged. If undefined, the stack size is system
dependent.

2148 @item @emph{Reference}:
2149 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 4.7
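The suffix handling can be sketched in C as follows (illustration only;
@code{stacksize_to_bytes} is a hypothetical helper, and the exact parsing in
libgomp may differ, for example regarding case sensitivity and whitespace):

```c
#include <assert.h>
#include <stdlib.h>

/* Illustration only: convert an OMP_STACKSIZE-style value to bytes.
   A bare number means kilobytes; a B, K, M or G suffix selects bytes,
   kilobytes, megabytes or gigabytes.  Returns 0 if the value cannot
   be parsed.  */
unsigned long long
stacksize_to_bytes (const char *str)
{
  char *end;
  unsigned long long val = strtoull (str, &end, 10);
  if (end == str)
    return 0;                           /* no number at all */
  switch (*end)
    {
    case '\0':                          /* no suffix: kilobytes */
    case 'K': return val << 10;
    case 'B': return val;
    case 'M': return val << 20;
    case 'G': return val << 30;
    default:  return 0;                 /* unknown suffix */
    }
}
```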
2155 @section @env{OMP_SCHEDULE} -- How threads are scheduled
2156 @cindex Environment Variable
2157 @cindex Implementation specific setting
2159 @item @emph{Description}:
Allows specifying the @code{schedule type} and @code{chunk size}.
The value of the variable shall have the form @code{type[,chunk]}, where
@code{type} is one of @code{static}, @code{dynamic}, @code{guided} or
@code{auto}. The optional @code{chunk} size shall be a positive integer.
If undefined, dynamic scheduling and a chunk size of 1 are used.

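The accepted format can be illustrated with a short C sketch (illustration
only; @code{parse_schedule} is a hypothetical helper and libgomp's real
parser differs in details such as whitespace and error handling):

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

enum sched_type { SCHED_STATIC, SCHED_DYNAMIC, SCHED_GUIDED,
                  SCHED_AUTO, SCHED_BAD };

/* Illustration only: parse a "type[,chunk]" value as accepted by
   OMP_SCHEDULE.  On an unset variable, fall back to dynamic
   scheduling with a chunk size of 1.  */
enum sched_type
parse_schedule (const char *env, int *chunk)
{
  char type[16];
  *chunk = 1;                               /* default chunk size */
  if (env == NULL || sscanf (env, "%15[a-z]", type) != 1)
    return SCHED_DYNAMIC;                   /* default schedule */
  sscanf (env, "%*[a-z],%d", chunk);        /* optional ",chunk" part */
  if (strcmp (type, "static") == 0)  return SCHED_STATIC;
  if (strcmp (type, "dynamic") == 0) return SCHED_DYNAMIC;
  if (strcmp (type, "guided") == 0)  return SCHED_GUIDED;
  if (strcmp (type, "auto") == 0)    return SCHED_AUTO;
  return SCHED_BAD;
}
```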
2166 @item @emph{See also}:
2167 @ref{omp_set_schedule}
2169 @item @emph{Reference}:
2170 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Sections 2.7.1 and 4.1
2175 @node OMP_THREAD_LIMIT
2176 @section @env{OMP_THREAD_LIMIT} -- Set the maximum number of threads
2177 @cindex Environment Variable
2179 @item @emph{Description}:
2180 Specifies the number of threads to use for the whole program. The
2181 value of this variable shall be a positive integer. If undefined,
2182 the number of threads is not limited.
2184 @item @emph{See also}:
2185 @ref{OMP_NUM_THREADS}, @ref{omp_get_thread_limit}
2187 @item @emph{Reference}:
2188 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 4.10
2193 @node OMP_WAIT_POLICY
2194 @section @env{OMP_WAIT_POLICY} -- How waiting threads are handled
2195 @cindex Environment Variable
2197 @item @emph{Description}:
Specifies whether waiting threads should be active or passive. If
the value is @code{PASSIVE}, waiting threads should not consume CPU
power while waiting; the value @code{ACTIVE} specifies that they
should. If undefined, threads wait actively for a short time
before waiting passively.

2204 @item @emph{See also}:
2205 @ref{GOMP_SPINCOUNT}
2207 @item @emph{Reference}:
2208 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 4.8
2213 @node GOMP_CPU_AFFINITY
2214 @section @env{GOMP_CPU_AFFINITY} -- Bind threads to specific CPUs
2215 @cindex Environment Variable
2217 @item @emph{Description}:
2218 Binds threads to specific CPUs. The variable should contain a space-separated
2219 or comma-separated list of CPUs. This list may contain different kinds of
2220 entries: either single CPU numbers in any order, a range of CPUs (M-N)
2221 or a range with some stride (M-N:S). CPU numbers are zero based. For example,
2222 @code{GOMP_CPU_AFFINITY="0 3 1-2 4-15:2"} will bind the initial thread
2223 to CPU 0, the second to CPU 3, the third to CPU 1, the fourth to
2224 CPU 2, the fifth to CPU 4, the sixth through tenth to CPUs 6, 8, 10, 12,
2225 and 14 respectively and then start assigning back from the beginning of
2226 the list. @code{GOMP_CPU_AFFINITY=0} binds all threads to CPU 0.
2228 There is no libgomp library routine to determine whether a CPU affinity
2229 specification is in effect. As a workaround, language-specific library
2230 functions, e.g., @code{getenv} in C or @code{GET_ENVIRONMENT_VARIABLE} in
2231 Fortran, may be used to query the setting of the @code{GOMP_CPU_AFFINITY}
2232 environment variable. A defined CPU affinity on startup cannot be changed
2233 or disabled during the runtime of the application.
If both @env{GOMP_CPU_AFFINITY} and @env{OMP_PROC_BIND} are set,
@env{OMP_PROC_BIND} has a higher precedence. If neither has been set,
or when @env{OMP_PROC_BIND} is set to @code{FALSE}, the host system
will handle the assignment of threads to CPUs.

2240 @item @emph{See also}:
2241 @ref{OMP_PLACES}, @ref{OMP_PROC_BIND}
2247 @section @env{GOMP_DEBUG} -- Enable debugging output
2248 @cindex Environment Variable
2250 @item @emph{Description}:
2251 Enable debugging output. The variable should be set to @code{0}
2252 (disabled, also the default if not set), or @code{1} (enabled).
2254 If enabled, some debugging output will be printed during execution.
2255 This is currently not specified in more detail, and subject to change.
2260 @node GOMP_STACKSIZE
2261 @section @env{GOMP_STACKSIZE} -- Set default thread stack size
2262 @cindex Environment Variable
2263 @cindex Implementation specific setting
2265 @item @emph{Description}:
Set the default thread stack size in kilobytes. This is different from
@code{pthread_attr_setstacksize}, which takes the number of bytes as an
argument. If the stack size cannot be set due to system constraints, an
error is reported and the initial stack size is left unchanged. If
undefined, the stack size is system dependent.

2272 @item @emph{See also}:
2275 @item @emph{Reference}:
2276 @uref{http://gcc.gnu.org/ml/gcc-patches/2006-06/msg00493.html,
2277 GCC Patches Mailinglist},
2278 @uref{http://gcc.gnu.org/ml/gcc-patches/2006-06/msg00496.html,
2279 GCC Patches Mailinglist}
2284 @node GOMP_SPINCOUNT
2285 @section @env{GOMP_SPINCOUNT} -- Set the busy-wait spin count
2286 @cindex Environment Variable
2287 @cindex Implementation specific setting
2289 @item @emph{Description}:
Determines how long a thread waits actively, consuming CPU power,
before waiting passively without consuming CPU power. The value may be
either @code{INFINITE} or @code{INFINITY} to always wait actively, or an
integer which gives the number of spins of the busy-wait loop. The
integer may optionally be followed by the following suffixes acting
as multiplication factors: @code{k} (kilo, thousand), @code{M} (mega,
million), @code{G} (giga, billion), or @code{T} (tera, trillion).
If undefined, 0 is used when @env{OMP_WAIT_POLICY} is @code{PASSIVE},
300,000 when @env{OMP_WAIT_POLICY} is undefined, and 30 billion when
@env{OMP_WAIT_POLICY} is @code{ACTIVE}. If there are more OpenMP
threads than available CPUs, 1000 and 100 spins are used when
@env{OMP_WAIT_POLICY} is @code{ACTIVE} or undefined, respectively,
unless @env{GOMP_SPINCOUNT} is lower or @env{OMP_WAIT_POLICY} is
@code{PASSIVE}.

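The defaults described above can be summarized in C (illustration only;
@code{default_spincount} is a hypothetical helper codifying the rules in
this section for the case where @env{GOMP_SPINCOUNT} itself is unset):

```c
#include <assert.h>
#include <string.h>

/* Illustration only: the default spin count when GOMP_SPINCOUNT is
   unset.  WAIT_POLICY is the value of OMP_WAIT_POLICY, or NULL if it
   is unset; OVERSUBSCRIBED is nonzero if there are more OpenMP
   threads than available CPUs.  */
unsigned long long
default_spincount (const char *wait_policy, int oversubscribed)
{
  if (wait_policy && strcmp (wait_policy, "PASSIVE") == 0)
    return 0;                                      /* never spin */
  if (wait_policy && strcmp (wait_policy, "ACTIVE") == 0)
    return oversubscribed ? 1000 : 30000000000ULL; /* 30 billion */
  /* OMP_WAIT_POLICY undefined.  */
  return oversubscribed ? 100 : 300000;
}
```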
2305 @item @emph{See also}:
2306 @ref{OMP_WAIT_POLICY}
2311 @c ---------------------------------------------------------------------
2313 @c ---------------------------------------------------------------------
2315 @node The libgomp ABI
2316 @chapter The libgomp ABI
2318 The following sections present notes on the external ABI as
2319 presented by libgomp. Only maintainers should need them.
2322 * Implementing MASTER construct::
2323 * Implementing CRITICAL construct::
2324 * Implementing ATOMIC construct::
2325 * Implementing FLUSH construct::
2326 * Implementing BARRIER construct::
2327 * Implementing THREADPRIVATE construct::
2328 * Implementing PRIVATE clause::
2329 * Implementing FIRSTPRIVATE LASTPRIVATE COPYIN and COPYPRIVATE clauses::
2330 * Implementing REDUCTION clause::
2331 * Implementing PARALLEL construct::
2332 * Implementing FOR construct::
2333 * Implementing ORDERED construct::
2334 * Implementing SECTIONS construct::
2335 * Implementing SINGLE construct::
2336 * Implementing OpenACC's PARALLEL construct::
2340 @node Implementing MASTER construct
2341 @section Implementing MASTER construct
2344 if (omp_get_thread_num () == 0)
Alternatively, we generate two copies of the parallel subfunction
and only include this in the version run by the master thread.
Surely this is not worthwhile though...

2354 @node Implementing CRITICAL construct
2355 @section Implementing CRITICAL construct
2357 Without a specified name,
2360 void GOMP_critical_start (void);
2361 void GOMP_critical_end (void);
so that we don't get COPY relocations from libgomp to the main
application.

With a specified name, use @code{omp_set_lock} and @code{omp_unset_lock}
with the name being transformed into a variable declared like
2371 omp_lock_t gomp_critical_user_<name> __attribute__((common))
Ideally the ABI would specify that all zero is a valid unlocked
state, and so we wouldn't need to initialize this at all.

2380 @node Implementing ATOMIC construct
2381 @section Implementing ATOMIC construct
2383 The target should implement the @code{__sync} builtins.
2385 Failing that we could add
2388 void GOMP_atomic_enter (void)
2389 void GOMP_atomic_exit (void)
2392 which reuses the regular lock code, but with yet another lock
2393 object private to the library.
2397 @node Implementing FLUSH construct
2398 @section Implementing FLUSH construct
2400 Expands to the @code{__sync_synchronize} builtin.
2404 @node Implementing BARRIER construct
2405 @section Implementing BARRIER construct
2408 void GOMP_barrier (void)
2412 @node Implementing THREADPRIVATE construct
2413 @section Implementing THREADPRIVATE construct
In @emph{most} cases we can map this directly to @code{__thread},
except that OMP allows constructors for C++ objects. We can either
refuse to support this (how often is it used?) or implement
something akin to @code{.ctors}.

2420 Even more ideally, this ctor feature is handled by extensions
2421 to the main pthreads library. Failing that, we can have a set
2422 of entry points to register ctor functions to be called.
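The direct mapping to @code{__thread} can be demonstrated with plain
pthreads (illustration only; no libgomp involvement, and the names are
made up for the example):

```c
#include <assert.h>
#include <pthread.h>

/* A THREADPRIVATE variable maps to a thread-local variable: every
   thread, including the main thread, gets its own copy.  */
__thread int tp_counter = 0;

static void *
worker (void *result)
{
  for (int i = 0; i < 1000; i++)
    tp_counter++;               /* touches only this thread's copy */
  *(int *) result = tp_counter;
  return NULL;
}

/* Run two workers; return the main thread's copy of tp_counter.  */
int
run_two_workers (int *a, int *b)
{
  pthread_t t1, t2;
  pthread_create (&t1, NULL, worker, a);
  pthread_create (&t2, NULL, worker, b);
  pthread_join (t1, NULL);
  pthread_join (t2, NULL);
  return tp_counter;            /* untouched in the main thread: 0 */
}
```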
2426 @node Implementing PRIVATE clause
2427 @section Implementing PRIVATE clause
2429 In association with a PARALLEL, or within the lexical extent
2430 of a PARALLEL block, the variable becomes a local variable in
2431 the parallel subfunction.
2433 In association with FOR or SECTIONS blocks, create a new
2434 automatic variable within the current function. This preserves
2435 the semantic of new variable creation.
2439 @node Implementing FIRSTPRIVATE LASTPRIVATE COPYIN and COPYPRIVATE clauses
2440 @section Implementing FIRSTPRIVATE LASTPRIVATE COPYIN and COPYPRIVATE clauses
2442 This seems simple enough for PARALLEL blocks. Create a private
2443 struct for communicating between the parent and subfunction.
In the parent, copy in values for scalars and "small" structs;
copy in addresses for other TREE_ADDRESSABLE types. In the
subfunction, copy the value into the local variable.

2448 It is not clear what to do with bare FOR or SECTION blocks.
2449 The only thing I can figure is that we do something like:
2452 #pragma omp for firstprivate(x) lastprivate(y)
2453 for (int i = 0; i < n; ++i)
2470 where the "x=x" and "y=y" assignments actually have different
2471 uids for the two variables, i.e. not something you could write
2472 directly in C. Presumably this only makes sense if the "outer"
2473 x and y are global variables.
2475 COPYPRIVATE would work the same way, except the structure
2476 broadcast would have to happen via SINGLE machinery instead.
2480 @node Implementing REDUCTION clause
2481 @section Implementing REDUCTION clause
2483 The private struct mentioned in the previous section should have
2484 a pointer to an array of the type of the variable, indexed by the
2485 thread's @var{team_id}. The thread stores its final value into the
2486 array, and after the barrier, the master thread iterates over the
2487 array to collect the values.
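A pthreads rendering of this scheme (illustration only; the real
implementation uses the team machinery and barriers described elsewhere,
and all names here are invented for the example):

```c
#include <assert.h>
#include <pthread.h>

#define NTHREADS 4
#define N 1000

/* Each thread stores its partial result into an array slot indexed
   by its team id; after the threads are done (here: joined, standing
   in for the barrier), the master iterates over the array to combine
   the values.  */
struct team_data { long partial[NTHREADS]; };
struct worker_arg { struct team_data *team; int team_id; };

static void *
reduction_worker (void *p)
{
  struct worker_arg *a = p;
  long sum = 0;
  for (long i = a->team_id; i < N; i += NTHREADS)  /* static split */
    sum += i;
  a->team->partial[a->team_id] = sum;              /* store final value */
  return NULL;
}

/* Sum 0 .. N-1 with an NTHREADS-way reduction.  */
long
parallel_sum (void)
{
  struct team_data team;
  struct worker_arg args[NTHREADS];
  pthread_t tid[NTHREADS];
  for (int i = 0; i < NTHREADS; i++)
    {
      args[i] = (struct worker_arg) { &team, i };
      pthread_create (&tid[i], NULL, reduction_worker, &args[i]);
    }
  long total = 0;
  for (int i = 0; i < NTHREADS; i++)
    {
      pthread_join (tid[i], NULL);
      total += team.partial[i];                    /* master collects */
    }
  return total;
}
```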
2490 @node Implementing PARALLEL construct
2491 @section Implementing PARALLEL construct
2494 #pragma omp parallel
2503 void subfunction (void *data)
2510 GOMP_parallel_start (subfunction, &data, num_threads);
2511 subfunction (&data);
2512 GOMP_parallel_end ();
2516 void GOMP_parallel_start (void (*fn)(void *), void *data, unsigned num_threads)
2519 The @var{FN} argument is the subfunction to be run in parallel.
2521 The @var{DATA} argument is a pointer to a structure used to
2522 communicate data in and out of the subfunction, as discussed
2523 above with respect to FIRSTPRIVATE et al.
The @var{NUM_THREADS} argument is 1 if an IF clause is present
and false, or the value of the NUM_THREADS clause, if present, or 0.

2529 The function needs to create the appropriate number of
2530 threads and/or launch them from the dock. It needs to
2531 create the team structure and assign team ids.
2534 void GOMP_parallel_end (void)
2537 Tears down the team and returns us to the previous @code{omp_in_parallel()} state.
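The calling convention can be mimicked with pthreads (illustration only;
@code{mock_parallel_start} and @code{mock_parallel_end} are stand-ins for
the example, not the real @code{GOMP_parallel_start} and
@code{GOMP_parallel_end}):

```c
#include <assert.h>
#include <pthread.h>

#define MAX_EXTRA 15

static pthread_t extra[MAX_EXTRA];
static unsigned nextra;

/* Start NUM_THREADS - 1 additional threads running FN (DATA); the
   caller then runs FN itself, mirroring the code emitted for a
   PARALLEL construct.  */
void
mock_parallel_start (void *(*fn) (void *), void *data,
                     unsigned num_threads)
{
  nextra = num_threads - 1;
  for (unsigned i = 0; i < nextra; i++)
    pthread_create (&extra[i], NULL, fn, data);
}

/* Tear down the team: join the additional threads.  */
void
mock_parallel_end (void)
{
  for (unsigned i = 0; i < nextra; i++)
    pthread_join (extra[i], NULL);
}

/* The subfunction; DATA is the communication struct.  */
struct shared { pthread_mutex_t lock; int hits; };

static void *
subfunction (void *p)
{
  struct shared *s = p;
  pthread_mutex_lock (&s->lock);
  s->hits++;
  pthread_mutex_unlock (&s->lock);
  return NULL;
}

/* The expansion of "#pragma omp parallel" over NUM_THREADS threads.  */
int
run_team (unsigned num_threads)
{
  struct shared s = { PTHREAD_MUTEX_INITIALIZER, 0 };
  mock_parallel_start (subfunction, &s, num_threads);
  subfunction (&s);             /* the master takes part too */
  mock_parallel_end ();
  return s.hits;
}
```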
2541 @node Implementing FOR construct
2542 @section Implementing FOR construct
2545 #pragma omp parallel for
2546 for (i = lb; i <= ub; i++)
2553 void subfunction (void *data)
2556 while (GOMP_loop_static_next (&_s0, &_e0))
2559 for (i = _s0; i < _e1; i++)
2562 GOMP_loop_end_nowait ();
2565 GOMP_parallel_loop_static (subfunction, NULL, 0, lb, ub+1, 1, 0);
2567 GOMP_parallel_end ();
2571 #pragma omp for schedule(runtime)
2572 for (i = 0; i < n; i++)
2581 if (GOMP_loop_runtime_start (0, n, 1, &_s0, &_e0))
for (i = _s0; i < _e0; i++)
@} while (GOMP_loop_runtime_next (&_s0, &_e0));
2591 Note that while it looks like there is trickiness to propagating
2592 a non-constant STEP, there isn't really. We're explicitly allowed
2593 to evaluate it as many times as we want, and any variables involved
2594 should automatically be handled as PRIVATE or SHARED like any other
2595 variables. So the expression should remain evaluable in the
2596 subfunction. We can also pull it into a local variable if we like,
but since it's supposed to remain unchanged, we don't have to.

2599 If we have SCHEDULE(STATIC), and no ORDERED, then we ought to be
2600 able to get away with no work-sharing context at all, since we can
2601 simply perform the arithmetic directly in each thread to divide up
the iterations, which would mean that we wouldn't need to call any
of these routines at all.

2605 There are separate routines for handling loops with an ORDERED
2606 clause. Bookkeeping for that is non-trivial...
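The arithmetic for the SCHEDULE(STATIC) case can be written down directly
(illustration only; @code{static_chunk} is a hypothetical helper, and the
real computation also has to handle chunk sizes):

```c
#include <assert.h>

/* Illustration only: with SCHEDULE(STATIC) and no chunk size, thread
   ID of NTHR threads computes its half-open iteration range
   [*S0, *E0) over N iterations directly -- no work-sharing context
   needed.  The first N % NTHR threads get one extra iteration.  */
void
static_chunk (long n, int nthr, int id, long *s0, long *e0)
{
  long q = n / nthr;
  long r = n % nthr;
  *s0 = q * id + (id < r ? id : r);
  *e0 = *s0 + q + (id < r ? 1 : 0);
}
```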
2610 @node Implementing ORDERED construct
2611 @section Implementing ORDERED construct
2614 void GOMP_ordered_start (void)
2615 void GOMP_ordered_end (void)
2620 @node Implementing SECTIONS construct
2621 @section Implementing SECTIONS construct
2626 #pragma omp sections
2640 for (i = GOMP_sections_start (3); i != 0; i = GOMP_sections_next ())
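The dispatch loop can be mocked in plain C (illustration only, and
single-threaded for clarity; the real @code{GOMP_sections_start} and
@code{GOMP_sections_next} hand out section numbers atomically across
the team):

```c
#include <assert.h>

static int next_section, last_section;

int mock_sections_next (void);

/* Illustration only: claim the work-sharing construct with COUNT
   sections and return the first section to run, or 0 if none.  */
int
mock_sections_start (int count)
{
  next_section = 1;
  last_section = count;
  return mock_sections_next ();
}

/* Return the next unassigned section in 1..COUNT, or 0 when done.  */
int
mock_sections_next (void)
{
  if (next_section > last_section)
    return 0;
  return next_section++;
}

/* The dispatch loop emitted for a SECTIONS construct with 3 sections;
   HITS records how often each section body ran.  */
void
run_sections (int *hits)
{
  for (int i = mock_sections_start (3); i != 0;
       i = mock_sections_next ())
    switch (i)
      {
      case 1: hits[0]++; break;     /* first  #pragma omp section */
      case 2: hits[1]++; break;     /* second #pragma omp section */
      case 3: hits[2]++; break;     /* third  #pragma omp section */
      }
}
```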
2657 @node Implementing SINGLE construct
2658 @section Implementing SINGLE construct
2672 if (GOMP_single_start ())
2680 #pragma omp single copyprivate(x)
2687 datap = GOMP_single_copy_start ();
2692 GOMP_single_copy_end (&data);
2701 @node Implementing OpenACC's PARALLEL construct
2702 @section Implementing OpenACC's PARALLEL construct
2705 void GOACC_parallel ()
2710 @c ---------------------------------------------------------------------
2712 @c ---------------------------------------------------------------------
2714 @node Reporting Bugs
2715 @chapter Reporting Bugs
2717 Bugs in the GNU Offloading and Multi Processing Runtime Library should
2718 be reported via @uref{http://gcc.gnu.org/bugzilla/, Bugzilla}. Please add
@code{openacc}, or @code{openmp}, or both to the keywords field in the bug
report, as appropriate.
2724 @c ---------------------------------------------------------------------
2725 @c GNU General Public License
2726 @c ---------------------------------------------------------------------
2728 @include gpl_v3.texi
2732 @c ---------------------------------------------------------------------
2733 @c GNU Free Documentation License
2734 @c ---------------------------------------------------------------------
2740 @c ---------------------------------------------------------------------
2741 @c Funding Free Software
2742 @c ---------------------------------------------------------------------
2744 @include funding.texi
2746 @c ---------------------------------------------------------------------
2748 @c ---------------------------------------------------------------------
2751 @unnumbered Library Index