1 \input texinfo @c -*-texinfo-*-
4 @setfilename libgomp.info
10 Copyright @copyright{} 2006-2014 Free Software Foundation, Inc.
12 Permission is granted to copy, distribute and/or modify this document
13 under the terms of the GNU Free Documentation License, Version 1.3 or
14 any later version published by the Free Software Foundation; with the
15 Invariant Sections being ``Funding Free Software'', the Front-Cover
16 texts being (a) (see below), and with the Back-Cover Texts being (b)
17 (see below). A copy of the license is included in the section entitled
18 ``GNU Free Documentation License''.
20 (a) The FSF's Front-Cover Text is:
24 (b) The FSF's Back-Cover Text is:
26 You have freedom to copy and modify this GNU Manual, like GNU
27 software. Copies published by the Free Software Foundation raise
28 funds for GNU development.
32 @dircategory GNU Libraries
34 * libgomp: (libgomp). GNU Offloading and Multi Processing Runtime Library
37 This manual documents libgomp, the GNU Offloading and Multi Processing
38 Runtime library. This is the GNU implementation of the OpenMP and
39 OpenACC APIs for parallel and accelerator programming in C/C++ and Fortran.
42 Published by the Free Software Foundation
43 51 Franklin Street, Fifth Floor
44 Boston, MA 02110-1301 USA
50 @setchapternewpage odd
53 @title The GNU OpenACC and OpenMP Implementation
55 @vskip 0pt plus 1filll
56 @comment For the @value{version-GCC} Version*
58 Published by the Free Software Foundation @*
59 51 Franklin Street, Fifth Floor@*
60 Boston, MA 02110-1301, USA@*
74 This manual documents the usage of libgomp, the GNU Offloading and
75 Multi Processing Runtime Library. This includes the GNU
76 implementation of the @uref{http://www.openmp.org, OpenMP} Application
77 Programming Interface (API) for multi-platform shared-memory parallel
78 programming in C/C++ and Fortran, and the GNU implementation of the
79 @uref{http://www.openacc.org/, OpenACC} Application Programming
80 Interface (API) for offloading of code to accelerator devices in C/C++ and Fortran.
83 Originally, libgomp implemented the GNU OpenMP Runtime Library. Based
84 on this, support for OpenACC and offloading (both OpenACC and OpenMP
85 4's target construct) has been added later on, and the library's name
86 changed to GNU Offloading and Multi Processing Runtime Library.
90 @comment When you add a new menu item, please keep the right hand
91 @comment aligned to the same column. Do not use tabs. This provides
92 @comment better formatting.
95 * Enabling OpenACC:: How to enable OpenACC for your applications.
97 * OpenACC Runtime Library Routines:: The OpenACC runtime application
98 programming interface.
99 * OpenACC Environment Variables:: Influencing OpenACC runtime behavior with
100 environment variables.
101 * OpenACC Library Interoperability:: OpenACC library interoperability with the
102 NVIDIA CUBLAS library.
103 * Enabling OpenMP:: How to enable OpenMP for your applications.
105 * OpenMP Runtime Library Routines: Runtime Library Routines.
106 The OpenMP runtime application programming interface.
108 * OpenMP Environment Variables: Environment Variables.
109 Influencing OpenMP runtime behavior with
110 environment variables.
111 * The libgomp ABI:: Notes on the external libgomp ABI.
112 * Reporting Bugs:: How to report bugs.
113 * Copying:: GNU general public license says how you
114 can copy and share libgomp.
115 * GNU Free Documentation License:: How you can copy and share this manual.
116 * Funding:: How to help assure continued work for free software.
118 * Library Index:: Index of this documentation.
123 @c ---------------------------------------------------------------------
125 @c ---------------------------------------------------------------------
127 @node Enabling OpenACC
128 @chapter Enabling OpenACC
130 To activate the OpenACC extensions for C/C++ and Fortran, the compile-time
131 flag @command{-fopenacc} must be specified. This enables the OpenACC directive
132 @code{#pragma acc} in C/C++ and @code{!$acc} directives in free form,
133 @code{c$acc}, @code{*$acc} and @code{!$acc} directives in fixed form,
134 @code{!$} conditional compilation sentinels in free form and @code{c$},
135 @code{*$} and @code{!$} sentinels in fixed form, for Fortran. The flag also
136 arranges for automatic linking of the OpenACC runtime library
137 (@ref{OpenACC Runtime Library Routines}).
139 A complete description of all OpenACC directives accepted may be found in
140 the @uref{http://www.openacc.org/, OpenACC Application Programming
141 Interface} manual, version 2.0.
144 @c ---------------------------------------------------------------------
145 @c OpenACC Runtime Library Routines
146 @c ---------------------------------------------------------------------
148 @node OpenACC Runtime Library Routines
149 @chapter OpenACC Runtime Library Routines
151 The runtime routines described here are defined by section 3 of the OpenACC
152 specifications in version 2.0.
153 They have C linkage, and do not throw exceptions.
154 Generally, they are available only for the host, with the exception of
155 @code{acc_on_device}, which is available for both the host and the acceleration device.
159 * acc_get_num_devices:: Get number of devices for the given device type
160 * acc_set_device_type::
161 * acc_get_device_type::
162 * acc_set_device_num::
163 * acc_get_device_num::
166 * acc_on_device:: Whether executing on a particular device
170 * acc_present_or_copyin::
172 * acc_present_or_create::
175 * acc_update_device::
182 * acc_memcpy_to_device::
183 * acc_memcpy_from_device::
185 API routines for target platforms.
187 * acc_get_current_cuda_device::
188 * acc_get_current_cuda_context::
189 * acc_get_cuda_stream::
190 * acc_set_cuda_stream::
195 @node acc_get_num_devices
196 @section @code{acc_get_num_devices} -- Get number of devices for given device type
198 @item @emph{Description}
199 This routine returns a value indicating the
200 number of devices available for the given device type. It determines
201 the number of devices in a @emph{passive} manner. In other words, it
202 does not alter the state within the runtime environment aside from
203 possibly initializing an uninitialized device. This aspect allows
204 the routine to be called without concern for altering the interaction
205 with an attached accelerator device.
207 @item @emph{Reference}:
208 @uref{http://www.openacc.org/, OpenACC specification v2.0}, section
214 @node acc_set_device_type
215 @section @code{acc_set_device_type}
217 @item @emph{Reference}:
218 @uref{http://www.openacc.org/, OpenACC specification v2.0}, section
224 @node acc_get_device_type
225 @section @code{acc_get_device_type}
227 @item @emph{Reference}:
228 @uref{http://www.openacc.org/, OpenACC specification v2.0}, section
234 @node acc_set_device_num
235 @section @code{acc_set_device_num}
237 @item @emph{Reference}:
238 @uref{http://www.openacc.org/, OpenACC specification v2.0}, section
244 @node acc_get_device_num
245 @section @code{acc_get_device_num}
247 @item @emph{Reference}:
248 @uref{http://www.openacc.org/, OpenACC specification v2.0}, section
255 @section @code{acc_init}
257 @item @emph{Reference}:
258 @uref{http://www.openacc.org/, OpenACC specification v2.0}, section
265 @section @code{acc_shutdown}
267 @item @emph{Reference}:
268 @uref{http://www.openacc.org/, OpenACC specification v2.0}, section
275 @section @code{acc_on_device} -- Whether executing on a particular device
277 @item @emph{Description}:
278 This routine tells the program whether it is executing on a particular
279 device. Based on the argument passed, GCC tries to evaluate this to a
280 constant at compile time, but library functions are also provided, for
281 both the host and the acceleration device.
283 @item @emph{Reference}:
284 @uref{http://www.openacc.org/, OpenACC specification v2.0}, section
291 @section @code{acc_malloc}
293 @item @emph{Reference}:
294 @uref{http://www.openacc.org/, OpenACC specification v2.0}, section
301 @section @code{acc_free}
303 @item @emph{Reference}:
304 @uref{http://www.openacc.org/, OpenACC specification v2.0}, section
311 @section @code{acc_copyin}
313 @item @emph{Reference}:
314 @uref{http://www.openacc.org/, OpenACC specification v2.0}, section
320 @node acc_present_or_copyin
321 @section @code{acc_present_or_copyin}
323 @item @emph{Reference}:
324 @uref{http://www.openacc.org/, OpenACC specification v2.0}, section
331 @section @code{acc_create}
333 @item @emph{Reference}:
334 @uref{http://www.openacc.org/, OpenACC specification v2.0}, section
340 @node acc_present_or_create
341 @section @code{acc_present_or_create}
343 @item @emph{Reference}:
344 @uref{http://www.openacc.org/, OpenACC specification v2.0}, section
351 @section @code{acc_copyout}
353 @item @emph{Reference}:
354 @uref{http://www.openacc.org/, OpenACC specification v2.0}, section
361 @section @code{acc_delete}
363 @item @emph{Reference}:
364 @uref{http://www.openacc.org/, OpenACC specification v2.0}, section
370 @node acc_update_device
371 @section @code{acc_update_device}
373 @item @emph{Reference}:
374 @uref{http://www.openacc.org/, OpenACC specification v2.0}, section
380 @node acc_update_self
381 @section @code{acc_update_self}
383 @item @emph{Reference}:
384 @uref{http://www.openacc.org/, OpenACC specification v2.0}, section
391 @section @code{acc_map_data}
393 @item @emph{Reference}:
394 @uref{http://www.openacc.org/, OpenACC specification v2.0}, section
401 @section @code{acc_unmap_data}
403 @item @emph{Reference}:
404 @uref{http://www.openacc.org/, OpenACC specification v2.0}, section
411 @section @code{acc_deviceptr}
413 @item @emph{Reference}:
414 @uref{http://www.openacc.org/, OpenACC specification v2.0}, section
421 @section @code{acc_hostptr}
423 @item @emph{Reference}:
424 @uref{http://www.openacc.org/, OpenACC specification v2.0}, section
431 @section @code{acc_is_present}
433 @item @emph{Reference}:
434 @uref{http://www.openacc.org/, OpenACC specification v2.0}, section
440 @node acc_memcpy_to_device
441 @section @code{acc_memcpy_to_device}
443 @item @emph{Reference}:
444 @uref{http://www.openacc.org/, OpenACC specification v2.0}, section
450 @node acc_memcpy_from_device
451 @section @code{acc_memcpy_from_device}
453 @item @emph{Reference}:
454 @uref{http://www.openacc.org/, OpenACC specification v2.0}, section
460 @node acc_get_current_cuda_device
461 @section @code{acc_get_current_cuda_device}
463 @item @emph{Reference}:
464 @uref{http://www.openacc.org/, OpenACC specification v2.0}, section
470 @node acc_get_current_cuda_context
471 @section @code{acc_get_current_cuda_context}
473 @item @emph{Reference}:
474 @uref{http://www.openacc.org/, OpenACC specification v2.0}, section
480 @node acc_get_cuda_stream
481 @section @code{acc_get_cuda_stream}
483 @item @emph{Reference}:
484 @uref{http://www.openacc.org/, OpenACC specification v2.0}, section
490 @node acc_set_cuda_stream
491 @section @code{acc_set_cuda_stream}
493 @item @emph{Reference}:
494 @uref{http://www.openacc.org/, OpenACC specification v2.0}, section
500 @c ---------------------------------------------------------------------
501 @c OpenACC Environment Variables
502 @c ---------------------------------------------------------------------
504 @node OpenACC Environment Variables
505 @chapter OpenACC Environment Variables
507 The variables @env{ACC_DEVICE_TYPE} and @env{ACC_DEVICE_NUM}
508 are defined by section 4 of the OpenACC specification in version 2.0.
509 The variable @env{GCC_ACC_NOTIFY} is used for diagnostic purposes.
519 @node ACC_DEVICE_TYPE
520 @section @code{ACC_DEVICE_TYPE}
522 @item @emph{Reference}:
523 @uref{http://www.openacc.org/, OpenACC specification v2.0}, section
530 @section @code{ACC_DEVICE_NUM}
532 @item @emph{Reference}:
533 @uref{http://www.openacc.org/, OpenACC specification v2.0}, section
540 @section @code{GCC_ACC_NOTIFY}
542 @item @emph{Description}:
543 Print debug information pertaining to the accelerator.
547 @c ---------------------------------------------------------------------
548 @c OpenACC Library Interoperability
549 @c ---------------------------------------------------------------------
551 @node OpenACC Library Interoperability
552 @chapter OpenACC Library Interoperability
554 @section Introduction
556 As the OpenACC library is built using the CUDA Driver API, the question
557 arises as to what impact using the OpenACC library has on a program that
558 uses the Runtime library, or a library based on the Runtime library, e.g.,
559 CUBLAS@footnote{See section 2.26, ``Interactions with the CUDA Driver API'' in
560 ``CUDA Runtime API'', Version 5.5, July 2013 and section 2.27, ``VDPAU
561 Interoperability'', in ``CUDA Driver API'', TRM-06703-001, Version 5.5,
562 July 2013, for additional information on library interoperability.}.
563 This chapter describes the use cases and the changes
564 required in order to use both the OpenACC library and the CUBLAS and Runtime
565 libraries within a program.
567 @section First invocation: NVIDIA CUBLAS library API
569 In this first use case (see below), a function in the CUBLAS library is called
570 prior to any of the functions in the OpenACC library. More specifically, the
571 function @code{cublasCreate()}.
573 When invoked, the function will initialize the library and allocate the
574 hardware resources on the host and the device on behalf of the caller. Once
575 the initialization and allocation have completed, a handle is returned to the
576 caller. The OpenACC library also requires initialization and allocation of
577 hardware resources. Since the CUBLAS library has already allocated the
578 hardware resources for the device, all that is left to do is to initialize
579 the OpenACC library and acquire the hardware resources on the host.
581 Prior to calling the OpenACC function that will initialize the library and
582 allocate the host hardware resources, one needs to acquire the device number
583 that was allocated during the call to @code{cublasCreate()}. Invoking the
584 runtime library function @code{cudaGetDevice()} accomplishes this. Once
585 acquired, the device number is passed along with the device type as
586 parameters to the OpenACC library function @code{acc_set_device_num()}.
588 Once the call to @code{acc_set_device_num()} has completed, the OpenACC
589 library will be using the context that was created during the call to
590 @code{cublasCreate()}. In other words, both libraries will be sharing the same context.
594 /* Create the handle */
595 s = cublasCreate(&h);
596 if (s != CUBLAS_STATUS_SUCCESS)
598 fprintf(stderr, "cublasCreate failed %d\n", s);
602 /* Get the device number */
603 e = cudaGetDevice(&dev);
604 if (e != cudaSuccess)
606 fprintf(stderr, "cudaGetDevice failed %d\n", e);
610 /* Initialize OpenACC library and use device 'dev' */
611 acc_set_device_num(dev, acc_device_nvidia);
616 @section First invocation: OpenACC library API
618 In this second use case (see below), a function in the OpenACC library is
619 called prior to any of the functions in the CUBLAS library. More specifically,
620 the function @code{acc_set_device_num()}.
622 In the use case presented here, the function @code{acc_set_device_num()}
623 is used to both initialize the OpenACC library and allocate the hardware
624 resources on the host and the device. In the call to the function, the
625 call parameters specify which device to use, i.e., 'dev', and what device
626 type to use, i.e., @code{acc_device_nvidia}. It should be noted that this
627 is but one method to initialize the OpenACC library and allocate the
628 appropriate hardware resources. Other methods are available through the
629 use of environment variables and these will be discussed in the next section.
631 Once the call to @code{acc_set_device_num()} has completed, other OpenACC
632 functions can be called as seen with multiple calls being made to
633 @code{acc_copyin()}. In addition, calls can be made to functions in the
634 CUBLAS library. In the use case a call to @code{cublasCreate()} is made
635 subsequent to the calls to @code{acc_copyin()}.
636 As seen in the previous use case, a call to @code{cublasCreate()} will
637 initialize the CUBLAS library and allocate the hardware resources on the
638 host and the device. However, since the device has already been allocated,
639 @code{cublasCreate()} will only initialize the CUBLAS library and allocate
640 the appropriate hardware resources on the host. The context that was created
641 as part of the OpenACC initialization will be shared with the CUBLAS library,
642 similarly to the first use case.
647 acc_set_device_num(dev, acc_device_nvidia);
649 /* Copy the first set to the device */
650 d_X = acc_copyin(&h_X[0], N * sizeof (float));
653 fprintf(stderr, "copyin error h_X\n");
657 /* Copy the second set to the device */
658 d_Y = acc_copyin(&h_Y1[0], N * sizeof (float));
661 fprintf(stderr, "copyin error h_Y1\n");
665 /* Create the handle */
666 s = cublasCreate(&h);
667 if (s != CUBLAS_STATUS_SUCCESS)
669 fprintf(stderr, "cublasCreate failed %d\n", s);
673 /* Perform saxpy using CUBLAS library function */
674 s = cublasSaxpy(h, N, &alpha, d_X, 1, d_Y, 1);
675 if (s != CUBLAS_STATUS_SUCCESS)
677 fprintf(stderr, "cublasSaxpy failed %d\n", s);
681 /* Copy the results from the device */
682 acc_memcpy_from_device(&h_Y1[0], d_Y, N * sizeof (float));
688 @section OpenACC library and environment variables
690 There are two environment variables associated with the OpenACC library that
691 may be used to control the device type and device number.
692 Namely, @env{ACC_DEVICE_TYPE} and @env{ACC_DEVICE_NUM}. In the second
693 use case, the device type and device number were specified using
694 @code{acc_set_device_num()}. However, @env{ACC_DEVICE_TYPE} and
695 @env{ACC_DEVICE_NUM} could have been defined and the call to
696 @code{acc_set_device_num()} would not be required. At the time of the
697 call to @code{acc_copyin()}, these two environment variables would be
698 sampled and their values used.
700 The use of the environment variables is only relevant when an OpenACC function
701 is called prior to a call to @code{cublasCreate()}. If @code{cublasCreate()}
702 is called prior to a call to an OpenACC function, then a call to
703 @code{acc_set_device_num()} must be done@footnote{More complete information
704 about @env{ACC_DEVICE_TYPE} and @env{ACC_DEVICE_NUM} can be found in
705 sections 4.1 and 4.2 of ``The OpenACC
706 Application Programming Interface'', Version 2.0, June, 2013.}.
710 @c ---------------------------------------------------------------------
712 @c ---------------------------------------------------------------------
714 @node Enabling OpenMP
715 @chapter Enabling OpenMP
717 To activate the OpenMP extensions for C/C++ and Fortran, the compile-time
718 flag @command{-fopenmp} must be specified. This enables the OpenMP directive
719 @code{#pragma omp} in C/C++ and @code{!$omp} directives in free form,
720 @code{c$omp}, @code{*$omp} and @code{!$omp} directives in fixed form,
721 @code{!$} conditional compilation sentinels in free form and @code{c$},
722 @code{*$} and @code{!$} sentinels in fixed form, for Fortran. The flag also
723 arranges for automatic linking of the OpenMP runtime library
724 (@ref{Runtime Library Routines}).
726 A complete description of all OpenMP directives accepted may be found in
727 the @uref{http://www.openmp.org, OpenMP Application Program Interface} manual,
731 @c ---------------------------------------------------------------------
732 @c OpenMP Runtime Library Routines
733 @c ---------------------------------------------------------------------
735 @node Runtime Library Routines
736 @chapter OpenMP Runtime Library Routines
738 The runtime routines described here are defined by Section 3 of the OpenMP
739 specification in version 4.0. The routines are structured in the following three groups:
743 Control threads, processors and the parallel environment. They have C
744 linkage, and do not throw exceptions.
746 * omp_get_active_level:: Number of active parallel regions
747 * omp_get_ancestor_thread_num:: Ancestor thread ID
748 * omp_get_cancellation:: Whether cancellation support is enabled
749 * omp_get_default_device:: Get the default device for target regions
750 * omp_get_dynamic:: Dynamic teams setting
751 * omp_get_level:: Number of parallel regions
752 * omp_get_max_active_levels:: Maximum number of active regions
753 * omp_get_max_threads:: Maximum number of threads of parallel region
754 * omp_get_nested:: Nested parallel regions
755 * omp_get_num_devices:: Number of target devices
756 * omp_get_num_procs:: Number of processors online
757 * omp_get_num_teams:: Number of teams
758 * omp_get_num_threads:: Size of the active team
759 * omp_get_proc_bind:: Whether threads may be moved between CPUs
760 * omp_get_schedule:: Obtain the runtime scheduling method
761 * omp_get_team_num:: Get team number
762 * omp_get_team_size:: Number of threads in a team
763 * omp_get_thread_limit:: Maximum number of threads
764 * omp_get_thread_num:: Current thread ID
765 * omp_in_parallel:: Whether a parallel region is active
766 * omp_in_final:: Whether in final or included task region
767 * omp_is_initial_device:: Whether executing on the host device
768 * omp_set_default_device:: Set the default device for target regions
769 * omp_set_dynamic:: Enable/disable dynamic teams
770 * omp_set_max_active_levels:: Limits the number of active parallel regions
771 * omp_set_nested:: Enable/disable nested parallel regions
772 * omp_set_num_threads:: Set upper team size limit
773 * omp_set_schedule:: Set the runtime scheduling method
775 Initialize, set, test, unset and destroy simple and nested locks.
777 * omp_init_lock:: Initialize simple lock
778 * omp_set_lock:: Wait for and set simple lock
779 * omp_test_lock:: Test and set simple lock if available
780 * omp_unset_lock:: Unset simple lock
781 * omp_destroy_lock:: Destroy simple lock
782 * omp_init_nest_lock:: Initialize nested lock
783 * omp_set_nest_lock:: Wait for and set nested lock
784 * omp_test_nest_lock:: Test and set nested lock if available
785 * omp_unset_nest_lock:: Unset nested lock
786 * omp_destroy_nest_lock:: Destroy nested lock
788 Portable, thread-based, wall clock timer.
790 * omp_get_wtick:: Get timer precision.
791 * omp_get_wtime:: Elapsed wall clock time.
796 @node omp_get_active_level
797 @section @code{omp_get_active_level} -- Number of active parallel regions
799 @item @emph{Description}:
800 This function returns the nesting level of the active parallel blocks
801 that enclose the calling routine.
804 @multitable @columnfractions .20 .80
805 @item @emph{Prototype}: @tab @code{int omp_get_active_level(void);}
808 @item @emph{Fortran}:
809 @multitable @columnfractions .20 .80
810 @item @emph{Interface}: @tab @code{integer function omp_get_active_level()}
813 @item @emph{See also}:
814 @ref{omp_get_level}, @ref{omp_get_max_active_levels}, @ref{omp_set_max_active_levels}
816 @item @emph{Reference}:
817 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.2.20.
822 @node omp_get_ancestor_thread_num
823 @section @code{omp_get_ancestor_thread_num} -- Ancestor thread ID
825 @item @emph{Description}:
826 This function returns the thread identification number for the given
827 nesting level of the current thread. For values of @var{level} outside
828 the range 0 to @code{omp_get_level}, -1 is returned; if @var{level} is
829 @code{omp_get_level}, the result is identical to @code{omp_get_thread_num}.
832 @multitable @columnfractions .20 .80
833 @item @emph{Prototype}: @tab @code{int omp_get_ancestor_thread_num(int level);}
836 @item @emph{Fortran}:
837 @multitable @columnfractions .20 .80
838 @item @emph{Interface}: @tab @code{integer function omp_get_ancestor_thread_num(level)}
839 @item @tab @code{integer level}
842 @item @emph{See also}:
843 @ref{omp_get_level}, @ref{omp_get_thread_num}, @ref{omp_get_team_size}
845 @item @emph{Reference}:
846 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.2.18.
851 @node omp_get_cancellation
852 @section @code{omp_get_cancellation} -- Whether cancellation support is enabled
854 @item @emph{Description}:
855 This function returns @code{true} if cancellation is activated, @code{false}
856 otherwise. Here, @code{true} and @code{false} represent their language-specific
857 counterparts. Unless @env{OMP_CANCELLATION} is set true, cancellations are deactivated.
861 @multitable @columnfractions .20 .80
862 @item @emph{Prototype}: @tab @code{int omp_get_cancellation(void);}
865 @item @emph{Fortran}:
866 @multitable @columnfractions .20 .80
867 @item @emph{Interface}: @tab @code{logical function omp_get_cancellation()}
870 @item @emph{See also}:
871 @ref{OMP_CANCELLATION}
873 @item @emph{Reference}:
874 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.2.9.
879 @node omp_get_default_device
880 @section @code{omp_get_default_device} -- Get the default device for target regions
882 @item @emph{Description}:
883 Get the default device for target regions without device clause.
886 @multitable @columnfractions .20 .80
887 @item @emph{Prototype}: @tab @code{int omp_get_default_device(void);}
890 @item @emph{Fortran}:
891 @multitable @columnfractions .20 .80
892 @item @emph{Interface}: @tab @code{integer function omp_get_default_device()}
895 @item @emph{See also}:
896 @ref{OMP_DEFAULT_DEVICE}, @ref{omp_set_default_device}
898 @item @emph{Reference}:
899 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.2.24.
904 @node omp_get_dynamic
905 @section @code{omp_get_dynamic} -- Dynamic teams setting
907 @item @emph{Description}:
908 This function returns @code{true} if enabled, @code{false} otherwise.
909 Here, @code{true} and @code{false} represent their language-specific counterparts.
912 The dynamic team setting may be initialized at startup by the
913 @env{OMP_DYNAMIC} environment variable or at runtime using
914 @code{omp_set_dynamic}. If undefined, dynamic adjustment is disabled by default.
918 @multitable @columnfractions .20 .80
919 @item @emph{Prototype}: @tab @code{int omp_get_dynamic(void);}
922 @item @emph{Fortran}:
923 @multitable @columnfractions .20 .80
924 @item @emph{Interface}: @tab @code{logical function omp_get_dynamic()}
927 @item @emph{See also}:
928 @ref{omp_set_dynamic}, @ref{OMP_DYNAMIC}
930 @item @emph{Reference}:
931 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.2.8.
937 @section @code{omp_get_level} -- Obtain the current nesting level
939 @item @emph{Description}:
940 This function returns the nesting level of the parallel blocks
941 that enclose the calling routine.
944 @multitable @columnfractions .20 .80
945 @item @emph{Prototype}: @tab @code{int omp_get_level(void);}
948 @item @emph{Fortran}:
949 @multitable @columnfractions .20 .80
950 @item @emph{Interface}: @tab @code{integer function omp_get_level()}
953 @item @emph{See also}:
954 @ref{omp_get_active_level}
956 @item @emph{Reference}:
957 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.2.17.
962 @node omp_get_max_active_levels
963 @section @code{omp_get_max_active_levels} -- Maximum number of active regions
965 @item @emph{Description}:
966 This function obtains the maximum allowed number of nested, active parallel regions.
969 @multitable @columnfractions .20 .80
970 @item @emph{Prototype}: @tab @code{int omp_get_max_active_levels(void);}
973 @item @emph{Fortran}:
974 @multitable @columnfractions .20 .80
975 @item @emph{Interface}: @tab @code{integer function omp_get_max_active_levels()}
978 @item @emph{See also}:
979 @ref{omp_set_max_active_levels}, @ref{omp_get_active_level}
981 @item @emph{Reference}:
982 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.2.16.
987 @node omp_get_max_threads
988 @section @code{omp_get_max_threads} -- Maximum number of threads of parallel region
990 @item @emph{Description}:
991 Return the maximum number of threads that will be used for a parallel region
992 that does not use the clause @code{num_threads}.
995 @multitable @columnfractions .20 .80
996 @item @emph{Prototype}: @tab @code{int omp_get_max_threads(void);}
999 @item @emph{Fortran}:
1000 @multitable @columnfractions .20 .80
1001 @item @emph{Interface}: @tab @code{integer function omp_get_max_threads()}
1004 @item @emph{See also}:
1005 @ref{omp_set_num_threads}, @ref{omp_set_dynamic}, @ref{omp_get_thread_limit}
1007 @item @emph{Reference}:
1008 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.2.3.
1013 @node omp_get_nested
1014 @section @code{omp_get_nested} -- Nested parallel regions
1016 @item @emph{Description}:
1017 This function returns @code{true} if nested parallel regions are
1018 enabled, @code{false} otherwise. Here, @code{true} and @code{false}
1019 represent their language-specific counterparts.
1021 Nested parallel regions may be initialized at startup by the
1022 @env{OMP_NESTED} environment variable or at runtime using
1023 @code{omp_set_nested}. If undefined, nested parallel regions are
1024 disabled by default.
1027 @multitable @columnfractions .20 .80
1028 @item @emph{Prototype}: @tab @code{int omp_get_nested(void);}
1031 @item @emph{Fortran}:
1032 @multitable @columnfractions .20 .80
1033 @item @emph{Interface}: @tab @code{logical function omp_get_nested()}
1036 @item @emph{See also}:
1037 @ref{omp_set_nested}, @ref{OMP_NESTED}
1039 @item @emph{Reference}:
1040 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.2.11.
1045 @node omp_get_num_devices
1046 @section @code{omp_get_num_devices} -- Number of target devices
1048 @item @emph{Description}:
1049 Returns the number of target devices.
1052 @multitable @columnfractions .20 .80
1053 @item @emph{Prototype}: @tab @code{int omp_get_num_devices(void);}
1056 @item @emph{Fortran}:
1057 @multitable @columnfractions .20 .80
1058 @item @emph{Interface}: @tab @code{integer function omp_get_num_devices()}
1061 @item @emph{Reference}:
1062 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.2.25.
1067 @node omp_get_num_procs
1068 @section @code{omp_get_num_procs} -- Number of processors online
1070 @item @emph{Description}:
1071 Returns the number of processors online on the current device.
1074 @multitable @columnfractions .20 .80
1075 @item @emph{Prototype}: @tab @code{int omp_get_num_procs(void);}
1078 @item @emph{Fortran}:
1079 @multitable @columnfractions .20 .80
1080 @item @emph{Interface}: @tab @code{integer function omp_get_num_procs()}
1083 @item @emph{Reference}:
1084 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.2.5.
1089 @node omp_get_num_teams
1090 @section @code{omp_get_num_teams} -- Number of teams
1092 @item @emph{Description}:
1093 Returns the number of teams in the current teams region.
1096 @multitable @columnfractions .20 .80
1097 @item @emph{Prototype}: @tab @code{int omp_get_num_teams(void);}
1100 @item @emph{Fortran}:
1101 @multitable @columnfractions .20 .80
1102 @item @emph{Interface}: @tab @code{integer function omp_get_num_teams()}
1105 @item @emph{Reference}:
1106 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.2.26.
1111 @node omp_get_num_threads
1112 @section @code{omp_get_num_threads} -- Size of the active team
1114 @item @emph{Description}:
1115 Returns the number of threads in the current team. In a sequential section of
1116 the program @code{omp_get_num_threads} returns 1.
1118 The default team size may be initialized at startup by the
1119 @env{OMP_NUM_THREADS} environment variable. At runtime, the size
1120 of the current team may be set either by the @code{num_threads}
1121 clause or by @code{omp_set_num_threads}. If none of the above were
1122 used to define a specific value and @env{OMP_DYNAMIC} is disabled,
1123 one thread per CPU online is used.
1126 @multitable @columnfractions .20 .80
1127 @item @emph{Prototype}: @tab @code{int omp_get_num_threads(void);}
1130 @item @emph{Fortran}:
1131 @multitable @columnfractions .20 .80
1132 @item @emph{Interface}: @tab @code{integer function omp_get_num_threads()}
1135 @item @emph{See also}:
1136 @ref{omp_get_max_threads}, @ref{omp_set_num_threads}, @ref{OMP_NUM_THREADS}
1138 @item @emph{Reference}:
1139 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.2.2.
1144 @node omp_get_proc_bind
1145 @section @code{omp_get_proc_bind} -- Whether threads may be moved between CPUs
1147 @item @emph{Description}:
1148 This function returns the currently active thread affinity policy, which is
1149 set via @env{OMP_PROC_BIND}. Possible values are @code{omp_proc_bind_false},
1150 @code{omp_proc_bind_true}, @code{omp_proc_bind_master},
1151 @code{omp_proc_bind_close} and @code{omp_proc_bind_spread}.
1154 @multitable @columnfractions .20 .80
1155 @item @emph{Prototype}: @tab @code{omp_proc_bind_t omp_get_proc_bind(void);}
1158 @item @emph{Fortran}:
1159 @multitable @columnfractions .20 .80
1160 @item @emph{Interface}: @tab @code{integer(kind=omp_proc_bind_kind) function omp_get_proc_bind()}
1163 @item @emph{See also}:
1164 @ref{OMP_PROC_BIND}, @ref{OMP_PLACES}, @ref{GOMP_CPU_AFFINITY},
1166 @item @emph{Reference}:
1167 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.2.22.
1172 @node omp_get_schedule
1173 @section @code{omp_get_schedule} -- Obtain the runtime scheduling method
1175 @item @emph{Description}:
1176 Obtain the runtime scheduling method. The @var{kind} argument will be
1177 set to the value @code{omp_sched_static}, @code{omp_sched_dynamic},
1178 @code{omp_sched_guided} or @code{omp_sched_auto}. The second argument,
1179 @var{modifier}, is set to the chunk size.
1182 @multitable @columnfractions .20 .80
1183 @item @emph{Prototype}: @tab @code{void omp_get_schedule(omp_sched_t *kind, int *modifier);}
1186 @item @emph{Fortran}:
1187 @multitable @columnfractions .20 .80
1188 @item @emph{Interface}: @tab @code{subroutine omp_get_schedule(kind, modifier)}
1189 @item @tab @code{integer(kind=omp_sched_kind) kind}
1190 @item @tab @code{integer modifier}
1193 @item @emph{See also}:
1194 @ref{omp_set_schedule}, @ref{OMP_SCHEDULE}
1196 @item @emph{Reference}:
1197 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.2.13.
1202 @node omp_get_team_num
1203 @section @code{omp_get_team_num} -- Get team number
1205 @item @emph{Description}:
1206 Returns the team number of the calling thread.
1209 @multitable @columnfractions .20 .80
1210 @item @emph{Prototype}: @tab @code{int omp_get_team_num(void);}
1213 @item @emph{Fortran}:
1214 @multitable @columnfractions .20 .80
1215 @item @emph{Interface}: @tab @code{integer function omp_get_team_num()}
1218 @item @emph{Reference}:
1219 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.2.27.
1224 @node omp_get_team_size
1225 @section @code{omp_get_team_size} -- Number of threads in a team
1227 @item @emph{Description}:
1228 This function returns the number of threads in a thread team to which
1229 either the current thread or its ancestor belongs. For values of @var{level}
1230 outside zero to @code{omp_get_level}, -1 is returned; if @var{level} is zero,
1231 1 is returned, and for @code{omp_get_level}, the result is identical
1232 to @code{omp_get_num_threads}.
1235 @multitable @columnfractions .20 .80
1236 @item @emph{Prototype}: @tab @code{int omp_get_team_size(int level);}
1239 @item @emph{Fortran}:
1240 @multitable @columnfractions .20 .80
1241 @item @emph{Interface}: @tab @code{integer function omp_get_team_size(level)}
1242 @item @tab @code{integer level}
1245 @item @emph{See also}:
1246 @ref{omp_get_num_threads}, @ref{omp_get_level}, @ref{omp_get_ancestor_thread_num}
1248 @item @emph{Reference}:
1249 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.2.19.
1254 @node omp_get_thread_limit
1255 @section @code{omp_get_thread_limit} -- Maximum number of threads
1257 @item @emph{Description}:
1258 Return the maximum number of threads of the program.
1261 @multitable @columnfractions .20 .80
1262 @item @emph{Prototype}: @tab @code{int omp_get_thread_limit(void);}
1265 @item @emph{Fortran}:
1266 @multitable @columnfractions .20 .80
1267 @item @emph{Interface}: @tab @code{integer function omp_get_thread_limit()}
1270 @item @emph{See also}:
1271 @ref{omp_get_max_threads}, @ref{OMP_THREAD_LIMIT}
1273 @item @emph{Reference}:
1274 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.2.14.
1279 @node omp_get_thread_num
1280 @section @code{omp_get_thread_num} -- Current thread ID
1282 @item @emph{Description}:
1283 Returns a unique thread identification number within the current team.
1284 In sequential parts of the program, @code{omp_get_thread_num}
1285 always returns 0. In parallel regions the return value varies
1286 from 0 to @code{omp_get_num_threads}-1 inclusive. The return
1287 value of the master thread of a team is always 0.
1290 @multitable @columnfractions .20 .80
1291 @item @emph{Prototype}: @tab @code{int omp_get_thread_num(void);}
1294 @item @emph{Fortran}:
1295 @multitable @columnfractions .20 .80
1296 @item @emph{Interface}: @tab @code{integer function omp_get_thread_num()}
1299 @item @emph{See also}:
1300 @ref{omp_get_num_threads}, @ref{omp_get_ancestor_thread_num}
1302 @item @emph{Reference}:
1303 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.2.4.
1308 @node omp_in_parallel
1309 @section @code{omp_in_parallel} -- Whether a parallel region is active
1311 @item @emph{Description}:
1312 This function returns @code{true} if currently running in parallel,
1313 @code{false} otherwise. Here, @code{true} and @code{false} represent
1314 their language-specific counterparts.
1317 @multitable @columnfractions .20 .80
1318 @item @emph{Prototype}: @tab @code{int omp_in_parallel(void);}
1321 @item @emph{Fortran}:
1322 @multitable @columnfractions .20 .80
1323 @item @emph{Interface}: @tab @code{logical function omp_in_parallel()}
1326 @item @emph{Reference}:
1327 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.2.6.
1332 @section @code{omp_in_final} -- Whether in final or included task region
1334 @item @emph{Description}:
1335 This function returns @code{true} if currently running in a final
1336 or included task region, @code{false} otherwise. Here, @code{true}
1337 and @code{false} represent their language-specific counterparts.
1340 @multitable @columnfractions .20 .80
1341 @item @emph{Prototype}: @tab @code{int omp_in_final(void);}
1344 @item @emph{Fortran}:
1345 @multitable @columnfractions .20 .80
1346 @item @emph{Interface}: @tab @code{logical function omp_in_final()}
1349 @item @emph{Reference}:
1350 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.2.21.
1355 @node omp_is_initial_device
1356 @section @code{omp_is_initial_device} -- Whether executing on the host device
1358 @item @emph{Description}:
1359 This function returns @code{true} if currently running on the host device,
1360 @code{false} otherwise. Here, @code{true} and @code{false} represent
1361 their language-specific counterparts.
1364 @multitable @columnfractions .20 .80
1365 @item @emph{Prototype}: @tab @code{int omp_is_initial_device(void);}
1368 @item @emph{Fortran}:
1369 @multitable @columnfractions .20 .80
1370 @item @emph{Interface}: @tab @code{logical function omp_is_initial_device()}
1373 @item @emph{Reference}:
1374 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.2.28.
1379 @node omp_set_default_device
1380 @section @code{omp_set_default_device} -- Set the default device for target regions
1382 @item @emph{Description}:
1383 Set the default device for target regions without device clause. The argument
1384 shall be a nonnegative device number.
1387 @multitable @columnfractions .20 .80
1388 @item @emph{Prototype}: @tab @code{void omp_set_default_device(int device_num);}
1391 @item @emph{Fortran}:
1392 @multitable @columnfractions .20 .80
1393 @item @emph{Interface}: @tab @code{subroutine omp_set_default_device(device_num)}
1394 @item @tab @code{integer device_num}
1397 @item @emph{See also}:
1398 @ref{OMP_DEFAULT_DEVICE}, @ref{omp_get_default_device}
1400 @item @emph{Reference}:
1401 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.2.23.
1406 @node omp_set_dynamic
1407 @section @code{omp_set_dynamic} -- Enable/disable dynamic teams
1409 @item @emph{Description}:
1410 Enable or disable the dynamic adjustment of the number of threads
1411 within a team. The function takes the language-specific equivalent
1412 of @code{true} and @code{false}, where @code{true} enables dynamic
1413 adjustment of team sizes and @code{false} disables it.
1416 @multitable @columnfractions .20 .80
1417 @item @emph{Prototype}: @tab @code{void omp_set_dynamic(int dynamic_threads);}
1420 @item @emph{Fortran}:
1421 @multitable @columnfractions .20 .80
1422 @item @emph{Interface}: @tab @code{subroutine omp_set_dynamic(dynamic_threads)}
1423 @item @tab @code{logical, intent(in) :: dynamic_threads}
1426 @item @emph{See also}:
1427 @ref{OMP_DYNAMIC}, @ref{omp_get_dynamic}
1429 @item @emph{Reference}:
1430 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.2.7.
1435 @node omp_set_max_active_levels
1436 @section @code{omp_set_max_active_levels} -- Limits the number of active parallel regions
1438 @item @emph{Description}:
1439 This function limits the maximum allowed number of nested, active
1443 @multitable @columnfractions .20 .80
1444 @item @emph{Prototype}: @tab @code{void omp_set_max_active_levels(int max_levels);}
1447 @item @emph{Fortran}:
1448 @multitable @columnfractions .20 .80
1449 @item @emph{Interface}: @tab @code{subroutine omp_set_max_active_levels(max_levels)}
1450 @item @tab @code{integer max_levels}
1453 @item @emph{See also}:
1454 @ref{omp_get_max_active_levels}, @ref{omp_get_active_level}
1456 @item @emph{Reference}:
1457 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.2.15.
1462 @node omp_set_nested
1463 @section @code{omp_set_nested} -- Enable/disable nested parallel regions
1465 @item @emph{Description}:
1466 Enable or disable nested parallel regions, i.e., whether team members
1467 are allowed to create new teams. The function takes the language-specific
1468 equivalent of @code{true} and @code{false}, where @code{true} enables
1469 nested parallel regions and @code{false} disables them.
1472 @multitable @columnfractions .20 .80
1473 @item @emph{Prototype}: @tab @code{void omp_set_nested(int nested);}
1476 @item @emph{Fortran}:
1477 @multitable @columnfractions .20 .80
1478 @item @emph{Interface}: @tab @code{subroutine omp_set_nested(nested)}
1479 @item @tab @code{logical, intent(in) :: nested}
1482 @item @emph{See also}:
1483 @ref{OMP_NESTED}, @ref{omp_get_nested}
1485 @item @emph{Reference}:
1486 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.2.10.
1491 @node omp_set_num_threads
1492 @section @code{omp_set_num_threads} -- Set upper team size limit
1494 @item @emph{Description}:
1495 Specifies the number of threads used by default in subsequent parallel
1496 sections, if those do not specify a @code{num_threads} clause. The
1497 argument of @code{omp_set_num_threads} shall be a positive integer.
1500 @multitable @columnfractions .20 .80
1501 @item @emph{Prototype}: @tab @code{void omp_set_num_threads(int num_threads);}
1504 @item @emph{Fortran}:
1505 @multitable @columnfractions .20 .80
1506 @item @emph{Interface}: @tab @code{subroutine omp_set_num_threads(num_threads)}
1507 @item @tab @code{integer, intent(in) :: num_threads}
1510 @item @emph{See also}:
1511 @ref{OMP_NUM_THREADS}, @ref{omp_get_num_threads}, @ref{omp_get_max_threads}
1513 @item @emph{Reference}:
1514 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.2.1.
1519 @node omp_set_schedule
1520 @section @code{omp_set_schedule} -- Set the runtime scheduling method
1522 @item @emph{Description}:
1523 Sets the runtime scheduling method. The @var{kind} argument can have the
1524 value @code{omp_sched_static}, @code{omp_sched_dynamic},
1525 @code{omp_sched_guided} or @code{omp_sched_auto}. Except for
1526 @code{omp_sched_auto}, the chunk size is set to the value of
1527 @var{modifier} if positive, or to the default value if zero or negative.
1528 For @code{omp_sched_auto} the @var{modifier} argument is ignored.
1531 @multitable @columnfractions .20 .80
1532 @item @emph{Prototype}: @tab @code{void omp_set_schedule(omp_sched_t kind, int modifier);}
1535 @item @emph{Fortran}:
1536 @multitable @columnfractions .20 .80
1537 @item @emph{Interface}: @tab @code{subroutine omp_set_schedule(kind, modifier)}
1538 @item @tab @code{integer(kind=omp_sched_kind) kind}
1539 @item @tab @code{integer modifier}
1542 @item @emph{See also}:
1543 @ref{omp_get_schedule}
1546 @item @emph{Reference}:
1547 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.2.12.
1553 @section @code{omp_init_lock} -- Initialize simple lock
1555 @item @emph{Description}:
1556 Initialize a simple lock. After initialization, the lock is in
1560 @multitable @columnfractions .20 .80
1561 @item @emph{Prototype}: @tab @code{void omp_init_lock(omp_lock_t *lock);}
1564 @item @emph{Fortran}:
1565 @multitable @columnfractions .20 .80
1566 @item @emph{Interface}: @tab @code{subroutine omp_init_lock(svar)}
1567 @item @tab @code{integer(omp_lock_kind), intent(out) :: svar}
1570 @item @emph{See also}:
1571 @ref{omp_destroy_lock}
1573 @item @emph{Reference}:
1574 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.3.1.
1580 @section @code{omp_set_lock} -- Wait for and set simple lock
1582 @item @emph{Description}:
1583 Before setting a simple lock, the lock variable must be initialized by
1584 @code{omp_init_lock}. The calling thread is blocked until the lock
1585 is available. If the lock is already held by the current thread,
1589 @multitable @columnfractions .20 .80
1590 @item @emph{Prototype}: @tab @code{void omp_set_lock(omp_lock_t *lock);}
1593 @item @emph{Fortran}:
1594 @multitable @columnfractions .20 .80
1595 @item @emph{Interface}: @tab @code{subroutine omp_set_lock(svar)}
1596 @item @tab @code{integer(omp_lock_kind), intent(inout) :: svar}
1599 @item @emph{See also}:
1600 @ref{omp_init_lock}, @ref{omp_test_lock}, @ref{omp_unset_lock}
1602 @item @emph{Reference}:
1603 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.3.3.
1609 @section @code{omp_test_lock} -- Test and set simple lock if available
1611 @item @emph{Description}:
1612 Before setting a simple lock, the lock variable must be initialized by
1613 @code{omp_init_lock}. Contrary to @code{omp_set_lock}, @code{omp_test_lock}
1614 does not block if the lock is not available. This function returns
1615 @code{true} upon success, @code{false} otherwise. Here, @code{true} and
1616 @code{false} represent their language-specific counterparts.
1619 @multitable @columnfractions .20 .80
1620 @item @emph{Prototype}: @tab @code{int omp_test_lock(omp_lock_t *lock);}
1623 @item @emph{Fortran}:
1624 @multitable @columnfractions .20 .80
1625 @item @emph{Interface}: @tab @code{logical function omp_test_lock(svar)}
1626 @item @tab @code{integer(omp_lock_kind), intent(inout) :: svar}
1629 @item @emph{See also}:
1630 @ref{omp_init_lock}, @ref{omp_set_lock}, @ref{omp_unset_lock}
1632 @item @emph{Reference}:
1633 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.3.5.
1638 @node omp_unset_lock
1639 @section @code{omp_unset_lock} -- Unset simple lock
1641 @item @emph{Description}:
1642 A simple lock about to be unset must have been locked by @code{omp_set_lock}
1643 or @code{omp_test_lock} before. In addition, the lock must be held by the
1644 thread calling @code{omp_unset_lock}. Then, the lock becomes unlocked. If one
1645 or more threads attempted to set the lock before, one of them is chosen to
1646 acquire the lock.
1649 @multitable @columnfractions .20 .80
1650 @item @emph{Prototype}: @tab @code{void omp_unset_lock(omp_lock_t *lock);}
1653 @item @emph{Fortran}:
1654 @multitable @columnfractions .20 .80
1655 @item @emph{Interface}: @tab @code{subroutine omp_unset_lock(svar)}
1656 @item @tab @code{integer(omp_lock_kind), intent(inout) :: svar}
1659 @item @emph{See also}:
1660 @ref{omp_set_lock}, @ref{omp_test_lock}
1662 @item @emph{Reference}:
1663 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.3.4.
1668 @node omp_destroy_lock
1669 @section @code{omp_destroy_lock} -- Destroy simple lock
1671 @item @emph{Description}:
1672 Destroy a simple lock. In order to be destroyed, a simple lock must be
1673 in the unlocked state.
1676 @multitable @columnfractions .20 .80
1677 @item @emph{Prototype}: @tab @code{void omp_destroy_lock(omp_lock_t *lock);}
1680 @item @emph{Fortran}:
1681 @multitable @columnfractions .20 .80
1682 @item @emph{Interface}: @tab @code{subroutine omp_destroy_lock(svar)}
1683 @item @tab @code{integer(omp_lock_kind), intent(inout) :: svar}
1686 @item @emph{See also}:
1689 @item @emph{Reference}:
1690 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.3.2.
1695 @node omp_init_nest_lock
1696 @section @code{omp_init_nest_lock} -- Initialize nested lock
1698 @item @emph{Description}:
1699 Initialize a nested lock. After initialization, the lock is in
1700 an unlocked state and the nesting count is set to zero.
1703 @multitable @columnfractions .20 .80
1704 @item @emph{Prototype}: @tab @code{void omp_init_nest_lock(omp_nest_lock_t *lock);}
1707 @item @emph{Fortran}:
1708 @multitable @columnfractions .20 .80
1709 @item @emph{Interface}: @tab @code{subroutine omp_init_nest_lock(nvar)}
1710 @item @tab @code{integer(omp_nest_lock_kind), intent(out) :: nvar}
1713 @item @emph{See also}:
1714 @ref{omp_destroy_nest_lock}
1716 @item @emph{Reference}:
1717 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.3.1.
1721 @node omp_set_nest_lock
1722 @section @code{omp_set_nest_lock} -- Wait for and set nested lock
1724 @item @emph{Description}:
1725 Before setting a nested lock, the lock variable must be initialized by
1726 @code{omp_init_nest_lock}. The calling thread is blocked until the lock
1727 is available. If the lock is already held by the current thread, the
1728 nesting count for the lock is incremented.
1731 @multitable @columnfractions .20 .80
1732 @item @emph{Prototype}: @tab @code{void omp_set_nest_lock(omp_nest_lock_t *lock);}
1735 @item @emph{Fortran}:
1736 @multitable @columnfractions .20 .80
1737 @item @emph{Interface}: @tab @code{subroutine omp_set_nest_lock(nvar)}
1738 @item @tab @code{integer(omp_nest_lock_kind), intent(inout) :: nvar}
1741 @item @emph{See also}:
1742 @ref{omp_init_nest_lock}, @ref{omp_unset_nest_lock}
1744 @item @emph{Reference}:
1745 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.3.3.
1750 @node omp_test_nest_lock
1751 @section @code{omp_test_nest_lock} -- Test and set nested lock if available
1753 @item @emph{Description}:
1754 Before setting a nested lock, the lock variable must be initialized by
1755 @code{omp_init_nest_lock}. Contrary to @code{omp_set_nest_lock},
1756 @code{omp_test_nest_lock} does not block if the lock is not available.
1757 If the lock is already held by the current thread, the new nesting count
1758 is returned. Otherwise, the return value equals zero.
1761 @multitable @columnfractions .20 .80
1762 @item @emph{Prototype}: @tab @code{int omp_test_nest_lock(omp_nest_lock_t *lock);}
1765 @item @emph{Fortran}:
1766 @multitable @columnfractions .20 .80
1767 @item @emph{Interface}: @tab @code{logical function omp_test_nest_lock(nvar)}
1768 @item @tab @code{integer(omp_nest_lock_kind), intent(inout) :: nvar}
1772 @item @emph{See also}:
1773 @ref{omp_init_nest_lock}, @ref{omp_set_nest_lock}, @ref{omp_unset_nest_lock}
1775 @item @emph{Reference}:
1776 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.3.5.
1781 @node omp_unset_nest_lock
1782 @section @code{omp_unset_nest_lock} -- Unset nested lock
1784 @item @emph{Description}:
1785 A nested lock about to be unset must have been locked by @code{omp_set_nest_lock}
1786 or @code{omp_test_nest_lock} before. In addition, the lock must be held by the
1787 thread calling @code{omp_unset_nest_lock}. If the nesting count drops to zero, the
1788 lock becomes unlocked. If one or more threads attempted to set the lock before,
1789 one of them is chosen to acquire the lock.
1792 @multitable @columnfractions .20 .80
1793 @item @emph{Prototype}: @tab @code{void omp_unset_nest_lock(omp_nest_lock_t *lock);}
1796 @item @emph{Fortran}:
1797 @multitable @columnfractions .20 .80
1798 @item @emph{Interface}: @tab @code{subroutine omp_unset_nest_lock(nvar)}
1799 @item @tab @code{integer(omp_nest_lock_kind), intent(inout) :: nvar}
1802 @item @emph{See also}:
1803 @ref{omp_set_nest_lock}
1805 @item @emph{Reference}:
1806 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.3.4.
1811 @node omp_destroy_nest_lock
1812 @section @code{omp_destroy_nest_lock} -- Destroy nested lock
1814 @item @emph{Description}:
1815 Destroy a nested lock. In order to be destroyed, a nested lock must be
1816 in the unlocked state and its nesting count must equal zero.
1819 @multitable @columnfractions .20 .80
1820 @item @emph{Prototype}: @tab @code{void omp_destroy_nest_lock(omp_nest_lock_t *lock);}
1823 @item @emph{Fortran}:
1824 @multitable @columnfractions .20 .80
1825 @item @emph{Interface}: @tab @code{subroutine omp_destroy_nest_lock(nvar)}
1826 @item @tab @code{integer(omp_nest_lock_kind), intent(inout) :: nvar}
1829 @item @emph{See also}:
1832 @item @emph{Reference}:
1833 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.3.2.
1839 @section @code{omp_get_wtick} -- Get timer precision
1841 @item @emph{Description}:
1842 Gets the timer precision, i.e., the number of seconds between two
1843 successive clock ticks.
1846 @multitable @columnfractions .20 .80
1847 @item @emph{Prototype}: @tab @code{double omp_get_wtick(void);}
1850 @item @emph{Fortran}:
1851 @multitable @columnfractions .20 .80
1852 @item @emph{Interface}: @tab @code{double precision function omp_get_wtick()}
1855 @item @emph{See also}:
1858 @item @emph{Reference}:
1859 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.4.2.
1865 @section @code{omp_get_wtime} -- Elapsed wall clock time
1867 @item @emph{Description}:
1868 Elapsed wall clock time in seconds. The time is measured per thread; no
1869 guarantee can be made that two distinct threads measure the same time.
1870 Time is measured from ``some time in the past'', which is an arbitrary time
1871 guaranteed not to change during the execution of the program.
1874 @multitable @columnfractions .20 .80
1875 @item @emph{Prototype}: @tab @code{double omp_get_wtime(void);}
1878 @item @emph{Fortran}:
1879 @multitable @columnfractions .20 .80
1880 @item @emph{Interface}: @tab @code{double precision function omp_get_wtime()}
1883 @item @emph{See also}:
1886 @item @emph{Reference}:
1887 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.4.1.
1892 @c ---------------------------------------------------------------------
1893 @c OpenMP Environment Variables
1894 @c ---------------------------------------------------------------------
1896 @node Environment Variables
1897 @chapter OpenMP Environment Variables
1899 The environment variables beginning with @env{OMP_} are defined by
1900 section 4 of the OpenMP specification in version 4.0, while those
1901 beginning with @env{GOMP_} are GNU extensions.
1904 * OMP_CANCELLATION:: Set whether cancellation is activated
1905 * OMP_DISPLAY_ENV:: Show OpenMP version and environment variables
1906 * OMP_DEFAULT_DEVICE:: Set the device used in target regions
1907 * OMP_DYNAMIC:: Dynamic adjustment of threads
1908 * OMP_MAX_ACTIVE_LEVELS:: Set the maximum number of nested parallel regions
1909 * OMP_NESTED:: Nested parallel regions
1910 * OMP_NUM_THREADS:: Specifies the number of threads to use
1911 * OMP_PROC_BIND:: Whether threads may be moved between CPUs
1912 * OMP_PLACES:: Specifies on which CPUs the threads should be placed
1913 * OMP_STACKSIZE:: Set default thread stack size
1914 * OMP_SCHEDULE:: How threads are scheduled
1915 * OMP_THREAD_LIMIT:: Set the maximum number of threads
1916 * OMP_WAIT_POLICY:: How waiting threads are handled
1917 * GOMP_CPU_AFFINITY:: Bind threads to specific CPUs
1918 * GOMP_DEBUG:: Enable debugging output
1919 * GOMP_STACKSIZE:: Set default thread stack size
1920 * GOMP_SPINCOUNT:: Set the busy-wait spin count
1924 @node OMP_CANCELLATION
1925 @section @env{OMP_CANCELLATION} -- Set whether cancellation is activated
1926 @cindex Environment Variable
1928 @item @emph{Description}:
1929 If set to @code{TRUE}, cancellation is activated. If set to @code{FALSE} or
1930 if unset, cancellation is disabled and the @code{cancel} construct is ignored.
1932 @item @emph{See also}:
1933 @ref{omp_get_cancellation}
1935 @item @emph{Reference}:
1936 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 4.11
1941 @node OMP_DISPLAY_ENV
1942 @section @env{OMP_DISPLAY_ENV} -- Show OpenMP version and environment variables
1943 @cindex Environment Variable
1945 @item @emph{Description}:
1946 If set to @code{TRUE}, the OpenMP version number and the values
1947 associated with the OpenMP environment variables are printed to @code{stderr}.
1948 If set to @code{VERBOSE}, it additionally shows the value of the environment
1949 variables which are GNU extensions. If undefined or set to @code{FALSE},
1950 this information will not be shown.
1953 @item @emph{Reference}:
1954 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 4.12
1959 @node OMP_DEFAULT_DEVICE
1960 @section @env{OMP_DEFAULT_DEVICE} -- Set the device used in target regions
1961 @cindex Environment Variable
1963 @item @emph{Description}:
1964 Set to choose the device which is used in a @code{target} region, unless the
1965 value is overridden by @code{omp_set_default_device} or by a @code{device}
1966 clause. The value shall be a nonnegative device number. If no device with
1967 the given device number exists, the code is executed on the host. If unset,
1968 device number 0 will be used.
1971 @item @emph{See also}:
1972 @ref{omp_get_default_device}, @ref{omp_set_default_device},
1974 @item @emph{Reference}:
1975 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 4.11
1981 @section @env{OMP_DYNAMIC} -- Dynamic adjustment of threads
1982 @cindex Environment Variable
1984 @item @emph{Description}:
1985 Enable or disable the dynamic adjustment of the number of threads
1986 within a team. The value of this environment variable shall be
1987 @code{TRUE} or @code{FALSE}. If undefined, dynamic adjustment is
1988 disabled by default.
1990 @item @emph{See also}:
1991 @ref{omp_set_dynamic}
1993 @item @emph{Reference}:
1994 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 4.3
1999 @node OMP_MAX_ACTIVE_LEVELS
2000 @section @env{OMP_MAX_ACTIVE_LEVELS} -- Set the maximum number of nested parallel regions
2001 @cindex Environment Variable
2003 @item @emph{Description}:
2004 Specifies the initial value for the maximum number of nested parallel
2005 regions. The value of this variable shall be a positive integer.
2006 If undefined, the number of active levels is unlimited.
2008 @item @emph{See also}:
2009 @ref{omp_set_max_active_levels}
2011 @item @emph{Reference}:
2012 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 4.9
2018 @section @env{OMP_NESTED} -- Nested parallel regions
2019 @cindex Environment Variable
2020 @cindex Implementation specific setting
2022 @item @emph{Description}:
2023 Enable or disable nested parallel regions, i.e., whether team members
2024 are allowed to create new teams. The value of this environment variable
2025 shall be @code{TRUE} or @code{FALSE}. If undefined, nested parallel
2026 regions are disabled by default.
2028 @item @emph{See also}:
2029 @ref{omp_set_nested}
2031 @item @emph{Reference}:
2032 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 4.6
2037 @node OMP_NUM_THREADS
2038 @section @env{OMP_NUM_THREADS} -- Specifies the number of threads to use
2039 @cindex Environment Variable
2040 @cindex Implementation specific setting
2042 @item @emph{Description}:
2043 Specifies the default number of threads to use in parallel regions. The
2044 value of this variable shall be a comma-separated list of positive integers;
2045 each value specifies the number of threads to use for the corresponding nested
2046 level. If undefined, one thread per CPU is used.
2048 @item @emph{See also}:
2049 @ref{omp_set_num_threads}
2051 @item @emph{Reference}:
2052 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 4.2
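For instance, the nesting-level list can be set at program start like this (@code{./my_program} is a hypothetical executable):

```shell
# 4 threads at the outermost level, 2 threads for a nested
# parallel region one level deeper.
OMP_NUM_THREADS=4,2 ./my_program
```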
2058 @section @env{OMP_PROC_BIND} -- Whether threads may be moved between CPUs
2059 @cindex Environment Variable
2061 @item @emph{Description}:
2062 Specifies whether threads may be moved between processors. If set to
2063 @code{TRUE}, OpenMP threads should not be moved; if set to @code{FALSE}
2064 they may be moved. Alternatively, a comma separated list with the
2065 values @code{MASTER}, @code{CLOSE} and @code{SPREAD} can be used to specify
2066 the thread affinity policy for the corresponding nesting level. With
2067 @code{MASTER} the worker threads are in the same place partition as the
2068 master thread. With @code{CLOSE} those are kept close to the master thread
2069 in contiguous place partitions. And with @code{SPREAD} a sparse distribution
2070 across the place partitions is used.
2072 When undefined, @env{OMP_PROC_BIND} defaults to @code{TRUE} when
2073 @env{OMP_PLACES} or @env{GOMP_CPU_AFFINITY} is set and @code{FALSE} otherwise.
2075 @item @emph{See also}:
2076 @ref{OMP_PLACES}, @ref{GOMP_CPU_AFFINITY}, @ref{omp_get_proc_bind}
2078 @item @emph{Reference}:
2079 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 4.4
2085 @section @env{OMP_PLACES} -- Specifies on which CPUs the threads should be placed
2086 @cindex Environment Variable
2088 @item @emph{Description}:
2089 The thread placement can be either specified using an abstract name or by an
2090 explicit list of the places. The abstract names @code{threads}, @code{cores}
2091 and @code{sockets} can be optionally followed by a positive number in
2092 parentheses, which denotes how many places shall be created. With
2093 @code{threads} each place corresponds to a single hardware thread; @code{cores}
2094 to a single core with the corresponding number of hardware threads; and with
2095 @code{sockets} the place corresponds to a single socket. The resulting
2096 placement can be shown by setting the @env{OMP_DISPLAY_ENV} environment
2099 Alternatively, the placement can be specified explicitly as a comma-separated
2100 list of places. A place is specified by a set of nonnegative numbers in curly
2101 braces, denoting the hardware threads. The hardware threads
2102 belonging to a place can either be specified as a comma-separated list of
2103 nonnegative thread numbers or using an interval. Multiple places can also be
2104 either specified by a comma-separated list of places or by an interval. To
2105 specify an interval, a colon followed by the count is placed after
2106 the hardware thread number or the place. Optionally, the length can be
2107 followed by a colon and the stride number -- otherwise a unit stride is
2108 assumed. For instance, the following specify the same places list:
2109 @code{"@{0,1,2@}, @{3,4,5@}, @{6,7,8@}, @{9,10,11@}"};
2110 @code{"@{0:3@}, @{3:3@}, @{6:3@}, @{9:3@}"}; and @code{"@{0:3@}:4:3"}.
2112 If @env{OMP_PLACES} and @env{GOMP_CPU_AFFINITY} are unset and
2113 @env{OMP_PROC_BIND} is either unset or @code{false}, threads may be moved
2114 between CPUs following no placement policy.
2116 @item @emph{See also}:
2117 @ref{OMP_PROC_BIND}, @ref{GOMP_CPU_AFFINITY}, @ref{omp_get_proc_bind},
2118 @ref{OMP_DISPLAY_ENV}
2120 @item @emph{Reference}:
2121 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 4.5
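The interval syntax described above can be tried from a shell (@code{./my_program} is a hypothetical executable; the machine is assumed to have at least 12 hardware threads):

```shell
# Three equivalent ways of specifying the same four places,
# each covering three consecutive hardware threads:
OMP_PLACES="{0,1,2},{3,4,5},{6,7,8},{9,10,11}" ./my_program
OMP_PLACES="{0:3},{3:3},{6:3},{9:3}"           ./my_program
OMP_PLACES="{0:3}:4:3"                         ./my_program

# Or use an abstract name: one place per physical core.
OMP_PLACES=cores ./my_program
```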
@node OMP_STACKSIZE
@section @env{OMP_STACKSIZE} -- Set default thread stack size
@cindex Environment Variable

@item @emph{Description}:
Set the default thread stack size in kilobytes, unless the number
is suffixed by @code{B}, @code{K}, @code{M} or @code{G}, in which
case the size is, respectively, in bytes, kilobytes, megabytes
or gigabytes. This is different from @code{pthread_attr_setstacksize},
which takes the number of bytes as an argument. If the stack size cannot
be set due to system constraints, an error is reported and the initial
stack size is left unchanged. If undefined, the stack size is system
dependent.
@item @emph{Reference}:
@uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 4.7
@node OMP_SCHEDULE
@section @env{OMP_SCHEDULE} -- How threads are scheduled
@cindex Environment Variable
@cindex Implementation specific setting

@item @emph{Description}:
Allows specifying the @code{schedule type} and @code{chunk size}.
The value of the variable shall have the form @code{type[,chunk]}, where
@code{type} is one of @code{static}, @code{dynamic}, @code{guided} or
@code{auto}. The optional @code{chunk} size shall be a positive integer.
If undefined, dynamic scheduling and a chunk size of 1 are used.

@item @emph{See also}:
@ref{omp_set_schedule}

@item @emph{Reference}:
@uref{http://www.openmp.org/, OpenMP specification v4.0}, Sections 2.7.1 and 4.1
@node OMP_THREAD_LIMIT
@section @env{OMP_THREAD_LIMIT} -- Set the maximum number of threads
@cindex Environment Variable

@item @emph{Description}:
Specifies the number of threads to use for the whole program. The
value of this variable shall be a positive integer. If undefined,
the number of threads is not limited.

@item @emph{See also}:
@ref{OMP_NUM_THREADS}, @ref{omp_get_thread_limit}

@item @emph{Reference}:
@uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 4.10
@node OMP_WAIT_POLICY
@section @env{OMP_WAIT_POLICY} -- How waiting threads are handled
@cindex Environment Variable

@item @emph{Description}:
Specifies whether waiting threads should be active or passive. If
the value is @code{PASSIVE}, waiting threads should not consume CPU
power while waiting; the value @code{ACTIVE} specifies that they
should. If undefined, threads wait actively for a short time
before waiting passively.

@item @emph{See also}:
@ref{GOMP_SPINCOUNT}

@item @emph{Reference}:
@uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 4.8
@node GOMP_CPU_AFFINITY
@section @env{GOMP_CPU_AFFINITY} -- Bind threads to specific CPUs
@cindex Environment Variable

@item @emph{Description}:
Binds threads to specific CPUs. The variable should contain a space-separated
or comma-separated list of CPUs. This list may contain different kinds of
entries: either single CPU numbers in any order, a range of CPUs (M-N)
or a range with some stride (M-N:S). CPU numbers are zero based. For example,
@code{GOMP_CPU_AFFINITY="0 3 1-2 4-15:2"} will bind the initial thread
to CPU 0, the second to CPU 3, the third to CPU 1, the fourth to
CPU 2, the fifth to CPU 4, the sixth through tenth to CPUs 6, 8, 10, 12,
and 14 respectively and then start assigning back from the beginning of
the list. @code{GOMP_CPU_AFFINITY=0} binds all threads to CPU 0.
There is no GNU OpenMP library routine to determine whether a CPU affinity
specification is in effect. As a workaround, language-specific library
functions, e.g., @code{getenv} in C or @code{GET_ENVIRONMENT_VARIABLE} in
Fortran, may be used to query the setting of the @code{GOMP_CPU_AFFINITY}
environment variable. A CPU affinity that is defined at startup cannot be
changed or disabled during the runtime of the application.
If both @env{GOMP_CPU_AFFINITY} and @env{OMP_PROC_BIND} are set,
@env{OMP_PROC_BIND} has a higher precedence. If @env{GOMP_CPU_AFFINITY} is
unset and @env{OMP_PROC_BIND} is either unset or set to @code{FALSE}, the
host system will handle the assignment of threads to CPUs.
@item @emph{See also}:
@ref{OMP_PLACES}, @ref{OMP_PROC_BIND}
@node GOMP_DEBUG
@section @env{GOMP_DEBUG} -- Enable debugging output
@cindex Environment Variable

@item @emph{Description}:
Enable debugging output. The variable should be set to @code{0}
(disabled, also the default if not set), or @code{1} (enabled).

If enabled, some debugging output will be printed during execution.
This is currently not specified in more detail, and subject to change.
@node GOMP_STACKSIZE
@section @env{GOMP_STACKSIZE} -- Set default thread stack size
@cindex Environment Variable
@cindex Implementation specific setting

@item @emph{Description}:
Set the default thread stack size in kilobytes. This is different from
@code{pthread_attr_setstacksize}, which takes the number of bytes as an
argument. If the stack size cannot be set due to system constraints, an
error is reported and the initial stack size is left unchanged. If undefined,
the stack size is system dependent.

@item @emph{See also}:
@ref{OMP_STACKSIZE}

@item @emph{Reference}:
@uref{http://gcc.gnu.org/ml/gcc-patches/2006-06/msg00493.html,
GCC Patches Mailinglist},
@uref{http://gcc.gnu.org/ml/gcc-patches/2006-06/msg00496.html,
GCC Patches Mailinglist}
@node GOMP_SPINCOUNT
@section @env{GOMP_SPINCOUNT} -- Set the busy-wait spin count
@cindex Environment Variable
@cindex Implementation specific setting

@item @emph{Description}:
Determines how long a thread waits actively, consuming CPU power,
before waiting passively without consuming CPU power. The value may be
either @code{INFINITE} or @code{INFINITY} to always wait actively, or an
integer which gives the number of spins of the busy-wait loop. The
integer may optionally be followed by the following suffixes acting
as multiplication factors: @code{k} (kilo, thousand), @code{M} (mega,
million), @code{G} (giga, billion), or @code{T} (tera, trillion).
If undefined, 0 is used when @env{OMP_WAIT_POLICY} is @code{PASSIVE},
300,000 is used when @env{OMP_WAIT_POLICY} is undefined, and
30 billion is used when @env{OMP_WAIT_POLICY} is @code{ACTIVE}.
If there are more OpenMP threads than available CPUs, 1000 and 100
spins are used for @env{OMP_WAIT_POLICY} being @code{ACTIVE} or
undefined, respectively; unless @env{GOMP_SPINCOUNT} is lower
or @env{OMP_WAIT_POLICY} is @code{PASSIVE}.
@item @emph{See also}:
@ref{OMP_WAIT_POLICY}
@c ---------------------------------------------------------------------
@c ---------------------------------------------------------------------

@node The libgomp ABI
@chapter The libgomp ABI

The following sections present notes on the external ABI as
presented by libgomp. Only maintainers should need them.

@menu
* Implementing MASTER construct::
* Implementing CRITICAL construct::
* Implementing ATOMIC construct::
* Implementing FLUSH construct::
* Implementing BARRIER construct::
* Implementing THREADPRIVATE construct::
* Implementing PRIVATE clause::
* Implementing FIRSTPRIVATE LASTPRIVATE COPYIN and COPYPRIVATE clauses::
* Implementing REDUCTION clause::
* Implementing PARALLEL construct::
* Implementing FOR construct::
* Implementing ORDERED construct::
* Implementing SECTIONS construct::
* Implementing SINGLE construct::
* Implementing OpenACC's PARALLEL construct::
@end menu
@node Implementing MASTER construct
@section Implementing MASTER construct

if (omp_get_thread_num () == 0)

Alternately, we generate two copies of the parallel subfunction
and only include this in the version run by the master thread.
Surely this is not worthwhile though...
@node Implementing CRITICAL construct
@section Implementing CRITICAL construct

Without a specified name,

void GOMP_critical_start (void);
void GOMP_critical_end (void);

so that we don't get COPY relocations from libgomp to the main
application.

With a specified name, use omp_set_lock and omp_unset_lock with
name being transformed into a variable declared like

omp_lock_t gomp_critical_user_<name> __attribute__((common))

Ideally the ABI would specify that all zero is a valid unlocked
state, and so we wouldn't need to initialize this at
startup.
@node Implementing ATOMIC construct
@section Implementing ATOMIC construct

The target should implement the @code{__sync} builtins.

Failing that we could add

void GOMP_atomic_enter (void)
void GOMP_atomic_exit (void)

which reuses the regular lock code, but with yet another lock
object private to the library.
@node Implementing FLUSH construct
@section Implementing FLUSH construct

Expands to the @code{__sync_synchronize} builtin.

@node Implementing BARRIER construct
@section Implementing BARRIER construct

void GOMP_barrier (void)
@node Implementing THREADPRIVATE construct
@section Implementing THREADPRIVATE construct

In _most_ cases we can map this directly to @code{__thread}. Except
that OMP allows constructors for C++ objects. We can either
refuse to support this (how often is it used?) or we can
implement something akin to .ctors.

Even more ideally, this ctor feature is handled by extensions
to the main pthreads library. Failing that, we can have a set
of entry points to register ctor functions to be called.
@node Implementing PRIVATE clause
@section Implementing PRIVATE clause

In association with a PARALLEL, or within the lexical extent
of a PARALLEL block, the variable becomes a local variable in
the parallel subfunction.

In association with FOR or SECTIONS blocks, create a new
automatic variable within the current function. This preserves
the semantics of new variable creation.
@node Implementing FIRSTPRIVATE LASTPRIVATE COPYIN and COPYPRIVATE clauses
@section Implementing FIRSTPRIVATE LASTPRIVATE COPYIN and COPYPRIVATE clauses

This seems simple enough for PARALLEL blocks. Create a private
struct for communicating between the parent and subfunction.
In the parent, copy in values for scalar and "small" structs;
copy in addresses for other TREE_ADDRESSABLE types. In the
subfunction, copy the value into the local variable.
It is not clear what to do with bare FOR or SECTION blocks.
The only thing I can figure is that we do something like:

#pragma omp for firstprivate(x) lastprivate(y)
for (int i = 0; i < n; ++i)

where the "x=x" and "y=y" assignments actually have different
uids for the two variables, i.e. not something you could write
directly in C. Presumably this only makes sense if the "outer"
x and y are global variables.

COPYPRIVATE would work the same way, except the structure
broadcast would have to happen via SINGLE machinery instead.
@node Implementing REDUCTION clause
@section Implementing REDUCTION clause

The private struct mentioned in the previous section should have
a pointer to an array of the type of the variable, indexed by the
thread's @var{team_id}. The thread stores its final value into the
array, and after the barrier, the master thread iterates over the
array to collect the values.
@node Implementing PARALLEL construct
@section Implementing PARALLEL construct

#pragma omp parallel

void subfunction (void *data)

GOMP_parallel_start (subfunction, &data, num_threads);
subfunction (&data);
GOMP_parallel_end ();

void GOMP_parallel_start (void (*fn)(void *), void *data, unsigned num_threads)

The @var{FN} argument is the subfunction to be run in parallel.

The @var{DATA} argument is a pointer to a structure used to
communicate data in and out of the subfunction, as discussed
above with respect to FIRSTPRIVATE et al.

The @var{NUM_THREADS} argument is 1 if an IF clause is present
and false, or the value of the NUM_THREADS clause, if present,
or 0.

The function needs to create the appropriate number of
threads and/or launch them from the dock. It needs to
create the team structure and assign team ids.

void GOMP_parallel_end (void)

Tears down the team and returns us to the previous
@code{omp_in_parallel()} state.
@node Implementing FOR construct
@section Implementing FOR construct

#pragma omp parallel for
for (i = lb; i <= ub; i++)

void subfunction (void *data)

while (GOMP_loop_static_next (&_s0, &_e0))

for (i = _s0; i < _e0; i++)

GOMP_loop_end_nowait ();

GOMP_parallel_loop_static (subfunction, NULL, 0, lb, ub+1, 1, 0);
GOMP_parallel_end ();

#pragma omp for schedule(runtime)
for (i = 0; i < n; i++)

if (GOMP_loop_runtime_start (0, n, 1, &_s0, &_e0))

for (i = _s0; i < _e0; i++)

@} while (GOMP_loop_runtime_next (&_s0, &_e0));

Note that while it looks like there is trickiness to propagating
a non-constant STEP, there isn't really. We're explicitly allowed
to evaluate it as many times as we want, and any variables involved
should automatically be handled as PRIVATE or SHARED like any other
variables. So the expression should remain evaluable in the
subfunction. We can also pull it into a local variable if we like,
but since it's supposed to remain unchanged, we need not do so.
If we have SCHEDULE(STATIC), and no ORDERED, then we ought to be
able to get away with no work-sharing context at all, since we can
simply perform the arithmetic directly in each thread to divide up
the iterations. Which would mean that we wouldn't need to call any
of these routines.
There are separate routines for handling loops with an ORDERED
clause. Bookkeeping for that is non-trivial...
@node Implementing ORDERED construct
@section Implementing ORDERED construct

void GOMP_ordered_start (void)
void GOMP_ordered_end (void)

@node Implementing SECTIONS construct
@section Implementing SECTIONS construct

#pragma omp sections

for (i = GOMP_sections_start (3); i != 0; i = GOMP_sections_next ())
@node Implementing SINGLE construct
@section Implementing SINGLE construct

if (GOMP_single_start ())

#pragma omp single copyprivate(x)

datap = GOMP_single_copy_start ();

GOMP_single_copy_end (&data);
@node Implementing OpenACC's PARALLEL construct
@section Implementing OpenACC's PARALLEL construct

void GOACC_parallel ()
@c ---------------------------------------------------------------------
@c ---------------------------------------------------------------------

@node Reporting Bugs
@chapter Reporting Bugs

Bugs in the GNU OpenACC or OpenMP implementation should be reported via
@uref{http://gcc.gnu.org/bugzilla/, Bugzilla}. As appropriate, please
add "openacc", or "openmp", or both to the keywords field in the bug
report.
@c ---------------------------------------------------------------------
@c GNU General Public License
@c ---------------------------------------------------------------------

@include gpl_v3.texi

@c ---------------------------------------------------------------------
@c GNU Free Documentation License
@c ---------------------------------------------------------------------

@c ---------------------------------------------------------------------
@c Funding Free Software
@c ---------------------------------------------------------------------

@include funding.texi

@c ---------------------------------------------------------------------
@c ---------------------------------------------------------------------

@unnumbered Library Index