libgomp/libgomp.texi

   1 \input texinfo @c -*-texinfo-*-
   2
   3 @c %**start of header
   4 @setfilename libgomp.info
   5 @settitle GNU libgomp
   6 @c %**end of header
   7
   8
   9 @copying
  10 Copyright @copyright{} 2006-2014 Free Software Foundation, Inc.
  11
  12 Permission is granted to copy, distribute and/or modify this document
  13 under the terms of the GNU Free Documentation License, Version 1.3 or
  14 any later version published by the Free Software Foundation; with the
  15 Invariant Sections being ``Funding Free Software'', the Front-Cover
  16 texts being (a) (see below), and with the Back-Cover Texts being (b)
  17 (see below).  A copy of the license is included in the section entitled
  18 ``GNU Free Documentation License''.
  19
  20 (a) The FSF's Front-Cover Text is:
  21
  22      A GNU Manual
  23
  24 (b) The FSF's Back-Cover Text is:
  25
  26      You have freedom to copy and modify this GNU Manual, like GNU
  27      software.  Copies published by the Free Software Foundation raise
  28      funds for GNU development.
  29 @end copying
  30
  31 @ifinfo
  32 @dircategory GNU Libraries
  33 @direntry
  34 * libgomp: (libgomp).                    GNU OpenACC and OpenMP runtime library
  35 @end direntry
  36
  37 This manual documents the GNU implementation of the OpenACC API for
  38 offloading of code to accelerator devices in C/C++ and Fortran and
  39 the GNU implementation of the OpenMP API for
  40 multi-platform shared-memory parallel programming in C/C++ and Fortran.
  41
  42 Published by the Free Software Foundation
  43 51 Franklin Street, Fifth Floor
  44 Boston, MA 02110-1301 USA
  45
  46 @insertcopying
  47 @end ifinfo
  48
  49
  50 @setchapternewpage odd
  51
  52 @titlepage
  53 @title The GNU OpenACC and OpenMP Implementation
  54 @page
  55 @vskip 0pt plus 1filll
  56 @comment For the @value{version-GCC} Version*
  57 @sp 1
  58 Published by the Free Software Foundation @*
  59 51 Franklin Street, Fifth Floor@*
  60 Boston, MA 02110-1301, USA@*
  61 @sp 1
  62 @insertcopying
  63 @end titlepage
  64
  65 @summarycontents
  66 @contents
  67 @page
  68
  69
  70 @node Top
  71 @top Introduction
  72 @cindex Introduction
  73
  74 This manual documents the usage of libgomp, the GNU implementation of the
  75 @uref{http://www.openacc.org/, OpenACC} Application Programming Interface (API)
  76 for offloading of code to accelerator devices in C/C++ and Fortran, and
  77 the GNU implementation of the
  78 @uref{http://www.openmp.org, OpenMP} Application Programming Interface (API)
  79 for multi-platform shared-memory parallel programming in C/C++ and Fortran.
  80
  81
  82
  83 @comment
  84 @comment  When you add a new menu item, please keep the right hand
  85 @comment  aligned to the same column.  Do not use tabs.  This provides
  86 @comment  better formatting.
  87 @comment
  88 @menu
  89 * Enabling OpenACC::                 How to enable OpenACC for your
  90                                      applications.
  91 * OpenACC Runtime Library Routines:: The OpenACC runtime application
  92                                       programming interface.
  93 * OpenACC Environment Variables::    Influencing OpenACC runtime behavior with
  94                                      environment variables.
  95 * OpenACC Library Interoperability:: OpenACC library interoperability with the
  96                                      NVIDIA CUBLAS library.
  97 * Enabling OpenMP::                  How to enable OpenMP for your
  98                                      applications.
  99 * OpenMP Runtime Library Routines: Runtime Library Routines.
 100                                      The OpenMP runtime application programming
 101                                      interface.
 102 * OpenMP Environment Variables: Environment Variables.
 103                                      Influencing OpenMP runtime behavior with
 104                                      environment variables.
 105 * The libgomp ABI::                  Notes on the external libgomp ABI.
 106 * Reporting Bugs::                   How to report bugs.
 107 * Copying::                          GNU general public license says how you
 108                                      can copy and share libgomp.
 109 * GNU Free Documentation License::   How you can copy and share this manual.
 110 * Funding::                          How to help assure continued work for free
 111                                      software.
 112 * Library Index::                    Index of this documentation.
 113 @end menu
 114
 115
 116
 117 @c ---------------------------------------------------------------------
 118 @c Enabling OpenACC
 119 @c ---------------------------------------------------------------------
 120
 121 @node Enabling OpenACC
 122 @chapter Enabling OpenACC
 123
 124 To activate the OpenACC extensions for C/C++ and Fortran, the compile-time
 125 flag @command{-fopenacc} must be specified.  This enables OpenACC, and
 126 arranges for automatic linking of the OpenACC runtime library
 127 (@ref{OpenACC Runtime Library Routines}).
 128
 129 A complete description of all OpenACC directives accepted may be found in
 130 the @uref{http://www.openacc.org/, OpenACC Application Programming
 131 Interface} manual, version 2.0.
 132
 133
 134 @c ---------------------------------------------------------------------
 135 @c OpenACC Runtime Library Routines
 136 @c ---------------------------------------------------------------------
 137
 138 @node OpenACC Runtime Library Routines
 139 @chapter OpenACC Runtime Library Routines
 140
 141 The runtime routines described here are defined by section 3 of the OpenACC
 142 specification in version 2.0.
 143 They have C linkage, and do not throw exceptions.
 144 Generally, they are available only for the host, with the exception of
 145 @code{acc_on_device}, which is available for both the host and the
 146 acceleration device.
 147
 148 @menu
 149 * acc_get_num_devices::         Get number of devices for the given device type
 150 * acc_set_device_type::
 151 * acc_get_device_type::
 152 * acc_set_device_num::
 153 * acc_get_device_num::
 154 * acc_init::
 155 * acc_shutdown::
 156 * acc_on_device::               Whether executing on a particular device
 157 * acc_malloc::
 158 * acc_free::
 159 * acc_copyin::
 160 * acc_present_or_copyin::
 161 * acc_create::
 162 * acc_present_or_create::
 163 * acc_copyout::
 164 * acc_delete::
 165 * acc_update_device::
 166 * acc_update_self::
 167 * acc_map_data::
 168 * acc_unmap_data::
 169 * acc_deviceptr::
 170 * acc_hostptr::
 171 * acc_is_present::
 172 * acc_memcpy_to_device::
 173 * acc_memcpy_from_device::
 174 @end menu
 175
 176 API routines for target platforms.
 177
 178 @menu
 179 * acc_get_current_cuda_device::
 180 * acc_get_current_cuda_context::
 181 * acc_get_cuda_stream::
 182 * acc_set_cuda_stream::
 183 @end menu
 184
 185
 186
 187 @node acc_get_num_devices
 188 @section @code{acc_get_num_devices} -- Get number of devices for given device type
 189 @table @asis
 190 @item @emph{Description}:
 191 This routine returns a value indicating the
 192 number of devices available for the given device type.  It determines
 193 the number of devices in a @emph{passive} manner.  In other words, it
 194 does not alter the state within the runtime environment aside from
 195 possibly initializing an uninitialized device.  This aspect allows
 196 the routine to be called without concern for altering the interaction
 197 with an attached accelerator device.
 198
 199 @item @emph{Reference}:
 200 @uref{http://www.openacc.org/, OpenACC specification v2.0}, section
 201 3.2.1.
 202 @end table
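
As a minimal sketch of how the device count might be queried, the following
wrapper calls @code{acc_get_num_devices} for the @code{acc_device_nvidia}
device type used elsewhere in this manual.  The function name
@code{count_nvidia_devices} and the fallback value of zero for builds
without @command{-fopenacc} are assumptions made for illustration only.

```c
#ifdef _OPENACC
#include <openacc.h>   /* only needed when compiling with -fopenacc */
#endif

/* Hypothetical helper: number of NVIDIA devices reported by the OpenACC
   runtime, or 0 when the file is compiled without -fopenacc.  */
int count_nvidia_devices(void)
{
#ifdef _OPENACC
    return acc_get_num_devices(acc_device_nvidia);
#else
    return 0;
#endif
}
```

A result of zero suggests that no device of that type is usable, so a
program could fall back to host execution.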
 203
 204
 205
 206 @node acc_set_device_type
 207 @section @code{acc_set_device_type}
 208 @table @asis
 209 @item @emph{Reference}:
 210 @uref{http://www.openacc.org/, OpenACC specification v2.0}, section
 211 3.2.2.
 212 @end table
 213
 214
 215
 216 @node acc_get_device_type
 217 @section @code{acc_get_device_type}
 218 @table @asis
 219 @item @emph{Reference}:
 220 @uref{http://www.openacc.org/, OpenACC specification v2.0}, section
 221 3.2.3.
 222 @end table
 223
 224
 225
 226 @node acc_set_device_num
 227 @section @code{acc_set_device_num}
 228 @table @asis
 229 @item @emph{Reference}:
 230 @uref{http://www.openacc.org/, OpenACC specification v2.0}, section
 231 3.2.4.
 232 @end table
 233
 234
 235
 236 @node acc_get_device_num
 237 @section @code{acc_get_device_num}
 238 @table @asis
 239 @item @emph{Reference}:
 240 @uref{http://www.openacc.org/, OpenACC specification v2.0}, section
 241 3.2.5.
 242 @end table
 243
 244
 245
 246 @node acc_init
 247 @section @code{acc_init}
 248 @table @asis
 249 @item @emph{Reference}:
 250 @uref{http://www.openacc.org/, OpenACC specification v2.0}, section
 251 3.2.12.
 252 @end table
 253
 254
 255
 256 @node acc_shutdown
 257 @section @code{acc_shutdown}
 258 @table @asis
 259 @item @emph{Reference}:
 260 @uref{http://www.openacc.org/, OpenACC specification v2.0}, section
 261 3.2.13.
 262 @end table
 263
 264
 265
 266 @node acc_on_device
 267 @section @code{acc_on_device} -- Whether executing on a particular device
 268 @table @asis
 269 @item @emph{Description}:
 270 This routine tells the program whether it is executing on a particular
 271 device.  Based on the argument passed, GCC tries to evaluate this to a
 272 constant at compile time, but library functions are also provided, for
 273 both the host and the acceleration device.
 274
 275 @item @emph{Reference}:
 276 @uref{http://www.openacc.org/, OpenACC specification v2.0}, section
 277 3.2.14.
 278 @end table
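
The check described above can be sketched as follows.  The helper name
@code{running_on_host} is hypothetical, and returning 1 when OpenACC is
disabled is an assumption of this sketch rather than behavior defined by
the library.

```c
#ifdef _OPENACC
#include <openacc.h>   /* only needed when compiling with -fopenacc */
#endif

/* Hypothetical helper: nonzero when the caller executes on the host.
   Without -fopenacc all code runs on the host, so 1 is returned.  */
int running_on_host(void)
{
#ifdef _OPENACC
    return acc_on_device(acc_device_host);
#else
    return 1;
#endif
}
```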
 279
 280
 281
 282 @node acc_malloc
 283 @section @code{acc_malloc}
 284 @table @asis
 285 @item @emph{Reference}:
 286 @uref{http://www.openacc.org/, OpenACC specification v2.0}, section
 287 3.2.15.
 288 @end table
 289
 290
 291
 292 @node acc_free
 293 @section @code{acc_free}
 294 @table @asis
 295 @item @emph{Reference}:
 296 @uref{http://www.openacc.org/, OpenACC specification v2.0}, section
 297 3.2.16.
 298 @end table
 299
 300
 301
 302 @node acc_copyin
 303 @section @code{acc_copyin}
 304 @table @asis
 305 @item @emph{Reference}:
 306 @uref{http://www.openacc.org/, OpenACC specification v2.0}, section
 307 3.2.17.
 308 @end table
 309
 310
 311
 312 @node acc_present_or_copyin
 313 @section @code{acc_present_or_copyin}
 314 @table @asis
 315 @item @emph{Reference}:
 316 @uref{http://www.openacc.org/, OpenACC specification v2.0}, section
 317 3.2.18.
 318 @end table
 319
 320
 321
 322 @node acc_create
 323 @section @code{acc_create}
 324 @table @asis
 325 @item @emph{Reference}:
 326 @uref{http://www.openacc.org/, OpenACC specification v2.0}, section
 327 3.2.19.
 328 @end table
 329
 330
 331
 332 @node acc_present_or_create
 333 @section @code{acc_present_or_create}
 334 @table @asis
 335 @item @emph{Reference}:
 336 @uref{http://www.openacc.org/, OpenACC specification v2.0}, section
 337 3.2.20.
 338 @end table
 339
 340
 341
 342 @node acc_copyout
 343 @section @code{acc_copyout}
 344 @table @asis
 345 @item @emph{Reference}:
 346 @uref{http://www.openacc.org/, OpenACC specification v2.0}, section
 347 3.2.21.
 348 @end table
 349
 350
 351
 352 @node acc_delete
 353 @section @code{acc_delete}
 354 @table @asis
 355 @item @emph{Reference}:
 356 @uref{http://www.openacc.org/, OpenACC specification v2.0}, section
 357 3.2.22.
 358 @end table
 359
 360
 361
 362 @node acc_update_device
 363 @section @code{acc_update_device}
 364 @table @asis
 365 @item @emph{Reference}:
 366 @uref{http://www.openacc.org/, OpenACC specification v2.0}, section
 367 3.2.23.
 368 @end table
 369
 370
 371
 372 @node acc_update_self
 373 @section @code{acc_update_self}
 374 @table @asis
 375 @item @emph{Reference}:
 376 @uref{http://www.openacc.org/, OpenACC specification v2.0}, section
 377 3.2.24.
 378 @end table
 379
 380
 381
 382 @node acc_map_data
 383 @section @code{acc_map_data}
 384 @table @asis
 385 @item @emph{Reference}:
 386 @uref{http://www.openacc.org/, OpenACC specification v2.0}, section
 387 3.2.25.
 388 @end table
 389
 390
 391
 392 @node acc_unmap_data
 393 @section @code{acc_unmap_data}
 394 @table @asis
 395 @item @emph{Reference}:
 396 @uref{http://www.openacc.org/, OpenACC specification v2.0}, section
 397 3.2.26.
 398 @end table
 399
 400
 401
 402 @node acc_deviceptr
 403 @section @code{acc_deviceptr}
 404 @table @asis
 405 @item @emph{Reference}:
 406 @uref{http://www.openacc.org/, OpenACC specification v2.0}, section
 407 3.2.27.
 408 @end table
 409
 410
 411
 412 @node acc_hostptr
 413 @section @code{acc_hostptr}
 414 @table @asis
 415 @item @emph{Reference}:
 416 @uref{http://www.openacc.org/, OpenACC specification v2.0}, section
 417 3.2.28.
 418 @end table
 419
 420
 421
 422 @node acc_is_present
 423 @section @code{acc_is_present}
 424 @table @asis
 425 @item @emph{Reference}:
 426 @uref{http://www.openacc.org/, OpenACC specification v2.0}, section
 427 3.2.29.
 428 @end table
 429
 430
 431
 432 @node acc_memcpy_to_device
 433 @section @code{acc_memcpy_to_device}
 434 @table @asis
 435 @item @emph{Reference}:
 436 @uref{http://www.openacc.org/, OpenACC specification v2.0}, section
 437 3.2.30.
 438 @end table
 439
 440
 441
 442 @node acc_memcpy_from_device
 443 @section @code{acc_memcpy_from_device}
 444 @table @asis
 445 @item @emph{Reference}:
 446 @uref{http://www.openacc.org/, OpenACC specification v2.0}, section
 447 3.2.31.
 448 @end table
 449
 450
 451
 452 @node acc_get_current_cuda_device
 453 @section @code{acc_get_current_cuda_device}
 454 @table @asis
 455 @item @emph{Reference}:
 456 @uref{http://www.openacc.org/, OpenACC specification v2.0}, section
 457 A.2.1.1.
 458 @end table
 459
 460
 461
 462 @node acc_get_current_cuda_context
 463 @section @code{acc_get_current_cuda_context}
 464 @table @asis
 465 @item @emph{Reference}:
 466 @uref{http://www.openacc.org/, OpenACC specification v2.0}, section
 467 A.2.1.2.
 468 @end table
 469
 470
 471
 472 @node acc_get_cuda_stream
 473 @section @code{acc_get_cuda_stream}
 474 @table @asis
 475 @item @emph{Reference}:
 476 @uref{http://www.openacc.org/, OpenACC specification v2.0}, section
 477 A.2.1.3.
 478 @end table
 479
 480
 481
 482 @node acc_set_cuda_stream
 483 @section @code{acc_set_cuda_stream}
 484 @table @asis
 485 @item @emph{Reference}:
 486 @uref{http://www.openacc.org/, OpenACC specification v2.0}, section
 487 A.2.1.4.
 488 @end table
 489
 490
 491
 492 @c ---------------------------------------------------------------------
 493 @c OpenACC Environment Variables
 494 @c ---------------------------------------------------------------------
 495
 496 @node OpenACC Environment Variables
 497 @chapter OpenACC Environment Variables
 498
 499 The variables @env{ACC_DEVICE_TYPE} and @env{ACC_DEVICE_NUM}
 500 are defined by section 4 of the OpenACC specification in version 2.0.
 501 The variable @env{GCC_ACC_NOTIFY} is used for diagnostic purposes.
 502
 503 @menu
 504 * ACC_DEVICE_TYPE::
 505 * ACC_DEVICE_NUM::
 506 * GCC_ACC_NOTIFY::
 507 @end menu
 508
 509
 510
 511 @node ACC_DEVICE_TYPE
 512 @section @code{ACC_DEVICE_TYPE}
 513 @table @asis
 514 @item @emph{Reference}:
 515 @uref{http://www.openacc.org/, OpenACC specification v2.0}, section
 516 4.1.
 517 @end table
 518
 519
 520
 521 @node ACC_DEVICE_NUM
 522 @section @code{ACC_DEVICE_NUM}
 523 @table @asis
 524 @item @emph{Reference}:
 525 @uref{http://www.openacc.org/, OpenACC specification v2.0}, section
 526 4.2.
 527 @end table
 528
 529
 530
 531 @node GCC_ACC_NOTIFY
 532 @section @code{GCC_ACC_NOTIFY}
 533 @table @asis
 534 @item @emph{Description}:
 535 Print debug information pertaining to the accelerator.
 536 @end table
 537
 538
 539 @c ---------------------------------------------------------------------
 540 @c OpenACC Library Interoperability
 541 @c ---------------------------------------------------------------------
 542
 543 @node OpenACC Library Interoperability
 544 @chapter OpenACC Library Interoperability
 545
 546 @section Introduction
 547
 548 As the OpenACC library is built using the CUDA Driver API, the question
 549 arises of what impact using the OpenACC library has on a program that
 550 uses the CUDA Runtime library, or a library based on it, e.g.,
 551 CUBLAS@footnote{See section 2.26, "Interactions with the CUDA Driver API" in
 552 "CUDA Runtime API", Version 5.5, July 2013 and section 2.27, "VDPAU
 553 Interoperability", in "CUDA Driver API", TRM-06703-001, Version 5.5,
 554 July 2013, for additional information on library interoperability.}.
 555 This chapter describes the use cases and the changes
 556 required in order to use both the OpenACC library and the CUBLAS and Runtime
 557 libraries within a program.
 558
 559 @section First invocation: NVIDIA CUBLAS library API
 560
 561 In this first use case (see below), a function in the CUBLAS library,
 562 specifically @code{cublasCreate()}, is called prior to any of the
 563 functions in the OpenACC library.
 564
 565 When invoked, the function will initialize the library and allocate the
 566 hardware resources on the host and the device on behalf of the caller. Once
 567 the initialization and allocation have completed, a handle is returned to the
 568 caller. The OpenACC library also requires initialization and allocation of
 569 hardware resources. Since the CUBLAS library has already allocated the
 570 hardware resources for the device, all that is left to do is to initialize
 571 the OpenACC library and acquire the hardware resources on the host.
 572
 573 Prior to calling the OpenACC function that will initialize the library and
 574 allocate the host hardware resources, one needs to acquire the device number
 575 that was allocated during the call to @code{cublasCreate()}. Calling the
 576 CUDA Runtime library function @code{cudaGetDevice()} will accomplish this. Once
 577 acquired, the device number is passed along with the device type as
 578 parameters to the OpenACC library function @code{acc_set_device_num()}.
 579
 580 Once the call to @code{acc_set_device_num()} has completed, the OpenACC
 581 library will be using the context that was created during the call to
 582 @code{cublasCreate()}. In other words, both libraries will be sharing the
 583 same context.
 584
 585 @verbatim
 586     /* Create the handle */
 587     s = cublasCreate(&h);
 588     if (s != CUBLAS_STATUS_SUCCESS)
 589     {
 590         fprintf(stderr, "cublasCreate failed %d\n", s);
 591         exit(EXIT_FAILURE);
 592     }
 593
 594     /* Get the device number */
 595     e = cudaGetDevice(&dev);
 596     if (e != cudaSuccess)
 597     {
 598         fprintf(stderr, "cudaGetDevice failed %d\n", e);
 599         exit(EXIT_FAILURE);
 600     }
 601
 602     /* Initialize OpenACC library and use device 'dev' */
 603     acc_set_device_num(dev, acc_device_nvidia);
 604
 605 @end verbatim
 606 @center Use Case 1
 607
 608 @section First invocation: OpenACC library API
 609
 610 In this second use case (see below), a function in the OpenACC library,
 611 specifically @code{acc_set_device_num()}, is called prior to any of the
 612 functions in the CUBLAS library.
 613
 614 In the use case presented here, the function @code{acc_set_device_num()}
 615 is used to both initialize the OpenACC library and allocate the hardware
 616 resources on the host and the device. In the call to the function, the
 617 call parameters specify which device to use, i.e., @code{dev}, and what device
 618 type to use, i.e., @code{acc_device_nvidia}. It should be noted that this
 619 is but one method to initialize the OpenACC library and allocate the
 620 appropriate hardware resources. Other methods are available through the
 621 use of environment variables and these will be discussed in the next section.
 622
 623 Once the call to @code{acc_set_device_num()} has completed, other OpenACC
 624 functions can be called as seen with multiple calls being made to
 625 @code{acc_copyin()}. In addition, calls can be made to functions in the
 626 CUBLAS library. In the use case a call to @code{cublasCreate()} is made
 627 subsequent to the calls to @code{acc_copyin()}.
 628 As seen in the previous use case, a call to @code{cublasCreate()} will
 629 initialize the CUBLAS library and allocate the hardware resources on the
 630 host and the device.  However, since the device has already been allocated,
 631 @code{cublasCreate()} will only initialize the CUBLAS library and allocate
 632 the appropriate hardware resources on the host. The context that was created
 633 as part of the OpenACC initialization will be shared with the CUBLAS library,
 634 similarly to the first use case.
 635
 636 @verbatim
 637     dev = 0;
 638
 639     acc_set_device_num(dev, acc_device_nvidia);
 640
 641     /* Copy the first set to the device */
 642     d_X = acc_copyin(&h_X[0], N * sizeof (float));
 643     if (d_X == NULL)
 644     {
 645         fprintf(stderr, "copyin error h_X\n");
 646         exit(EXIT_FAILURE);
 647     }
 648
 649     /* Copy the second set to the device */
 650     d_Y = acc_copyin(&h_Y1[0], N * sizeof (float));
 651     if (d_Y == NULL)
 652     {
 653         fprintf(stderr, "copyin error h_Y1\n");
 654         exit(EXIT_FAILURE);
 655     }
 656
 657     /* Create the handle */
 658     s = cublasCreate(&h);
 659     if (s != CUBLAS_STATUS_SUCCESS)
 660     {
 661         fprintf(stderr, "cublasCreate failed %d\n", s);
 662         exit(EXIT_FAILURE);
 663     }
 664
 665     /* Perform saxpy using CUBLAS library function */
 666     s = cublasSaxpy(h, N, &alpha, d_X, 1, d_Y, 1);
 667     if (s != CUBLAS_STATUS_SUCCESS)
 668     {
 669         fprintf(stderr, "cublasSaxpy failed %d\n", s);
 670         exit(EXIT_FAILURE);
 671     }
 672
 673     /* Copy the results from the device */
 674     acc_memcpy_from_device(&h_Y1[0], d_Y, N * sizeof (float));
 675
 676 }
 677 @end verbatim
 678 @center Use Case 2
 679
 680 @section OpenACC library and environment variables
 681
 682 There are two environment variables associated with the OpenACC library that
 683 may be used to control the device type and device number.
 684 Namely, @env{ACC_DEVICE_TYPE} and @env{ACC_DEVICE_NUM}. In the second
 685 use case, the device type and device number were specified using
 686 @code{acc_set_device_num()}. However, @env{ACC_DEVICE_TYPE} and
 687 @env{ACC_DEVICE_NUM} could have been defined and the call to
 688 @code{acc_set_device_num()} would not be required. At the time of the
 689 call to @code{acc_copyin()}, these two environment variables would be
 690 sampled and their values used.
 691
 692 The use of the environment variables is only relevant when an OpenACC function
 693 is called prior to a call to @code{cublasCreate()}. If @code{cublasCreate()}
 694 is called prior to a call to an OpenACC function, then a call to
 695 @code{acc_set_device_num()} must be made@footnote{More complete information
 696 about @env{ACC_DEVICE_TYPE} and @env{ACC_DEVICE_NUM} can be found in
 697 sections 4.1 and 4.2 of ``The OpenACC
 698 Application Programming Interface'', Version 2.0, June 2013.}.
 699
 700
 701
 702 @c ---------------------------------------------------------------------
 703 @c Enabling OpenMP
 704 @c ---------------------------------------------------------------------
 705
 706 @node Enabling OpenMP
 707 @chapter Enabling OpenMP
 708
 709 To activate the OpenMP extensions for C/C++ and Fortran, the compile-time
 710 flag @command{-fopenmp} must be specified.  This enables the OpenMP directive
 711 @code{#pragma omp} in C/C++ and @code{!$omp} directives in free form,
 712 @code{c$omp}, @code{*$omp} and @code{!$omp} directives in fixed form,
 713 @code{!$} conditional compilation sentinels in free form and @code{c$},
 714 @code{*$} and @code{!$} sentinels in fixed form, for Fortran.  The flag also
 715 arranges for automatic linking of the OpenMP runtime library
 716 (@ref{Runtime Library Routines}).
 717
 718 A complete description of all OpenMP directives accepted may be found in
 719 the @uref{http://www.openmp.org, OpenMP Application Program Interface} manual,
 720 version 4.0.
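
A minimal sketch of code that benefits from @command{-fopenmp} is shown
below.  The function names @code{sum_below} and @code{built_with_openmp}
are hypothetical; the sketch relies on the @code{_OPENMP} macro, which the
OpenMP specification requires a conforming compiler to define when OpenMP
is enabled.  Without the flag the directive is ignored and the loop runs
serially with the same result.

```c
/* Hypothetical helper: sum the integers 0..n-1.  With -fopenmp the
   iterations are shared among threads and the partial sums are combined
   by the reduction clause; without it the loop is an ordinary serial loop.  */
long sum_below(long n)
{
    long total = 0;
#pragma omp parallel for reduction(+:total)
    for (long i = 0; i < n; i++)
        total += i;
    return total;
}

/* Hypothetical helper: 1 if this file was compiled with OpenMP enabled.  */
int built_with_openmp(void)
{
#ifdef _OPENMP
    return 1;
#else
    return 0;
#endif
}
```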
 721
 722
 723 @c ---------------------------------------------------------------------
 724 @c OpenMP Runtime Library Routines
 725 @c ---------------------------------------------------------------------
 726
 727 @node Runtime Library Routines
 728 @chapter OpenMP Runtime Library Routines
 729
 730 The runtime routines described here are defined by Section 3 of the OpenMP
 731 specification in version 4.0.  The routines are structured in the following
 732 three parts:
 733
 734 @menu
 735 Control threads, processors and the parallel environment.  They have C
 736 linkage, and do not throw exceptions.
 737
 738 * omp_get_active_level::        Number of active parallel regions
 739 * omp_get_ancestor_thread_num:: Ancestor thread ID
 740 * omp_get_cancellation::        Whether cancellation support is enabled
 741 * omp_get_default_device::      Get the default device for target regions
 742 * omp_get_dynamic::             Dynamic teams setting
 743 * omp_get_level::               Number of parallel regions
 744 * omp_get_max_active_levels::   Maximum number of active regions
 745 * omp_get_max_threads::         Maximum number of threads of parallel region
 746 * omp_get_nested::              Nested parallel regions
 747 * omp_get_num_devices::         Number of target devices
 748 * omp_get_num_procs::           Number of processors online
 749 * omp_get_num_teams::           Number of teams
 750 * omp_get_num_threads::         Size of the active team
 751 * omp_get_proc_bind::           Whether threads may be moved between CPUs
 752 * omp_get_schedule::            Obtain the runtime scheduling method
 753 * omp_get_team_num::            Get team number
 754 * omp_get_team_size::           Number of threads in a team
 755 * omp_get_thread_limit::        Maximum number of threads
 756 * omp_get_thread_num::          Current thread ID
 757 * omp_in_parallel::             Whether a parallel region is active
 758 * omp_in_final::                Whether in final or included task region
 759 * omp_is_initial_device::       Whether executing on the host device
 760 * omp_set_default_device::      Set the default device for target regions
 761 * omp_set_dynamic::             Enable/disable dynamic teams
 762 * omp_set_max_active_levels::   Limits the number of active parallel regions
 763 * omp_set_nested::              Enable/disable nested parallel regions
 764 * omp_set_num_threads::         Set upper team size limit
 765 * omp_set_schedule::            Set the runtime scheduling method
 766
 767 Initialize, set, test, unset and destroy simple and nested locks.
 768
 769 * omp_init_lock::            Initialize simple lock
 770 * omp_set_lock::             Wait for and set simple lock
 771 * omp_test_lock::            Test and set simple lock if available
 772 * omp_unset_lock::           Unset simple lock
 773 * omp_destroy_lock::         Destroy simple lock
 774 * omp_init_nest_lock::       Initialize nested lock
 775 * omp_set_nest_lock::        Wait for and set nested lock
 776 * omp_test_nest_lock::       Test and set nested lock if available
 777 * omp_unset_nest_lock::      Unset nested lock
 778 * omp_destroy_nest_lock::    Destroy nested lock
 779
 780 Portable, thread-based, wall clock timer.
 781
 782 * omp_get_wtick::            Get timer precision.
 783 * omp_get_wtime::            Elapsed wall clock time.
 784 @end menu
 785
 786
 787
 788 @node omp_get_active_level
 789 @section @code{omp_get_active_level} -- Number of active parallel regions
 790 @table @asis
 791 @item @emph{Description}:
 792 This function returns the nesting level of the active parallel blocks
 793 that enclose the call.
 794
 795 @item @emph{C/C++}:
 796 @multitable @columnfractions .20 .80
 797 @item @emph{Prototype}: @tab @code{int omp_get_active_level(void);}
 798 @end multitable
 799
 800 @item @emph{Fortran}:
 801 @multitable @columnfractions .20 .80
 802 @item @emph{Interface}: @tab @code{integer function omp_get_active_level()}
 803 @end multitable
 804
 805 @item @emph{See also}:
 806 @ref{omp_get_level}, @ref{omp_get_max_active_levels}, @ref{omp_set_max_active_levels}
 807
 808 @item @emph{Reference}:
 809 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.2.20.
 810 @end table
 811
 812
 813
 814 @node omp_get_ancestor_thread_num
 815 @section @code{omp_get_ancestor_thread_num} -- Ancestor thread ID
 816 @table @asis
 817 @item @emph{Description}:
 818 This function returns the thread identification number for the given
 819 nesting level of the current thread.  For values of @var{level} outside
 820 the range zero to @code{omp_get_level}, -1 is returned; if @var{level} is
 821 @code{omp_get_level}, the result is identical to @code{omp_get_thread_num}.
 822
 823 @item @emph{C/C++}:
 824 @multitable @columnfractions .20 .80
 825 @item @emph{Prototype}: @tab @code{int omp_get_ancestor_thread_num(int level);}
 826 @end multitable
 827
 828 @item @emph{Fortran}:
 829 @multitable @columnfractions .20 .80
 830 @item @emph{Interface}: @tab @code{integer function omp_get_ancestor_thread_num(level)}
 831 @item                   @tab @code{integer level}
 832 @end multitable
 833
 834 @item @emph{See also}:
 835 @ref{omp_get_level}, @ref{omp_get_thread_num}, @ref{omp_get_team_size}
 836
 837 @item @emph{Reference}:
 838 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.2.18.
 839 @end table
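
The rules above can be checked with a short sketch such as the following.
The helper name @code{ancestor_checks_hold} is hypothetical, and the
request for two threads is only an illustrative choice; the function
returns 1 without performing any check when compiled without
@command{-fopenmp}.

```c
#ifdef _OPENMP
#include <omp.h>   /* only needed when compiling with -fopenmp */
#endif

/* Hypothetical helper: returns 1 if every thread of a parallel region
   observes the documented behavior of omp_get_ancestor_thread_num.  */
int ancestor_checks_hold(void)
{
    int ok = 1;
#ifdef _OPENMP
#pragma omp parallel num_threads(2) reduction(&&:ok)
    {
        int level = omp_get_level();
        /* At the current level the ancestor is the calling thread itself.  */
        ok = ok && omp_get_ancestor_thread_num(level) == omp_get_thread_num();
        /* At level zero the ancestor is the initial thread, numbered 0.  */
        ok = ok && omp_get_ancestor_thread_num(0) == 0;
        /* A level beyond the current nesting depth yields -1.  */
        ok = ok && omp_get_ancestor_thread_num(level + 1) == -1;
    }
#endif
    return ok;
}
```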
 840
 841
 842
 843 @node omp_get_cancellation
 844 @section @code{omp_get_cancellation} -- Whether cancellation support is enabled
 845 @table @asis
 846 @item @emph{Description}:
 847 This function returns @code{true} if cancellation is activated, @code{false}
 848 otherwise.  Here, @code{true} and @code{false} represent their language-specific
 849 counterparts.  Unless @env{OMP_CANCELLATION} is set true, cancellations are
 850 deactivated.
 851
 852 @item @emph{C/C++}:
 853 @multitable @columnfractions .20 .80
 854 @item @emph{Prototype}: @tab @code{int omp_get_cancellation(void);}
 855 @end multitable
 856
 857 @item @emph{Fortran}:
 858 @multitable @columnfractions .20 .80
 859 @item @emph{Interface}: @tab @code{logical function omp_get_cancellation()}
 860 @end multitable
 861
 862 @item @emph{See also}:
 863 @ref{OMP_CANCELLATION}
 864
 865 @item @emph{Reference}:
 866 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.2.9.
 867 @end table
 868
 869
 870
 871 @node omp_get_default_device
 872 @section @code{omp_get_default_device} -- Get the default device for target regions
 873 @table @asis
 874 @item @emph{Description}:
 875 Get the default device for target regions without device clause.
 876
 877 @item @emph{C/C++}:
 878 @multitable @columnfractions .20 .80
 879 @item @emph{Prototype}: @tab @code{int omp_get_default_device(void);}
 880 @end multitable
 881
 882 @item @emph{Fortran}:
 883 @multitable @columnfractions .20 .80
 884 @item @emph{Interface}: @tab @code{integer function omp_get_default_device()}
 885 @end multitable
 886
 887 @item @emph{See also}:
 888 @ref{OMP_DEFAULT_DEVICE}, @ref{omp_set_default_device}
 889
 890 @item @emph{Reference}:
 891 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.2.24.
 892 @end table
 893
 894
 895
 896 @node omp_get_dynamic
 897 @section @code{omp_get_dynamic} -- Dynamic teams setting
 898 @table @asis
 899 @item @emph{Description}:
 900 This function returns @code{true} if dynamic adjustment of the number of threads is enabled, @code{false} otherwise.
 901 Here, @code{true} and @code{false} represent their language-specific
 902 counterparts.
 903
 904 The dynamic team setting may be initialized at startup by the
 905 @env{OMP_DYNAMIC} environment variable or at runtime using
 906 @code{omp_set_dynamic}.  If undefined, dynamic adjustment is
 907 disabled by default.
 908
 909 @item @emph{C/C++}:
 910 @multitable @columnfractions .20 .80
 911 @item @emph{Prototype}: @tab @code{int omp_get_dynamic(void);}
 912 @end multitable
 913
 914 @item @emph{Fortran}:
 915 @multitable @columnfractions .20 .80
 916 @item @emph{Interface}: @tab @code{logical function omp_get_dynamic()}
 917 @end multitable
 918
 919 @item @emph{See also}:
 920 @ref{omp_set_dynamic}, @ref{OMP_DYNAMIC}
 921
 922 @item @emph{Reference}:
 923 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.2.8.
 924 @end table
 925
 926
 927
 928 @node omp_get_level
 929 @section @code{omp_get_level} -- Obtain the current nesting level
 930 @table @asis
 931 @item @emph{Description}:
 932 This function returns the nesting level of the parallel regions
 933 that enclose the call.
 934
 935 @item @emph{C/C++}:
 936 @multitable @columnfractions .20 .80
 937 @item @emph{Prototype}: @tab @code{int omp_get_level(void);}
 938 @end multitable
 939
 940 @item @emph{Fortran}:
 941 @multitable @columnfractions .20 .80
 942 @item @emph{Interface}: @tab @code{integer function omp_get_level()}
 943 @end multitable
 944
 945 @item @emph{See also}:
 946 @ref{omp_get_active_level}
 947
 948 @item @emph{Reference}:
 949 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.2.17.
 950 @end table
 951
 952
 953
 954 @node omp_get_max_active_levels
 955 @section @code{omp_get_max_active_levels} -- Maximum number of active regions
 956 @table @asis
 957 @item @emph{Description}:
 958 This function obtains the maximum allowed number of nested, active parallel regions.
 959
 960 @item @emph{C/C++}:
 961 @multitable @columnfractions .20 .80
 962 @item @emph{Prototype}: @tab @code{int omp_get_max_active_levels(void);}
 963 @end multitable
 964
 965 @item @emph{Fortran}:
 966 @multitable @columnfractions .20 .80
 967 @item @emph{Interface}: @tab @code{integer function omp_get_max_active_levels()}
 968 @end multitable
 969
 970 @item @emph{See also}:
 971 @ref{omp_set_max_active_levels}, @ref{omp_get_active_level}
 972
 973 @item @emph{Reference}:
 974 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.2.16.
 975 @end table
 976
 977
 978
 979 @node omp_get_max_threads
 980 @section @code{omp_get_max_threads} -- Maximum number of threads of parallel region
 981 @table @asis
 982 @item @emph{Description}:
 983 Return the maximum number of threads that would be used to form a new team
 984 for a parallel region without a @code{num_threads} clause.
 985
 986 @item @emph{C/C++}:
 987 @multitable @columnfractions .20 .80
 988 @item @emph{Prototype}: @tab @code{int omp_get_max_threads(void);}
 989 @end multitable
 990
 991 @item @emph{Fortran}:
 992 @multitable @columnfractions .20 .80
 993 @item @emph{Interface}: @tab @code{integer function omp_get_max_threads()}
 994 @end multitable
 995
 996 @item @emph{See also}:
 997 @ref{omp_set_num_threads}, @ref{omp_set_dynamic}, @ref{omp_get_thread_limit}
 998
 999 @item @emph{Reference}:
1000 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.2.3.
1001 @end table
1002
1003
1004
1005 @node omp_get_nested
1006 @section @code{omp_get_nested} -- Nested parallel regions
1007 @table @asis
1008 @item @emph{Description}:
1009 This function returns @code{true} if nested parallel regions are
1010 enabled, @code{false} otherwise.  Here, @code{true} and @code{false}
1011 represent their language-specific counterparts.
1012
1013 Nested parallel regions may be initialized at startup by the
1014 @env{OMP_NESTED} environment variable or at runtime using
1015 @code{omp_set_nested}.  If undefined, nested parallel regions are
1016 disabled by default.
1017
1018 @item @emph{C/C++}:
1019 @multitable @columnfractions .20 .80
1020 @item @emph{Prototype}: @tab @code{int omp_get_nested(void);}
1021 @end multitable
1022
1023 @item @emph{Fortran}:
1024 @multitable @columnfractions .20 .80
1025 @item @emph{Interface}: @tab @code{logical function omp_get_nested()}
1026 @end multitable
1027
1028 @item @emph{See also}:
1029 @ref{omp_set_nested}, @ref{OMP_NESTED}
1030
1031 @item @emph{Reference}:
1032 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.2.11.
1033 @end table
1034
1035
1036
1037 @node omp_get_num_devices
1038 @section @code{omp_get_num_devices} -- Number of target devices
1039 @table @asis
1040 @item @emph{Description}:
1041 Returns the number of target devices.
1042
1043 @item @emph{C/C++}:
1044 @multitable @columnfractions .20 .80
1045 @item @emph{Prototype}: @tab @code{int omp_get_num_devices(void);}
1046 @end multitable
1047
1048 @item @emph{Fortran}:
1049 @multitable @columnfractions .20 .80
1050 @item @emph{Interface}: @tab @code{integer function omp_get_num_devices()}
1051 @end multitable
1052
1053 @item @emph{Reference}:
1054 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.2.25.
1055 @end table
1056
1057
1058
1059 @node omp_get_num_procs
1060 @section @code{omp_get_num_procs} -- Number of processors online
1061 @table @asis
1062 @item @emph{Description}:
1063 Returns the number of processors online on the current device.
1064
1065 @item @emph{C/C++}:
1066 @multitable @columnfractions .20 .80
1067 @item @emph{Prototype}: @tab @code{int omp_get_num_procs(void);}
1068 @end multitable
1069
1070 @item @emph{Fortran}:
1071 @multitable @columnfractions .20 .80
1072 @item @emph{Interface}: @tab @code{integer function omp_get_num_procs()}
1073 @end multitable
1074
1075 @item @emph{Reference}:
1076 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.2.5.
1077 @end table
1078
1079
1080
1081 @node omp_get_num_teams
1082 @section @code{omp_get_num_teams} -- Number of teams
1083 @table @asis
1084 @item @emph{Description}:
1085 Returns the number of teams in the current teams region.
1086
1087 @item @emph{C/C++}:
1088 @multitable @columnfractions .20 .80
1089 @item @emph{Prototype}: @tab @code{int omp_get_num_teams(void);}
1090 @end multitable
1091
1092 @item @emph{Fortran}:
1093 @multitable @columnfractions .20 .80
1094 @item @emph{Interface}: @tab @code{integer function omp_get_num_teams()}
1095 @end multitable
1096
1097 @item @emph{Reference}:
1098 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.2.26.
1099 @end table
1100
1101
1102
1103 @node omp_get_num_threads
1104 @section @code{omp_get_num_threads} -- Size of the active team
1105 @table @asis
1106 @item @emph{Description}:
1107 Returns the number of threads in the current team.  In a sequential section of
1108 the program @code{omp_get_num_threads} returns 1.
1109
1110 The default team size may be initialized at startup by the
1111 @env{OMP_NUM_THREADS} environment variable.  At runtime, the size
1112 of the current team may be set either by the @code{NUM_THREADS}
1113 clause or by @code{omp_set_num_threads}.  If none of the above were
1114 used to define a specific value and @env{OMP_DYNAMIC} is disabled,
1115 one thread per CPU online is used.
1116
1117 @item @emph{C/C++}:
1118 @multitable @columnfractions .20 .80
1119 @item @emph{Prototype}: @tab @code{int omp_get_num_threads(void);}
1120 @end multitable
1121
1122 @item @emph{Fortran}:
1123 @multitable @columnfractions .20 .80
1124 @item @emph{Interface}: @tab @code{integer function omp_get_num_threads()}
1125 @end multitable
1126
1127 @item @emph{See also}:
1128 @ref{omp_get_max_threads}, @ref{omp_set_num_threads}, @ref{OMP_NUM_THREADS}
1129
1130 @item @emph{Reference}:
1131 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.2.2.
1132 @end table
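
As a rough illustration of the two cases described above, the following C
sketch reads the team size once from a sequential part and once inside a
parallel region.  The helper names @code{team_size_sequential} and
@code{counted_matches_reported} are hypothetical, and the @code{#else} branch
is a serial stand-in (an assumption, not part of libgomp) so that the sketch
also builds without @option{-fopenmp}.

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial stand-in (assumption): without -fopenmp there is one thread.  */
static int omp_get_num_threads(void) { return 1; }
#endif

/* Team size as seen from a sequential part of the program.  */
int team_size_sequential(void)
{
    return omp_get_num_threads();
}

/* Count the threads that actually enter a parallel region and compare
   the count with what omp_get_num_threads reports inside the region.
   Returns 1 if both agree, 0 otherwise.  */
int counted_matches_reported(void)
{
    int counted = 0;
    int reported = 0;

#pragma omp parallel
    {
#pragma omp atomic
        counted++;

#pragma omp single
        reported = omp_get_num_threads();
    }
    return counted == reported;
}
```

Called from the initial thread, @code{team_size_sequential} is expected to
yield 1, whatever value @env{OMP_NUM_THREADS} has.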
1133
1134
1135
1136 @node omp_get_proc_bind
1137 @section @code{omp_get_proc_bind} -- Whether threads may be moved between CPUs
1138 @table @asis
1139 @item @emph{Description}:
1140 This function returns the currently active thread affinity policy, which is
1141 set via @env{OMP_PROC_BIND}.  Possible values are @code{omp_proc_bind_false},
1142 @code{omp_proc_bind_true}, @code{omp_proc_bind_master},
1143 @code{omp_proc_bind_close} and @code{omp_proc_bind_spread}.
1144
1145 @item @emph{C/C++}:
1146 @multitable @columnfractions .20 .80
1147 @item @emph{Prototype}: @tab @code{omp_proc_bind_t omp_get_proc_bind(void);}
1148 @end multitable
1149
1150 @item @emph{Fortran}:
1151 @multitable @columnfractions .20 .80
1152 @item @emph{Interface}: @tab @code{integer(kind=omp_proc_bind_kind) function omp_get_proc_bind()}
1153 @end multitable
1154
1155 @item @emph{See also}:
1156 @ref{OMP_PROC_BIND}, @ref{OMP_PLACES}, @ref{GOMP_CPU_AFFINITY}
1157
1158 @item @emph{Reference}:
1159 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.2.22.
1160 @end table
1161
1162
1163
1164 @node omp_get_schedule
1165 @section @code{omp_get_schedule} -- Obtain the runtime scheduling method
1166 @table @asis
1167 @item @emph{Description}:
1168 Obtain the runtime scheduling method.  The @var{kind} argument will be
1169 set to the value @code{omp_sched_static}, @code{omp_sched_dynamic},
1170 @code{omp_sched_guided} or @code{omp_sched_auto}.  The second argument,
1171 @var{modifier}, is set to the chunk size.
1172
1173 @item @emph{C/C++}:
1174 @multitable @columnfractions .20 .80
1175 @item @emph{Prototype}: @tab @code{void omp_get_schedule(omp_sched_t *kind, int *modifier);}
1176 @end multitable
1177
1178 @item @emph{Fortran}:
1179 @multitable @columnfractions .20 .80
1180 @item @emph{Interface}: @tab @code{subroutine omp_get_schedule(kind, modifier)}
1181 @item                   @tab @code{integer(kind=omp_sched_kind) kind}
1182 @item                   @tab @code{integer modifier}
1183 @end multitable
1184
1185 @item @emph{See also}:
1186 @ref{omp_set_schedule}, @ref{OMP_SCHEDULE}
1187
1188 @item @emph{Reference}:
1189 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.2.13.
1190 @end table
1191
1192
1193
1194 @node omp_get_team_num
1195 @section @code{omp_get_team_num} -- Get team number
1196 @table @asis
1197 @item @emph{Description}:
1198 Returns the team number of the calling thread.
1199
1200 @item @emph{C/C++}:
1201 @multitable @columnfractions .20 .80
1202 @item @emph{Prototype}: @tab @code{int omp_get_team_num(void);}
1203 @end multitable
1204
1205 @item @emph{Fortran}:
1206 @multitable @columnfractions .20 .80
1207 @item @emph{Interface}: @tab @code{integer function omp_get_team_num()}
1208 @end multitable
1209
1210 @item @emph{Reference}:
1211 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.2.27.
1212 @end table
1213
1214
1215
1216 @node omp_get_team_size
1217 @section @code{omp_get_team_size} -- Number of threads in a team
1218 @table @asis
1219 @item @emph{Description}:
1220 This function returns the number of threads in a thread team to which
1221 either the current thread or its ancestor belongs.  For values of @var{level}
1222 outside zero to @code{omp_get_level}, -1 is returned; if @var{level} is zero,
1223 1 is returned, and for @code{omp_get_level}, the result is identical
1224 to @code{omp_get_num_threads}.
1225
1226 @item @emph{C/C++}:
1227 @multitable @columnfractions .20 .80
1228 @item @emph{Prototype}: @tab @code{int omp_get_team_size(int level);}
1229 @end multitable
1230
1231 @item @emph{Fortran}:
1232 @multitable @columnfractions .20 .80
1233 @item @emph{Interface}: @tab @code{integer function omp_get_team_size(level)}
1234 @item                   @tab @code{integer level}
1235 @end multitable
1236
1237 @item @emph{See also}:
1238 @ref{omp_get_num_threads}, @ref{omp_get_level}, @ref{omp_get_ancestor_thread_num}
1239
1240 @item @emph{Reference}:
1241 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.2.19.
1242 @end table
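
The rules in the description can be checked with a small C sketch such as the
one below.  The function name @code{team_size_rules_hold} is hypothetical, and
the @code{#else} branch provides serial stand-ins (an assumption made here so
that the sketch builds without @option{-fopenmp}).

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial stand-ins (assumption): one thread, no enclosing parallel region.  */
static int omp_get_level(void) { return 0; }
static int omp_get_num_threads(void) { return 1; }
static int omp_get_team_size(int level) { return level == 0 ? 1 : -1; }
#endif

/* Returns 1 if every thread of a parallel region observes the rules
   stated in the description, 0 otherwise.  */
int team_size_rules_hold(void)
{
    int ok = 1;

#pragma omp parallel reduction(&& : ok)
    {
        int level = omp_get_level();

        /* Level zero is the initial, single-threaded team.  */
        ok = ok && (omp_get_team_size(0) == 1);
        /* The current level matches omp_get_num_threads.  */
        ok = ok && (omp_get_team_size(level) == omp_get_num_threads());
        /* Levels outside 0..omp_get_level() yield -1.  */
        ok = ok && (omp_get_team_size(-1) == -1);
        ok = ok && (omp_get_team_size(level + 1) == -1);
    }
    return ok;
}
```

The sketch assumes it is called from the initial thread, outside any parallel
region.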
1243
1244
1245
1246 @node omp_get_thread_limit
1247 @section @code{omp_get_thread_limit} -- Maximum number of threads
1248 @table @asis
1249 @item @emph{Description}:
1250 Return the maximum number of threads of the program.
1251
1252 @item @emph{C/C++}:
1253 @multitable @columnfractions .20 .80
1254 @item @emph{Prototype}: @tab @code{int omp_get_thread_limit(void);}
1255 @end multitable
1256
1257 @item @emph{Fortran}:
1258 @multitable @columnfractions .20 .80
1259 @item @emph{Interface}: @tab @code{integer function omp_get_thread_limit()}
1260 @end multitable
1261
1262 @item @emph{See also}:
1263 @ref{omp_get_max_threads}, @ref{OMP_THREAD_LIMIT}
1264
1265 @item @emph{Reference}:
1266 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.2.14.
1267 @end table
1268
1269
1270
1271 @node omp_get_thread_num
1272 @section @code{omp_get_thread_num} -- Current thread ID
1273 @table @asis
1274 @item @emph{Description}:
1275 Returns a unique thread identification number within the current team.
1276 In sequential parts of the program, @code{omp_get_thread_num}
1277 always returns 0.  In parallel regions the return value varies
1278 from 0 to @code{omp_get_num_threads}-1 inclusive.  The return
1279 value of the master thread of a team is always 0.
1280
1281 @item @emph{C/C++}:
1282 @multitable @columnfractions .20 .80
1283 @item @emph{Prototype}: @tab @code{int omp_get_thread_num(void);}
1284 @end multitable
1285
1286 @item @emph{Fortran}:
1287 @multitable @columnfractions .20 .80
1288 @item @emph{Interface}: @tab @code{integer function omp_get_thread_num()}
1289 @end multitable
1290
1291 @item @emph{See also}:
1292 @ref{omp_get_num_threads}, @ref{omp_get_ancestor_thread_num}
1293
1294 @item @emph{Reference}:
1295 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.2.4.
1296 @end table
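
A minimal C sketch of the numbering described above follows.  It records which
IDs were reported inside one parallel region and checks that each ID from 0 to
the team size minus one appears exactly once.  The names @code{MAX_TEAM},
@code{thread_ids_are_unique} and @code{sequential_thread_id} are hypothetical,
and the @code{#else} branch is a serial stand-in (an assumption) for builds
without @option{-fopenmp}.

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial stand-ins (assumption): a single thread with ID 0.  */
static int omp_get_thread_num(void) { return 0; }
static int omp_get_num_threads(void) { return 1; }
#endif

#define MAX_TEAM 4  /* hypothetical upper bound used only by this sketch */

/* Returns 1 if each ID in 0..team_size-1 was reported exactly once.  */
int thread_ids_are_unique(void)
{
    int seen[MAX_TEAM] = { 0 };
    int team_size = 0;

#pragma omp parallel num_threads(MAX_TEAM)
    {
        int id = omp_get_thread_num();

        seen[id]++;  /* each thread touches only its own slot */

#pragma omp single
        team_size = omp_get_num_threads();
    }

    if (team_size < 1 || team_size > MAX_TEAM)
        return 0;
    for (int id = 0; id < team_size; id++)
        if (seen[id] != 1)
            return 0;
    return 1;
}

/* In a sequential part the ID is always 0.  */
int sequential_thread_id(void)
{
    return omp_get_thread_num();
}
```

The @code{num_threads} clause is used here only to bound the size of the
@code{seen} array; the runtime may still provide fewer threads.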
1297
1298
1299
1300 @node omp_in_parallel
1301 @section @code{omp_in_parallel} -- Whether a parallel region is active
1302 @table @asis
1303 @item @emph{Description}:
1304 This function returns @code{true} if currently running in parallel,
1305 @code{false} otherwise.  Here, @code{true} and @code{false} represent
1306 their language-specific counterparts.
1307
1308 @item @emph{C/C++}:
1309 @multitable @columnfractions .20 .80
1310 @item @emph{Prototype}: @tab @code{int omp_in_parallel(void);}
1311 @end multitable
1312
1313 @item @emph{Fortran}:
1314 @multitable @columnfractions .20 .80
1315 @item @emph{Interface}: @tab @code{logical function omp_in_parallel()}
1316 @end multitable
1317
1318 @item @emph{Reference}:
1319 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.2.6.
1320 @end table
1321
1322
1323 @node omp_in_final
1324 @section @code{omp_in_final} -- Whether in final or included task region
1325 @table @asis
1326 @item @emph{Description}:
1327 This function returns @code{true} if currently running in a final
1328 or included task region, @code{false} otherwise.  Here, @code{true}
1329 and @code{false} represent their language-specific counterparts.
1330
1331 @item @emph{C/C++}:
1332 @multitable @columnfractions .20 .80
1333 @item @emph{Prototype}: @tab @code{int omp_in_final(void);}
1334 @end multitable
1335
1336 @item @emph{Fortran}:
1337 @multitable @columnfractions .20 .80
1338 @item @emph{Interface}: @tab @code{logical function omp_in_final()}
1339 @end multitable
1340
1341 @item @emph{Reference}:
1342 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.2.21.
1343 @end table
1344
1345
1346
1347 @node omp_is_initial_device
1348 @section @code{omp_is_initial_device} -- Whether executing on the host device
1349 @table @asis
1350 @item @emph{Description}:
1351 This function returns @code{true} if currently running on the host device,
1352 @code{false} otherwise.  Here, @code{true} and @code{false} represent
1353 their language-specific counterparts.
1354
1355 @item @emph{C/C++}:
1356 @multitable @columnfractions .20 .80
1357 @item @emph{Prototype}: @tab @code{int omp_is_initial_device(void);}
1358 @end multitable
1359
1360 @item @emph{Fortran}:
1361 @multitable @columnfractions .20 .80
1362 @item @emph{Interface}: @tab @code{logical function omp_is_initial_device()}
1363 @end multitable
1364
1365 @item @emph{Reference}:
1366 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.2.28.
1367 @end table
1368
1369
1370
1371 @node omp_set_default_device
1372 @section @code{omp_set_default_device} -- Set the default device for target regions
1373 @table @asis
1374 @item @emph{Description}:
1375 Set the default device for target regions without a device clause.  The argument
1376 shall be a nonnegative device number.
1377
1378 @item @emph{C/C++}:
1379 @multitable @columnfractions .20 .80
1380 @item @emph{Prototype}: @tab @code{void omp_set_default_device(int device_num);}
1381 @end multitable
1382
1383 @item @emph{Fortran}:
1384 @multitable @columnfractions .20 .80
1385 @item @emph{Interface}: @tab @code{subroutine omp_set_default_device(device_num)}
1386 @item                   @tab @code{integer device_num}
1387 @end multitable
1388
1389 @item @emph{See also}:
1390 @ref{OMP_DEFAULT_DEVICE}, @ref{omp_get_default_device}
1391
1392 @item @emph{Reference}:
1393 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.2.23.
1394 @end table
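
The following C sketch shows one way to pair this routine with
@code{omp_get_default_device}: it changes the default device, reads the
setting back, and restores the previous value.  The helper name
@code{default_device_round_trip} is hypothetical, and the @code{#else} branch
is a serial stand-in (an assumption) for builds without @option{-fopenmp}.

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial stand-ins (assumption): remember the value in a plain variable.  */
static int stub_default_device = 0;
static void omp_set_default_device(int device_num)
{
    stub_default_device = device_num;
}
static int omp_get_default_device(void) { return stub_default_device; }
#endif

/* Select DEVICE_NUM as the default device, read the setting back, and
   restore the previous default before returning the value that was read.  */
int default_device_round_trip(int device_num)
{
    int previous = omp_get_default_device();
    int observed;

    omp_set_default_device(device_num);
    observed = omp_get_default_device();
    omp_set_default_device(previous);
    return observed;
}
```

Restoring the previous value keeps the change local to the helper, which may
matter when library code should not alter settings chosen by its caller.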
1395
1396
1397
1398 @node omp_set_dynamic
1399 @section @code{omp_set_dynamic} -- Enable/disable dynamic teams
1400 @table @asis
1401 @item @emph{Description}:
1402 Enable or disable the dynamic adjustment of the number of threads
1403 within a team.  The function takes the language-specific equivalent
1404 of @code{true} and @code{false}, where @code{true} enables dynamic
1405 adjustment of team sizes and @code{false} disables it.
1406
1407 @item @emph{C/C++}:
1408 @multitable @columnfractions .20 .80
1409 @item @emph{Prototype}: @tab @code{void omp_set_dynamic(int dynamic_threads);}
1410 @end multitable
1411
1412 @item @emph{Fortran}:
1413 @multitable @columnfractions .20 .80
1414 @item @emph{Interface}: @tab @code{subroutine omp_set_dynamic(dynamic_threads)}
1415 @item                   @tab @code{logical, intent(in) :: dynamic_threads}
1416 @end multitable
1417
1418 @item @emph{See also}:
1419 @ref{OMP_DYNAMIC}, @ref{omp_get_dynamic}
1420
1421 @item @emph{Reference}:
1422 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.2.7.
1423 @end table
1424
1425
1426
1427 @node omp_set_max_active_levels
1428 @section @code{omp_set_max_active_levels} -- Limits the number of active parallel regions
1429 @table @asis
1430 @item @emph{Description}:
1431 This function limits the maximum allowed number of nested, active
1432 parallel regions.
1433
1434 @item @emph{C/C++}:
1435 @multitable @columnfractions .20 .80
1436 @item @emph{Prototype}: @tab @code{void omp_set_max_active_levels(int max_levels);}
1437 @end multitable
1438
1439 @item @emph{Fortran}:
1440 @multitable @columnfractions .20 .80
1441 @item @emph{Interface}: @tab @code{subroutine omp_set_max_active_levels(max_levels)}
1442 @item                   @tab @code{integer max_levels}
1443 @end multitable
1444
1445 @item @emph{See also}:
1446 @ref{omp_get_max_active_levels}, @ref{omp_get_active_level}
1447
1448 @item @emph{Reference}:
1449 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.2.15.
1450 @end table
1451
1452
1453
1454 @node omp_set_nested
1455 @section @code{omp_set_nested} -- Enable/disable nested parallel regions
1456 @table @asis
1457 @item @emph{Description}:
1458 Enable or disable nested parallel regions, i.e., whether team members
1459 are allowed to create new teams.  The function takes the language-specific
1460 equivalent of @code{true} and @code{false}, where @code{true} enables
1461 nested parallel regions and @code{false} disables them.
1462
1463 @item @emph{C/C++}:
1464 @multitable @columnfractions .20 .80
1465 @item @emph{Prototype}: @tab @code{void omp_set_nested(int nested);}
1466 @end multitable
1467
1468 @item @emph{Fortran}:
1469 @multitable @columnfractions .20 .80
1470 @item @emph{Interface}: @tab @code{subroutine omp_set_nested(nested)}
1471 @item                   @tab @code{logical, intent(in) :: nested}
1472 @end multitable
1473
1474 @item @emph{See also}:
1475 @ref{OMP_NESTED}, @ref{omp_get_nested}
1476
1477 @item @emph{Reference}:
1478 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.2.10.
1479 @end table
1480
1481
1482
1483 @node omp_set_num_threads
1484 @section @code{omp_set_num_threads} -- Set upper team size limit
1485 @table @asis
1486 @item @emph{Description}:
1487 Specifies the number of threads used by default in subsequent parallel
1488 regions, if those do not specify a @code{num_threads} clause.  The
1489 argument of @code{omp_set_num_threads} shall be a positive integer.
1490
1491 @item @emph{C/C++}:
1492 @multitable @columnfractions .20 .80
1493 @item @emph{Prototype}: @tab @code{void omp_set_num_threads(int num_threads);}
1494 @end multitable
1495
1496 @item @emph{Fortran}:
1497 @multitable @columnfractions .20 .80
1498 @item @emph{Interface}: @tab @code{subroutine omp_set_num_threads(num_threads)}
1499 @item                   @tab @code{integer, intent(in) :: num_threads}
1500 @end multitable
1501
1502 @item @emph{See also}:
1503 @ref{OMP_NUM_THREADS}, @ref{omp_get_num_threads}, @ref{omp_get_max_threads}
1504
1505 @item @emph{Reference}:
1506 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.2.1.
1507 @end table
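
As a sketch of how the request interacts with a later parallel region, the C
code below sets the number of threads, runs one region without a
@code{num_threads} clause, and reports the team size that was actually
obtained.  The helper name @code{team_size_after_request} is hypothetical, and
the @code{#else} branch is a serial stand-in (an assumption) for builds
without @option{-fopenmp}.

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial stand-ins (assumption): requests are ignored, one thread runs.  */
static void omp_set_num_threads(int num_threads) { (void) num_threads; }
static int omp_get_max_threads(void) { return 1; }
static int omp_get_num_threads(void) { return 1; }
#endif

/* Request REQUESTED threads for later parallel regions, run one region
   without a num_threads clause, and return the team size it actually got.
   The previous setting is restored before returning.  */
int team_size_after_request(int requested)
{
    int previous = omp_get_max_threads();
    int team_size = 0;

    omp_set_num_threads(requested);

#pragma omp parallel
    {
#pragma omp single
        team_size = omp_get_num_threads();
    }

    omp_set_num_threads(previous);
    return team_size;
}
```

The value returned may be smaller than the request, for example when dynamic
adjustment is enabled or a thread limit applies, so callers should treat the
request as an upper bound.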
1508
1509
1510
1511 @node omp_set_schedule
1512 @section @code{omp_set_schedule} -- Set the runtime scheduling method
1513 @table @asis
1514 @item @emph{Description}:
1515 Sets the runtime scheduling method.  The @var{kind} argument can have the
1516 value @code{omp_sched_static}, @code{omp_sched_dynamic},
1517 @code{omp_sched_guided} or @code{omp_sched_auto}.  Except for
1518 @code{omp_sched_auto}, the chunk size is set to the value of
1519 @var{modifier} if positive, or to the default value if zero or negative.
1520 For @code{omp_sched_auto} the @var{modifier} argument is ignored.
1521
1522 @item @emph{C/C++}:
1523 @multitable @columnfractions .20 .80
1524 @item @emph{Prototype}: @tab @code{void omp_set_schedule(omp_sched_t kind, int modifier);}
1525 @end multitable
1526
1527 @item @emph{Fortran}:
1528 @multitable @columnfractions .20 .80
1529 @item @emph{Interface}: @tab @code{subroutine omp_set_schedule(kind, modifier)}
1530 @item                   @tab @code{integer(kind=omp_sched_kind) kind}
1531 @item                   @tab @code{integer modifier}
1532 @end multitable
1533
1534 @item @emph{See also}:
1535 @ref{omp_get_schedule},
1536 @ref{OMP_SCHEDULE}
1537
1538 @item @emph{Reference}:
1539 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.2.12.
1540 @end table
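
The C sketch below illustrates a set-and-read-back sequence using
@code{omp_set_schedule} together with @code{omp_get_schedule}.  The helper
name @code{dynamic_schedule_round_trip} is hypothetical, and the @code{#else}
branch is a serial stand-in (an assumption, including its reduced
@code{omp_sched_t} type) for builds without @option{-fopenmp}.

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial stand-ins (assumption): keep the setting in plain variables.  */
typedef enum { omp_sched_static = 1, omp_sched_dynamic = 2 } omp_sched_t;
static omp_sched_t stub_kind = omp_sched_static;
static int stub_modifier = 0;
static void omp_set_schedule(omp_sched_t kind, int modifier)
{
    stub_kind = kind;
    stub_modifier = modifier;
}
static void omp_get_schedule(omp_sched_t *kind, int *modifier)
{
    *kind = stub_kind;
    *modifier = stub_modifier;
}
#endif

/* Select dynamic scheduling with the given positive chunk size, read the
   setting back, and restore the previous one.  Returns 1 if the values
   read back match the request, 0 otherwise.  */
int dynamic_schedule_round_trip(int chunk)
{
    omp_sched_t old_kind, kind;
    int old_modifier, modifier;

    omp_get_schedule(&old_kind, &old_modifier);

    omp_set_schedule(omp_sched_dynamic, chunk);
    omp_get_schedule(&kind, &modifier);

    omp_set_schedule(old_kind, old_modifier);
    return kind == omp_sched_dynamic && modifier == chunk;
}
```

The setting only affects loops that use the @code{schedule(runtime)} clause;
loops with an explicit schedule are not changed by it.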
1541
1542
1543
1544 @node omp_init_lock
1545 @section @code{omp_init_lock} -- Initialize simple lock
1546 @table @asis
1547 @item @emph{Description}:
1548 Initialize a simple lock.  After initialization, the lock is in
1549 an unlocked state.
1550
1551 @item @emph{C/C++}:
1552 @multitable @columnfractions .20 .80
1553 @item @emph{Prototype}: @tab @code{void omp_init_lock(omp_lock_t *lock);}
1554 @end multitable
1555
1556 @item @emph{Fortran}:
1557 @multitable @columnfractions .20 .80
1558 @item @emph{Interface}: @tab @code{subroutine omp_init_lock(svar)}
1559 @item                   @tab @code{integer(omp_lock_kind), intent(out) :: svar}
1560 @end multitable
1561
1562 @item @emph{See also}:
1563 @ref{omp_destroy_lock}
1564
1565 @item @emph{Reference}:
1566 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.3.1.
1567 @end table
1568
1569
1570
1571 @node omp_set_lock
1572 @section @code{omp_set_lock} -- Wait for and set simple lock
1573 @table @asis
1574 @item @emph{Description}:
1575 Before setting a simple lock, the lock variable must be initialized by
1576 @code{omp_init_lock}.  The calling thread is blocked until the lock
1577 is available.  If the lock is already held by the current thread,
1578 a deadlock occurs.
1579
1580 @item @emph{C/C++}:
1581 @multitable @columnfractions .20 .80
1582 @item @emph{Prototype}: @tab @code{void omp_set_lock(omp_lock_t *lock);}
1583 @end multitable
1584
1585 @item @emph{Fortran}:
1586 @multitable @columnfractions .20 .80
1587 @item @emph{Interface}: @tab @code{subroutine omp_set_lock(svar)}
1588 @item                   @tab @code{integer(omp_lock_kind), intent(inout) :: svar}
1589 @end multitable
1590
1591 @item @emph{See also}:
1592 @ref{omp_init_lock}, @ref{omp_test_lock}, @ref{omp_unset_lock}
1593
1594 @item @emph{Reference}:
1595 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.3.3.
1596 @end table
1597
1598
1599
1600 @node omp_test_lock
1601 @section @code{omp_test_lock} -- Test and set simple lock if available
1602 @table @asis
1603 @item @emph{Description}:
1604 Before setting a simple lock, the lock variable must be initialized by
1605 @code{omp_init_lock}.  Contrary to @code{omp_set_lock}, @code{omp_test_lock}
1606 does not block if the lock is not available.  This function returns
1607 @code{true} upon success, @code{false} otherwise.  Here, @code{true} and
1608 @code{false} represent their language-specific counterparts.
1609
1610 @item @emph{C/C++}:
1611 @multitable @columnfractions .20 .80
1612 @item @emph{Prototype}: @tab @code{int omp_test_lock(omp_lock_t *lock);}
1613 @end multitable
1614
1615 @item @emph{Fortran}:
1616 @multitable @columnfractions .20 .80
1617 @item @emph{Interface}: @tab @code{logical function omp_test_lock(svar)}
1618 @item                   @tab @code{integer(omp_lock_kind), intent(inout) :: svar}
1619 @end multitable
1620
1621 @item @emph{See also}:
1622 @ref{omp_init_lock}, @ref{omp_set_lock}, @ref{omp_unset_lock}
1623
1624 @item @emph{Reference}:
1625 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.3.5.
1626 @end table
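
One possible use of the non-blocking behavior is sketched below in C: every
thread makes a single attempt to take the lock and moves on if the lock is
busy.  The helper name @code{try_lock_accounting_holds} is hypothetical, and
the @code{#else} branch is a serial stand-in (an assumption) for builds
without @option{-fopenmp}.

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial stand-ins (assumption): a lock is a flag, nobody else competes.  */
typedef int omp_lock_t;
static void omp_init_lock(omp_lock_t *lock) { *lock = 0; }
static int omp_test_lock(omp_lock_t *lock)
{
    if (*lock)
        return 0;
    *lock = 1;
    return 1;
}
static void omp_unset_lock(omp_lock_t *lock) { *lock = 0; }
static void omp_destroy_lock(omp_lock_t *lock) { (void) lock; }
#endif

/* Every thread makes one non-blocking attempt to take the lock.  Returns 1
   if at least one thread succeeded and every thread was counted either as
   successful or as skipped, 0 otherwise.  */
int try_lock_accounting_holds(void)
{
    omp_lock_t lock;
    int got = 0, missed = 0, team = 0;

    omp_init_lock(&lock);

#pragma omp parallel
    {
        if (omp_test_lock(&lock)) {
            got++;                  /* protected by the lock */
            omp_unset_lock(&lock);
        } else {
#pragma omp atomic
            missed++;               /* lock was busy; do not wait */
        }

#pragma omp atomic
        team++;
    }

    omp_destroy_lock(&lock);
    return got >= 1 && got + missed == team;
}
```

How many threads succeed depends on timing, so only the sum of both counters
is predictable.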
1627
1628
1629
1630 @node omp_unset_lock
1631 @section @code{omp_unset_lock} -- Unset simple lock
1632 @table @asis
1633 @item @emph{Description}:
1634 A simple lock about to be unset must have been locked by @code{omp_set_lock}
1635 or @code{omp_test_lock} before.  In addition, the lock must be held by the
1636 thread calling @code{omp_unset_lock}.  Then, the lock becomes unlocked.  If one
1637 or more threads are waiting to set the lock, one of them is chosen to
1638 acquire it.
1639
1640 @item @emph{C/C++}:
1641 @multitable @columnfractions .20 .80
1642 @item @emph{Prototype}: @tab @code{void omp_unset_lock(omp_lock_t *lock);}
1643 @end multitable
1644
1645 @item @emph{Fortran}:
1646 @multitable @columnfractions .20 .80
1647 @item @emph{Interface}: @tab @code{subroutine omp_unset_lock(svar)}
1648 @item                   @tab @code{integer(omp_lock_kind), intent(inout) :: svar}
1649 @end multitable
1650
1651 @item @emph{See also}:
1652 @ref{omp_set_lock}, @ref{omp_test_lock}
1653
1654 @item @emph{Reference}:
1655 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.3.4.
1656 @end table
1657
1658
1659
1660 @node omp_destroy_lock
1661 @section @code{omp_destroy_lock} -- Destroy simple lock
1662 @table @asis
1663 @item @emph{Description}:
1664 Destroy a simple lock.  In order to be destroyed, a simple lock must be
1665 in the unlocked state.
1666
1667 @item @emph{C/C++}:
1668 @multitable @columnfractions .20 .80
1669 @item @emph{Prototype}: @tab @code{void omp_destroy_lock(omp_lock_t *lock);}
1670 @end multitable
1671
1672 @item @emph{Fortran}:
1673 @multitable @columnfractions .20 .80
1674 @item @emph{Interface}: @tab @code{subroutine omp_destroy_lock(svar)}
1675 @item                   @tab @code{integer(omp_lock_kind), intent(inout) :: svar}
1676 @end multitable
1677
1678 @item @emph{See also}:
1679 @ref{omp_init_lock}
1680
1681 @item @emph{Reference}:
1682 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.3.2.
1683 @end table
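
The simple lock routines are typically used together.  The C sketch below
walks through one full life cycle (initialize, set, unset, destroy) while
protecting a shared counter.  The helper name @code{locked_count} is
hypothetical, and the @code{#else} branch is a serial stand-in (an assumption)
for builds without @option{-fopenmp}.

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial stand-ins (assumption): with one thread, locking is a no-op.  */
typedef int omp_lock_t;
static void omp_init_lock(omp_lock_t *lock) { *lock = 0; }
static void omp_set_lock(omp_lock_t *lock) { *lock = 1; }
static void omp_unset_lock(omp_lock_t *lock) { *lock = 0; }
static void omp_destroy_lock(omp_lock_t *lock) { (void) lock; }
#endif

/* Add 1 to a shared total N times, guarding each update with a simple
   lock, and return the total.  */
long locked_count(int n)
{
    omp_lock_t lock;
    long total = 0;

    omp_init_lock(&lock);        /* lock starts out unlocked */

#pragma omp parallel for
    for (int i = 0; i < n; i++) {
        omp_set_lock(&lock);     /* blocks until the lock is available */
        total++;
        omp_unset_lock(&lock);   /* released by the thread that holds it */
    }

    omp_destroy_lock(&lock);     /* lock is unlocked again at this point */
    return total;
}
```

For a plain counter an @code{atomic} or @code{reduction} construct would
usually be simpler; the lock is used here only to show the call sequence.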
1684
1685
1686
1687 @node omp_init_nest_lock
1688 @section @code{omp_init_nest_lock} -- Initialize nested lock
1689 @table @asis
1690 @item @emph{Description}:
1691 Initialize a nested lock.  After initialization, the lock is in
1692 an unlocked state and the nesting count is set to zero.
1693
1694 @item @emph{C/C++}:
1695 @multitable @columnfractions .20 .80
1696 @item @emph{Prototype}: @tab @code{void omp_init_nest_lock(omp_nest_lock_t *lock);}
1697 @end multitable
1698
1699 @item @emph{Fortran}:
1700 @multitable @columnfractions .20 .80
1701 @item @emph{Interface}: @tab @code{subroutine omp_init_nest_lock(nvar)}
1702 @item                   @tab @code{integer(omp_nest_lock_kind), intent(out) :: nvar}
1703 @end multitable
1704
1705 @item @emph{See also}:
1706 @ref{omp_destroy_nest_lock}
1707
1708 @item @emph{Reference}:
1709 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.3.1.
1710 @end table
1711
1712
1713 @node omp_set_nest_lock
1714 @section @code{omp_set_nest_lock} -- Wait for and set nested lock
1715 @table @asis
1716 @item @emph{Description}:
1717 Before setting a nested lock, the lock variable must be initialized by
1718 @code{omp_init_nest_lock}.  The calling thread is blocked until the lock
1719 is available.  If the lock is already held by the current thread, the
1720 nesting count for the lock is incremented.
1721
1722 @item @emph{C/C++}:
1723 @multitable @columnfractions .20 .80
1724 @item @emph{Prototype}: @tab @code{void omp_set_nest_lock(omp_nest_lock_t *lock);}
1725 @end multitable
1726
1727 @item @emph{Fortran}:
1728 @multitable @columnfractions .20 .80
1729 @item @emph{Interface}: @tab @code{subroutine omp_set_nest_lock(nvar)}
1730 @item                   @tab @code{integer(omp_nest_lock_kind), intent(inout) :: nvar}
1731 @end multitable
1732
1733 @item @emph{See also}:
1734 @ref{omp_init_nest_lock}, @ref{omp_unset_nest_lock}
1735
1736 @item @emph{Reference}:
1737 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.3.3.
1738 @end table
1739
1740
1741
1742 @node omp_test_nest_lock
1743 @section @code{omp_test_nest_lock} -- Test and set nested lock if available
1744 @table @asis
1745 @item @emph{Description}:
1746 Before setting a nested lock, the lock variable must be initialized by
1747 @code{omp_init_nest_lock}.  Contrary to @code{omp_set_nest_lock},
1748 @code{omp_test_nest_lock} does not block if the lock is not available.
1749 If the lock is available or already held by the current thread, it is set
1750 and the new nesting count is returned.  Otherwise, the return value equals zero.
1751
1752 @item @emph{C/C++}:
1753 @multitable @columnfractions .20 .80
1754 @item @emph{Prototype}: @tab @code{int omp_test_nest_lock(omp_nest_lock_t *lock);}
1755 @end multitable
1756
1757 @item @emph{Fortran}:
1758 @multitable @columnfractions .20 .80
1759 @item @emph{Interface}: @tab @code{integer function omp_test_nest_lock(nvar)}
1760 @item                   @tab @code{integer(omp_nest_lock_kind), intent(inout) :: nvar}
1761 @end multitable
1762
1763
1764 @item @emph{See also}:
1765 @ref{omp_init_nest_lock}, @ref{omp_set_nest_lock}, @ref{omp_unset_nest_lock}
1766
1767 @item @emph{Reference}:
1768 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.3.5.
1769 @end table
1770
1771
1772
1773 @node omp_unset_nest_lock
1774 @section @code{omp_unset_nest_lock} -- Unset nested lock
1775 @table @asis
1776 @item @emph{Description}:
1777 A nested lock about to be unset must have been locked by @code{omp_set_nest_lock}
1778 or @code{omp_test_nest_lock} before.  In addition, the lock must be held by the
1779 thread calling @code{omp_unset_nest_lock}.  If the nesting count drops to zero, the
1780 lock becomes unlocked.  If one or more threads are waiting to set the lock,
1781 one of them is chosen to acquire it.
1782
1783 @item @emph{C/C++}:
1784 @multitable @columnfractions .20 .80
1785 @item @emph{Prototype}: @tab @code{void omp_unset_nest_lock(omp_nest_lock_t *lock);}
1786 @end multitable
1787
1788 @item @emph{Fortran}:
1789 @multitable @columnfractions .20 .80
1790 @item @emph{Interface}: @tab @code{subroutine omp_unset_nest_lock(nvar)}
1791 @item                   @tab @code{integer(omp_nest_lock_kind), intent(inout) :: nvar}
1792 @end multitable
1793
1794 @item @emph{See also}:
1795 @ref{omp_set_nest_lock}
1796
1797 @item @emph{Reference}:
1798 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.3.4.
1799 @end table
1800
1801
1802
1803 @node omp_destroy_nest_lock
1804 @section @code{omp_destroy_nest_lock} -- Destroy nested lock
1805 @table @asis
1806 @item @emph{Description}:
1807 Destroy a nested lock.  In order to be destroyed, a nested lock must be
1808 in the unlocked state and its nesting count must equal zero.
1809
1810 @item @emph{C/C++}:
1811 @multitable @columnfractions .20 .80
1812 @item @emph{Prototype}: @tab @code{void omp_destroy_nest_lock(omp_nest_lock_t *);}
1813 @end multitable
1814
1815 @item @emph{Fortran}:
1816 @multitable @columnfractions .20 .80
1817 @item @emph{Interface}: @tab @code{subroutine omp_destroy_nest_lock(nvar)}
1818 @item                   @tab @code{integer(omp_nest_lock_kind), intent(inout) :: nvar}
1819 @end multitable
1820
1821 @item @emph{See also}:
1822 @ref{omp_init_lock}
1823
1824 @item @emph{Reference}:
1825 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.3.2.
1826 @end table
1827
1828
1829
1830 @node omp_get_wtick
1831 @section @code{omp_get_wtick} -- Get timer precision
1832 @table @asis
1833 @item @emph{Description}:
1834 Gets the timer precision, i.e., the number of seconds between two
1835 successive clock ticks.
1836
1837 @item @emph{C/C++}:
1838 @multitable @columnfractions .20 .80
1839 @item @emph{Prototype}: @tab @code{double omp_get_wtick(void);}
1840 @end multitable
1841
1842 @item @emph{Fortran}:
1843 @multitable @columnfractions .20 .80
1844 @item @emph{Interface}: @tab @code{double precision function omp_get_wtick()}
1845 @end multitable
1846
1847 @item @emph{See also}:
1848 @ref{omp_get_wtime}
1849
1850 @item @emph{Reference}:
1851 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.4.2.
1852 @end table
1853
1854
1855
1856 @node omp_get_wtime
1857 @section @code{omp_get_wtime} -- Elapsed wall clock time
1858 @table @asis
1859 @item @emph{Description}:
Elapsed wall clock time in seconds.  The time is measured per thread; no
guarantee can be made that two distinct threads measure the same time.
Time is measured from some ``time in the past'', which is an arbitrary time
guaranteed not to change during the execution of the program.
1864
1865 @item @emph{C/C++}:
1866 @multitable @columnfractions .20 .80
1867 @item @emph{Prototype}: @tab @code{double omp_get_wtime(void);}
1868 @end multitable
1869
1870 @item @emph{Fortran}:
1871 @multitable @columnfractions .20 .80
1872 @item @emph{Interface}: @tab @code{double precision function omp_get_wtime()}
1873 @end multitable
1874
1875 @item @emph{See also}:
1876 @ref{omp_get_wtick}
1877
1878 @item @emph{Reference}:
1879 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 3.4.1.
1880 @end table
1881
1882
1883
1884 @c ---------------------------------------------------------------------
1885 @c OpenMP Environment Variables
1886 @c ---------------------------------------------------------------------
1887
1888 @node Environment Variables
1889 @chapter OpenMP Environment Variables
1890
The environment variables beginning with @env{OMP_} are defined by
1892 section 4 of the OpenMP specification in version 4.0, while those
1893 beginning with @env{GOMP_} are GNU extensions.
1894
1895 @menu
1896 * OMP_CANCELLATION::      Set whether cancellation is activated
1897 * OMP_DISPLAY_ENV::       Show OpenMP version and environment variables
1898 * OMP_DEFAULT_DEVICE::    Set the device used in target regions
1899 * OMP_DYNAMIC::           Dynamic adjustment of threads
1900 * OMP_MAX_ACTIVE_LEVELS:: Set the maximum number of nested parallel regions
1901 * OMP_NESTED::            Nested parallel regions
1902 * OMP_NUM_THREADS::       Specifies the number of threads to use
* OMP_PROC_BIND::         Whether threads may be moved between CPUs
* OMP_PLACES::            Specifies on which CPUs the threads should be placed
1905 * OMP_STACKSIZE::         Set default thread stack size
1906 * OMP_SCHEDULE::          How threads are scheduled
1907 * OMP_THREAD_LIMIT::      Set the maximum number of threads
1908 * OMP_WAIT_POLICY::       How waiting threads are handled
1909 * GOMP_CPU_AFFINITY::     Bind threads to specific CPUs
1910 * GOMP_STACKSIZE::        Set default thread stack size
1911 * GOMP_SPINCOUNT::        Set the busy-wait spin count
1912 @end menu
1913
1914
1915 @node OMP_CANCELLATION
1916 @section @env{OMP_CANCELLATION} -- Set whether cancellation is activated
1917 @cindex Environment Variable
1918 @table @asis
1919 @item @emph{Description}:
If set to @code{TRUE}, cancellation is activated.  If set to @code{FALSE} or
1921 if unset, cancellation is disabled and the @code{cancel} construct is ignored.
1922
1923 @item @emph{See also}:
1924 @ref{omp_get_cancellation}
1925
1926 @item @emph{Reference}:
1927 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 4.11
1928 @end table
1929
1930
1931
1932 @node OMP_DISPLAY_ENV
1933 @section @env{OMP_DISPLAY_ENV} -- Show OpenMP version and environment variables
1934 @cindex Environment Variable
1935 @table @asis
1936 @item @emph{Description}:
1937 If set to @code{TRUE}, the OpenMP version number and the values
1938 associated with the OpenMP environment variables are printed to @code{stderr}.
1939 If set to @code{VERBOSE}, it additionally shows the value of the environment
1940 variables which are GNU extensions.  If undefined or set to @code{FALSE},
1941 this information will not be shown.
1942
1943
1944 @item @emph{Reference}:
1945 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 4.12
1946 @end table
1947
1948
1949
1950 @node OMP_DEFAULT_DEVICE
1951 @section @env{OMP_DEFAULT_DEVICE} -- Set the device used in target regions
1952 @cindex Environment Variable
1953 @table @asis
1954 @item @emph{Description}:
1955 Set to choose the device which is used in a @code{target} region, unless the
1956 value is overridden by @code{omp_set_default_device} or by a @code{device}
clause.  The value shall be the nonnegative device number.  If no device with
1958 the given device number exists, the code is executed on the host.  If unset,
1959 device number 0 will be used.
1960
1961
1962 @item @emph{See also}:
@ref{omp_get_default_device}, @ref{omp_set_default_device}
1964
1965 @item @emph{Reference}:
@uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 4.13
1967 @end table
1968
1969
1970
1971 @node OMP_DYNAMIC
1972 @section @env{OMP_DYNAMIC} -- Dynamic adjustment of threads
1973 @cindex Environment Variable
1974 @table @asis
1975 @item @emph{Description}:
1976 Enable or disable the dynamic adjustment of the number of threads
1977 within a team.  The value of this environment variable shall be
1978 @code{TRUE} or @code{FALSE}.  If undefined, dynamic adjustment is
1979 disabled by default.
1980
1981 @item @emph{See also}:
1982 @ref{omp_set_dynamic}
1983
1984 @item @emph{Reference}:
1985 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 4.3
1986 @end table
1987
1988
1989
1990 @node OMP_MAX_ACTIVE_LEVELS
1991 @section @env{OMP_MAX_ACTIVE_LEVELS} -- Set the maximum number of nested parallel regions
1992 @cindex Environment Variable
1993 @table @asis
1994 @item @emph{Description}:
1995 Specifies the initial value for the maximum number of nested parallel
1996 regions.  The value of this variable shall be a positive integer.
1997 If undefined, the number of active levels is unlimited.
1998
1999 @item @emph{See also}:
2000 @ref{omp_set_max_active_levels}
2001
2002 @item @emph{Reference}:
2003 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 4.9
2004 @end table
2005
2006
2007
2008 @node OMP_NESTED
2009 @section @env{OMP_NESTED} -- Nested parallel regions
2010 @cindex Environment Variable
2011 @cindex Implementation specific setting
2012 @table @asis
2013 @item @emph{Description}:
2014 Enable or disable nested parallel regions, i.e., whether team members
2015 are allowed to create new teams.  The value of this environment variable
2016 shall be @code{TRUE} or @code{FALSE}.  If undefined, nested parallel
2017 regions are disabled by default.
2018
2019 @item @emph{See also}:
2020 @ref{omp_set_nested}
2021
2022 @item @emph{Reference}:
2023 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 4.6
2024 @end table
2025
2026
2027
2028 @node OMP_NUM_THREADS
2029 @section @env{OMP_NUM_THREADS} -- Specifies the number of threads to use
2030 @cindex Environment Variable
2031 @cindex Implementation specific setting
2032 @table @asis
2033 @item @emph{Description}:
Specifies the default number of threads to use in parallel regions.  The
value of this variable shall be a comma-separated list of positive integers;
each value specifies the number of threads to use for the corresponding nested
level.  If undefined, one thread per CPU is used.
2038
2039 @item @emph{See also}:
2040 @ref{omp_set_num_threads}
2041
2042 @item @emph{Reference}:
2043 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 4.2
2044 @end table
2045
2046
2047
2048 @node OMP_PROC_BIND
@section @env{OMP_PROC_BIND} -- Whether threads may be moved between CPUs
2050 @cindex Environment Variable
2051 @table @asis
2052 @item @emph{Description}:
Specifies whether threads may be moved between processors.  If set to
@code{TRUE}, OpenMP threads should not be moved; if set to @code{FALSE}
they may be moved.  Alternatively, a comma-separated list with the
values @code{MASTER}, @code{CLOSE} and @code{SPREAD} can be used to specify
the thread affinity policy for the corresponding nesting level.  With
@code{MASTER} the worker threads are in the same place partition as the
master thread.  With @code{CLOSE} they are kept close to the master thread
in contiguous place partitions.  With @code{SPREAD} a sparse distribution
across the place partitions is used.
2062
2063 When undefined, @env{OMP_PROC_BIND} defaults to @code{TRUE} when
2064 @env{OMP_PLACES} or @env{GOMP_CPU_AFFINITY} is set and @code{FALSE} otherwise.
2065
2066 @item @emph{See also}:
2067 @ref{OMP_PLACES}, @ref{GOMP_CPU_AFFINITY}, @ref{omp_get_proc_bind}
2068
2069 @item @emph{Reference}:
2070 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 4.4
2071 @end table
2072
2073
2074
2075 @node OMP_PLACES
@section @env{OMP_PLACES} -- Specifies on which CPUs the threads should be placed
2077 @cindex Environment Variable
2078 @table @asis
2079 @item @emph{Description}:
The thread placement can be specified either using an abstract name or by an
explicit list of the places.  The abstract names @code{threads}, @code{cores}
and @code{sockets} can be optionally followed by a positive number in
parentheses, which denotes how many places shall be created.  With
@code{threads} each place corresponds to a single hardware thread; with
@code{cores} to a single core with the corresponding number of hardware
threads; and with @code{sockets} to a single socket.  The resulting
placement can be shown by setting the @env{OMP_DISPLAY_ENV} environment
variable.
2089
Alternatively, the placement can be specified explicitly as a comma-separated
list of places.  A place is specified by a set of nonnegative numbers in curly
braces, denoting the hardware threads.  The hardware threads
belonging to a place can be specified either as a comma-separated list of
nonnegative thread numbers or using an interval.  Multiple places can likewise
be specified either by a comma-separated list of places or by an interval.  To
specify an interval, a colon followed by the count is placed after
the hardware thread number or the place.  Optionally, the count can be
followed by a colon and the stride number; otherwise a unit stride is
assumed.  For instance, the following all specify the same places list:
@code{"@{0,1,2@}, @{3,4,5@}, @{6,7,8@}, @{9,10,11@}"};
@code{"@{0:3@}, @{3:3@}, @{6:3@}, @{9:3@}"}; and @code{"@{0:3@}:4:3"}.
2102
2103 If @env{OMP_PLACES} and @env{GOMP_CPU_AFFINITY} are unset and
@env{OMP_PROC_BIND} is either unset or @code{FALSE}, threads may be moved
2105 between CPUs following no placement policy.
2106
2107 @item @emph{See also}:
2108 @ref{OMP_PROC_BIND}, @ref{GOMP_CPU_AFFINITY}, @ref{omp_get_proc_bind},
2109 @ref{OMP_DISPLAY_ENV}
2110
2111 @item @emph{Reference}:
2112 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 4.5
2113 @end table
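
The interval notation can be illustrated with a short C sketch.  It is not
taken from libgomp; the function name @code{expand_places} and the
@code{MAX_LEN} bound are hypothetical and chosen only for this example.  It
expands a value of the form @code{@{first:len@}:count:stride} into explicit
hardware thread numbers.

@smallexample
#define MAX_LEN 16   /* assumed upper bound on threads per place */

/* Fill places[p][t] with the hardware thread numbers described by
   "@{first:len@}:count:stride".  The caller must ensure len <= MAX_LEN
   and provide room for count places.  */
void
expand_places (int first, int len, int count, int stride,
               int places[][MAX_LEN])
@{
  for (int p = 0; p < count; p++)
    for (int t = 0; t < len; t++)
      places[p][t] = first + p * stride + t;
@}
@end smallexample

With the arguments @code{(0, 3, 4, 3)}, corresponding to
@code{"@{0:3@}:4:3"}, the four places are @{0,1,2@}, @{3,4,5@}, @{6,7,8@}
and @{9,10,11@}.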
2114
2115
2116
2117 @node OMP_STACKSIZE
2118 @section @env{OMP_STACKSIZE} -- Set default thread stack size
2119 @cindex Environment Variable
2120 @table @asis
2121 @item @emph{Description}:
2122 Set the default thread stack size in kilobytes, unless the number
2123 is suffixed by @code{B}, @code{K}, @code{M} or @code{G}, in which
2124 case the size is, respectively, in bytes, kilobytes, megabytes
2125 or gigabytes.  This is different from @code{pthread_attr_setstacksize}
2126 which gets the number of bytes as an argument.  If the stack size cannot
2127 be set due to system constraints, an error is reported and the initial
2128 stack size is left unchanged.  If undefined, the stack size is system
2129 dependent.
2130
2131 @item @emph{Reference}:
2132 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 4.7
2133 @end table
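
The unit rules for the value can be sketched in C as follows.  This is an
illustration only, not the parser used by libgomp; the function name
@code{stacksize_bytes} is hypothetical, and accepting lower-case suffixes
and white space around the suffix are assumptions of this sketch.

@smallexample
#include <ctype.h>
#include <stdlib.h>

/* Return the size in bytes denoted by an OMP_STACKSIZE-style string,
   or 0 if the string is malformed.  No overflow checking is done.  */
unsigned long long
stacksize_bytes (const char *s)
@{
  char *end;
  unsigned long long n, unit = 1024;   /* no suffix: kilobytes */

  if (!isdigit ((unsigned char) *s))
    return 0;
  n = strtoull (s, &end, 10);
  while (isspace ((unsigned char) *end))
    end++;
  switch (toupper ((unsigned char) *end))
    @{
    case 'B': unit = 1; end++; break;
    case 'K': unit = 1024ULL; end++; break;
    case 'M': unit = 1024ULL * 1024; end++; break;
    case 'G': unit = 1024ULL * 1024 * 1024; end++; break;
    case '\0': break;
    default: return 0;
    @}
  while (isspace ((unsigned char) *end))
    end++;
  return *end == '\0' ? n * unit : 0;
@}
@end smallexample

For example, @code{stacksize_bytes ("512")} yields 524288, since a bare
number is taken as kilobytes, while @code{stacksize_bytes ("4M")} yields
4194304.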
2134
2135
2136
2137 @node OMP_SCHEDULE
2138 @section @env{OMP_SCHEDULE} -- How threads are scheduled
2139 @cindex Environment Variable
2140 @cindex Implementation specific setting
2141 @table @asis
2142 @item @emph{Description}:
Specifies the schedule type and chunk size.
The value of the variable shall have the form @code{type[,chunk]}, where
@code{type} is one of @code{static}, @code{dynamic}, @code{guided} or @code{auto}.
The optional @code{chunk} size shall be a positive integer.  If undefined,
dynamic scheduling and a chunk size of 1 are used.
2148
2149 @item @emph{See also}:
2150 @ref{omp_set_schedule}
2151
2152 @item @emph{Reference}:
2153 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Sections 2.7.1 and 4.1
2154 @end table
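
The @code{type[,chunk]} form can be checked with a small C sketch such as
the one below.  It is an illustration, not the code libgomp uses; the names
@code{parse_schedule}, @code{struct schedule} and the @code{KIND_} constants
are hypothetical, and the sketch assumes lower-case type names only.

@smallexample
#include <stdlib.h>
#include <string.h>

enum sched_kind @{ KIND_STATIC, KIND_DYNAMIC, KIND_GUIDED, KIND_AUTO @};

struct schedule
@{
  enum sched_kind kind;
  long chunk;               /* 0 if no chunk size was given */
@};

/* Parse "type[,chunk]".  Return 0 on success, -1 if malformed.  */
int
parse_schedule (const char *s, struct schedule *out)
@{
  static const char *const names[]
    = @{ "static", "dynamic", "guided", "auto" @};

  for (int k = 0; k < 4; k++)
    @{
      size_t len = strlen (names[k]);
      const char *rest;
      char *end;
      long chunk = 0;

      if (strncmp (s, names[k], len) != 0)
        continue;
      rest = s + len;
      if (*rest == ',')
        @{
          /* The chunk size must be a positive integer.  */
          chunk = strtol (rest + 1, &end, 10);
          if (end == rest + 1 || *end != '\0' || chunk <= 0)
            return -1;
        @}
      else if (*rest != '\0')
        return -1;
      out->kind = (enum sched_kind) k;
      out->chunk = chunk;
      return 0;
    @}
  return -1;
@}
@end smallexample

A value such as @code{"dynamic,4"} is accepted with a chunk size of 4,
whereas @code{"guided,0"} is rejected because the chunk size is not
positive.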
2155
2156
2157
2158 @node OMP_THREAD_LIMIT
2159 @section @env{OMP_THREAD_LIMIT} -- Set the maximum number of threads
2160 @cindex Environment Variable
2161 @table @asis
2162 @item @emph{Description}:
Specifies the maximum number of threads to use for the whole program.  The
2164 value of this variable shall be a positive integer.  If undefined,
2165 the number of threads is not limited.
2166
2167 @item @emph{See also}:
2168 @ref{OMP_NUM_THREADS}, @ref{omp_get_thread_limit}
2169
2170 @item @emph{Reference}:
2171 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 4.10
2172 @end table
2173
2174
2175
2176 @node OMP_WAIT_POLICY
2177 @section @env{OMP_WAIT_POLICY} -- How waiting threads are handled
2178 @cindex Environment Variable
2179 @table @asis
2180 @item @emph{Description}:
Specifies whether waiting threads should be active or passive.  If
the value is @code{PASSIVE}, waiting threads should not consume CPU
power while waiting; the value @code{ACTIVE} specifies that
they should.  If undefined, threads wait actively for a short time
before waiting passively.
2186
2187 @item @emph{See also}:
2188 @ref{GOMP_SPINCOUNT}
2189
2190 @item @emph{Reference}:
2191 @uref{http://www.openmp.org/, OpenMP specification v4.0}, Section 4.8
2192 @end table
2193
2194
2195
2196 @node GOMP_CPU_AFFINITY
2197 @section @env{GOMP_CPU_AFFINITY} -- Bind threads to specific CPUs
2198 @cindex Environment Variable
2199 @table @asis
2200 @item @emph{Description}:
2201 Binds threads to specific CPUs.  The variable should contain a space-separated
2202 or comma-separated list of CPUs.  This list may contain different kinds of
2203 entries: either single CPU numbers in any order, a range of CPUs (M-N)
2204 or a range with some stride (M-N:S).  CPU numbers are zero based.  For example,
2205 @code{GOMP_CPU_AFFINITY="0 3 1-2 4-15:2"} will bind the initial thread
2206 to CPU 0, the second to CPU 3, the third to CPU 1, the fourth to
2207 CPU 2, the fifth to CPU 4, the sixth through tenth to CPUs 6, 8, 10, 12,
2208 and 14 respectively and then start assigning back from the beginning of
2209 the list.  @code{GOMP_CPU_AFFINITY=0} binds all threads to CPU 0.
2210
2211 There is no GNU OpenMP library routine to determine whether a CPU affinity
2212 specification is in effect.  As a workaround, language-specific library
2213 functions, e.g., @code{getenv} in C or @code{GET_ENVIRONMENT_VARIABLE} in
Fortran, may be used to query the setting of the @env{GOMP_CPU_AFFINITY}
2215 environment variable.  A defined CPU affinity on startup cannot be changed
2216 or disabled during the runtime of the application.
2217
If both @env{GOMP_CPU_AFFINITY} and @env{OMP_PROC_BIND} are set,
@env{OMP_PROC_BIND} has a higher precedence.  If neither variable is
set, or when @env{OMP_PROC_BIND} is set to
@code{FALSE}, the host system will handle the assignment of threads to CPUs.
2222
2223 @item @emph{See also}:
2224 @ref{OMP_PLACES}, @ref{OMP_PROC_BIND}
2225 @end table
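
The list syntax can be made concrete with the following C sketch, which
expands a @env{GOMP_CPU_AFFINITY}-style string into the sequence of CPU
numbers it denotes.  It is an illustration under stated assumptions, not
libgomp's parser; the function name @code{expand_affinity} is hypothetical.

@smallexample
#include <stdlib.h>

/* Expand entries of the form N, M-N or M-N:S, separated by spaces or
   commas, into cpus[].  Return the number of CPUs stored, or -1 if the
   string is malformed or more than max CPUs would be stored.  */
int
expand_affinity (const char *s, int *cpus, int max)
@{
  int n = 0;

  while (*s)
    @{
      char *end;
      long lo, hi, stride = 1;

      while (*s == ' ' || *s == ',')
        s++;
      if (!*s)
        break;
      lo = strtol (s, &end, 10);
      if (end == s || lo < 0)
        return -1;
      hi = lo;
      s = end;
      if (*s == '-')
        @{
          hi = strtol (s + 1, &end, 10);
          if (end == s + 1 || hi < lo)
            return -1;
          s = end;
          if (*s == ':')
            @{
              stride = strtol (s + 1, &end, 10);
              if (end == s + 1 || stride <= 0)
                return -1;
              s = end;
            @}
        @}
      for (long c = lo; c <= hi; c += stride)
        @{
          if (n == max)
            return -1;
          cpus[n++] = (int) c;
        @}
    @}
  return n;
@}
@end smallexample

For @code{"0 3 1-2 4-15:2"} the function stores the ten CPUs 0, 3, 1, 2, 4,
6, 8, 10, 12 and 14.  Under the wrap-around rule described above, thread
number @code{i} (counting from zero) would then be bound to
@code{cpus[i % n]}.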
2226
2227
2228
2229 @node GOMP_STACKSIZE
2230 @section @env{GOMP_STACKSIZE} -- Set default thread stack size
2231 @cindex Environment Variable
2232 @cindex Implementation specific setting
2233 @table @asis
2234 @item @emph{Description}:
2235 Set the default thread stack size in kilobytes.  This is different from
2236 @code{pthread_attr_setstacksize} which gets the number of bytes as an
2237 argument.  If the stack size cannot be set due to system constraints, an
2238 error is reported and the initial stack size is left unchanged.  If undefined,
2239 the stack size is system dependent.
2240
2241 @item @emph{See also}:
2242 @ref{OMP_STACKSIZE}
2243
2244 @item @emph{Reference}:
@uref{http://gcc.gnu.org/ml/gcc-patches/2006-06/msg00493.html,
GCC Patches mailing list},
@uref{http://gcc.gnu.org/ml/gcc-patches/2006-06/msg00496.html,
GCC Patches mailing list}
2249 @end table
2250
2251
2252
2253 @node GOMP_SPINCOUNT
2254 @section @env{GOMP_SPINCOUNT} -- Set the busy-wait spin count
2255 @cindex Environment Variable
2256 @cindex Implementation specific setting
2257 @table @asis
2258 @item @emph{Description}:
Determines how long a thread waits actively, consuming CPU power,
before waiting passively without consuming CPU power.  The value may be
either @code{INFINITE} or @code{INFINITY}, to always wait actively, or an
integer which gives the number of spins of the busy-wait loop.  The
integer may optionally be followed by one of the following suffixes acting
2264 as multiplication factors: @code{k} (kilo, thousand), @code{M} (mega,
2265 million), @code{G} (giga, billion), or @code{T} (tera, trillion).
2266 If undefined, 0 is used when @env{OMP_WAIT_POLICY} is @code{PASSIVE},
2267 300,000 is used when @env{OMP_WAIT_POLICY} is undefined and
2268 30 billion is used when @env{OMP_WAIT_POLICY} is @code{ACTIVE}.
2269 If there are more OpenMP threads than available CPUs, 1000 and 100
2270 spins are used for @env{OMP_WAIT_POLICY} being @code{ACTIVE} or
2271 undefined, respectively; unless the @env{GOMP_SPINCOUNT} is lower
2272 or @env{OMP_WAIT_POLICY} is @code{PASSIVE}.
2273
2274 @item @emph{See also}:
2275 @ref{OMP_WAIT_POLICY}
2276 @end table
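
The defaults listed above can be summarized as a small decision function.
The C sketch below merely restates the documented values; it is not the
code used by libgomp, and the names @code{default_spincount} and
@code{enum wait_policy} are hypothetical.

@smallexample
enum wait_policy @{ WAIT_UNDEFINED, WAIT_PASSIVE, WAIT_ACTIVE @};

/* Return the spin count used when GOMP_SPINCOUNT is undefined.
   OVERSUBSCRIBED is nonzero if there are more OpenMP threads than
   available CPUs.  */
unsigned long long
default_spincount (enum wait_policy policy, int oversubscribed)
@{
  if (policy == WAIT_PASSIVE)
    return 0;
  if (oversubscribed)
    return policy == WAIT_ACTIVE ? 1000ULL : 100ULL;
  return policy == WAIT_ACTIVE ? 30000000000ULL : 300000ULL;
@}
@end smallexample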
2277
2278
2279
2280 @c ---------------------------------------------------------------------
2281 @c The libgomp ABI
2282 @c ---------------------------------------------------------------------
2283
2284 @node The libgomp ABI
2285 @chapter The libgomp ABI
2286
2287 The following sections present notes on the external ABI as
2288 presented by libgomp.  Only maintainers should need them.
2289
2290 @menu
2291 * Implementing MASTER construct::
2292 * Implementing CRITICAL construct::
2293 * Implementing ATOMIC construct::
2294 * Implementing FLUSH construct::
2295 * Implementing BARRIER construct::
2296 * Implementing THREADPRIVATE construct::
2297 * Implementing PRIVATE clause::
2298 * Implementing FIRSTPRIVATE LASTPRIVATE COPYIN and COPYPRIVATE clauses::
2299 * Implementing REDUCTION clause::
2300 * Implementing PARALLEL construct::
2301 * Implementing FOR construct::
2302 * Implementing ORDERED construct::
2303 * Implementing SECTIONS construct::
2304 * Implementing SINGLE construct::
2305 * Implementing OpenACC's PARALLEL construct::
2306 @end menu
2307
2308
2309 @node Implementing MASTER construct
2310 @section Implementing MASTER construct
2311
2312 @smallexample
2313 if (omp_get_thread_num () == 0)
2314   block
2315 @end smallexample
2316
2317 Alternately, we generate two copies of the parallel subfunction
2318 and only include this in the version run by the master thread.
2319 Surely this is not worthwhile though...
2320
2321
2322
2323 @node Implementing CRITICAL construct
2324 @section Implementing CRITICAL construct
2325
2326 Without a specified name,
2327
2328 @smallexample
2329   void GOMP_critical_start (void);
2330   void GOMP_critical_end (void);
2331 @end smallexample
2332
2333 so that we don't get COPY relocations from libgomp to the main
2334 application.
2335
2336 With a specified name, use omp_set_lock and omp_unset_lock with
2337 name being transformed into a variable declared like
2338
2339 @smallexample
2340   omp_lock_t gomp_critical_user_<name> __attribute__((common))
2341 @end smallexample
2342
2343 Ideally the ABI would specify that all zero is a valid unlocked
2344 state, and so we wouldn't need to initialize this at
2345 startup.
2346
2347
2348
2349 @node Implementing ATOMIC construct
2350 @section Implementing ATOMIC construct
2351
2352 The target should implement the @code{__sync} builtins.
2353
2354 Failing that we could add
2355
2356 @smallexample
2357   void GOMP_atomic_enter (void)
2358   void GOMP_atomic_exit (void)
2359 @end smallexample
2360
2361 which reuses the regular lock code, but with yet another lock
2362 object private to the library.
2363
2364
2365
2366 @node Implementing FLUSH construct
2367 @section Implementing FLUSH construct
2368
2369 Expands to the @code{__sync_synchronize} builtin.
2370
2371
2372
2373 @node Implementing BARRIER construct
2374 @section Implementing BARRIER construct
2375
2376 @smallexample
2377   void GOMP_barrier (void)
2378 @end smallexample
2379
2380
2381 @node Implementing THREADPRIVATE construct
2382 @section Implementing THREADPRIVATE construct
2383
In @emph{most} cases we can map this directly to @code{__thread}.  Except
2385 that OMP allows constructors for C++ objects.  We can either
2386 refuse to support this (how often is it used?) or we can
2387 implement something akin to .ctors.
2388
2389 Even more ideally, this ctor feature is handled by extensions
2390 to the main pthreads library.  Failing that, we can have a set
2391 of entry points to register ctor functions to be called.
2392
2393
2394
2395 @node Implementing PRIVATE clause
2396 @section Implementing PRIVATE clause
2397
2398 In association with a PARALLEL, or within the lexical extent
2399 of a PARALLEL block, the variable becomes a local variable in
2400 the parallel subfunction.
2401
2402 In association with FOR or SECTIONS blocks, create a new
2403 automatic variable within the current function.  This preserves
2404 the semantic of new variable creation.
2405
2406
2407
2408 @node Implementing FIRSTPRIVATE LASTPRIVATE COPYIN and COPYPRIVATE clauses
2409 @section Implementing FIRSTPRIVATE LASTPRIVATE COPYIN and COPYPRIVATE clauses
2410
2411 This seems simple enough for PARALLEL blocks.  Create a private
2412 struct for communicating between the parent and subfunction.
2413 In the parent, copy in values for scalar and "small" structs;
copy in addresses for other TREE_ADDRESSABLE types.  In the
2415 subfunction, copy the value into the local variable.
2416
2417 It is not clear what to do with bare FOR or SECTION blocks.
2418 The only thing I can figure is that we do something like:
2419
2420 @smallexample
2421 #pragma omp for firstprivate(x) lastprivate(y)
2422 for (int i = 0; i < n; ++i)
2423   body;
2424 @end smallexample
2425
2426 which becomes
2427
2428 @smallexample
2429 @{
2430   int x = x, y;
2431
2432   // for stuff
2433
2434   if (i == n)
2435     y = y;
2436 @}
2437 @end smallexample
2438
2439 where the "x=x" and "y=y" assignments actually have different
2440 uids for the two variables, i.e. not something you could write
2441 directly in C.  Presumably this only makes sense if the "outer"
2442 x and y are global variables.
2443
2444 COPYPRIVATE would work the same way, except the structure
2445 broadcast would have to happen via SINGLE machinery instead.
2446
2447
2448
2449 @node Implementing REDUCTION clause
2450 @section Implementing REDUCTION clause
2451
2452 The private struct mentioned in the previous section should have
2453 a pointer to an array of the type of the variable, indexed by the
2454 thread's @var{team_id}.  The thread stores its final value into the
2455 array, and after the barrier, the master thread iterates over the
2456 array to collect the values.
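
Written by hand in user code, this scheme might look like the following
sketch.  It is not the code GCC generates; the function name
@code{sum_1_to_n} is hypothetical, and the fallback definitions are an
assumption added so that the example also builds without @option{-fopenmp},
in which case it runs with a single thread.

@smallexample
#include <stdlib.h>
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial fallbacks so the sketch builds without OpenMP support.  */
static int omp_get_thread_num (void) @{ return 0; @}
static int omp_get_num_threads (void) @{ return 1; @}
static int omp_get_max_threads (void) @{ return 1; @}
#endif

/* Sum the integers 1..n; return -1 if memory allocation fails.  */
long
sum_1_to_n (long n)
@{
  long total = 0;
  /* One slot per possible team member, indexed by team id.  */
  long *partial = calloc (omp_get_max_threads (), sizeof *partial);

  if (partial == NULL)
    return -1;
  #pragma omp parallel
  @{
    int tid = omp_get_thread_num ();
    int nthr = omp_get_num_threads ();
    long local = 0;

    for (long i = tid + 1; i <= n; i += nthr)
      local += i;
    partial[tid] = local;      /* store this thread's final value */
    #pragma omp barrier
    #pragma omp master
    for (int t = 0; t < nthr; t++)
      total += partial[t];     /* master collects the values */
  @}
  free (partial);
  return total;
@}
@end smallexample

Each thread writes only its own array element, so no locking is needed
until the master thread combines the results after the barrier.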
2457
2458
2459 @node Implementing PARALLEL construct
2460 @section Implementing PARALLEL construct
2461
2462 @smallexample
2463   #pragma omp parallel
2464   @{
2465     body;
2466   @}
2467 @end smallexample
2468
2469 becomes
2470
2471 @smallexample
2472   void subfunction (void *data)
2473   @{
2474     use data;
2475     body;
2476   @}
2477
2478   setup data;
2479   GOMP_parallel_start (subfunction, &data, num_threads);
2480   subfunction (&data);
2481   GOMP_parallel_end ();
2482 @end smallexample
2483
2484 @smallexample
2485   void GOMP_parallel_start (void (*fn)(void *), void *data, unsigned num_threads)
2486 @end smallexample
2487
2488 The @var{FN} argument is the subfunction to be run in parallel.
2489
2490 The @var{DATA} argument is a pointer to a structure used to
2491 communicate data in and out of the subfunction, as discussed
2492 above with respect to FIRSTPRIVATE et al.
2493
2494 The @var{NUM_THREADS} argument is 1 if an IF clause is present
2495 and false, or the value of the NUM_THREADS clause, if
2496 present, or 0.
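
The rule for this argument can be restated as a small C helper.  This is
only a sketch of the rule as described here; the function name
@code{num_threads_arg} and its parameters are hypothetical.

@smallexample
/* HAS_IF and HAS_NUM_THREADS say whether the respective clause is
   present; IF_VALUE and CLAUSE_VALUE are the evaluated clause
   expressions.  A result of 0 leaves the choice to the runtime.  */
unsigned
num_threads_arg (int has_if, int if_value,
                 int has_num_threads, unsigned clause_value)
@{
  if (has_if && !if_value)
    return 1;
  if (has_num_threads)
    return clause_value;
  return 0;
@}
@end smallexample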
2497
2498 The function needs to create the appropriate number of
2499 threads and/or launch them from the dock.  It needs to
2500 create the team structure and assign team ids.
2501
2502 @smallexample
2503   void GOMP_parallel_end (void)
2504 @end smallexample
2505
2506 Tears down the team and returns us to the previous @code{omp_in_parallel()} state.
2507
2508
2509
2510 @node Implementing FOR construct
2511 @section Implementing FOR construct
2512
2513 @smallexample
2514   #pragma omp parallel for
2515   for (i = lb; i <= ub; i++)
2516     body;
2517 @end smallexample
2518
2519 becomes
2520
2521 @smallexample
2522   void subfunction (void *data)
2523   @{
2524     long _s0, _e0;
2525     while (GOMP_loop_static_next (&_s0, &_e0))
2526     @{
2527       long _e1 = _e0, i;
2528       for (i = _s0; i < _e1; i++)
2529         body;
2530     @}
2531     GOMP_loop_end_nowait ();
2532   @}
2533
2534   GOMP_parallel_loop_static (subfunction, NULL, 0, lb, ub+1, 1, 0);
2535   subfunction (NULL);
2536   GOMP_parallel_end ();
2537 @end smallexample
2538
2539 @smallexample
2540   #pragma omp for schedule(runtime)
2541   for (i = 0; i < n; i++)
2542     body;
2543 @end smallexample
2544
2545 becomes
2546
2547 @smallexample
2548   @{
2549     long i, _s0, _e0;
2550     if (GOMP_loop_runtime_start (0, n, 1, &_s0, &_e0))
2551       do @{
2552         long _e1 = _e0;
        for (i = _s0; i < _e1; i++)
          body;
      @} while (GOMP_loop_runtime_next (&_s0, &_e0));
2556     GOMP_loop_end ();
2557   @}
2558 @end smallexample
2559
2560 Note that while it looks like there is trickiness to propagating
2561 a non-constant STEP, there isn't really.  We're explicitly allowed
2562 to evaluate it as many times as we want, and any variables involved
2563 should automatically be handled as PRIVATE or SHARED like any other
2564 variables.  So the expression should remain evaluable in the
2565 subfunction.  We can also pull it into a local variable if we like,
but since it's supposed to remain unchanged, we can also not if we like.
2567
2568 If we have SCHEDULE(STATIC), and no ORDERED, then we ought to be
2569 able to get away with no work-sharing context at all, since we can
2570 simply perform the arithmetic directly in each thread to divide up
2571 the iterations.  Which would mean that we wouldn't need to call any
2572 of these routines.
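
As a sketch of such arithmetic, the following C function computes the
block of iterations that one thread would execute.  This is one possible
partitioning, shown for illustration; it is not necessarily the split that
GCC or libgomp uses, and the function name @code{static_chunk} is
hypothetical.

@smallexample
/* Divide iterations 0..n-1 among nthreads threads in contiguous
   blocks.  Thread tid executes the half-open range [*start, *end).
   The first n % nthreads threads receive one extra iteration.  */
void
static_chunk (long n, int nthreads, int tid, long *start, long *end)
@{
  long q = n / nthreads;
  long r = n % nthreads;

  *start = tid * q + (tid < r ? tid : r);
  *end = *start + q + (tid < r ? 1 : 0);
@}
@end smallexample

With 10 iterations and 4 threads, the threads receive the ranges [0,3),
[3,6), [6,8) and [8,10), so every iteration is executed exactly once
without any shared work-sharing state.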
2573
2574 There are separate routines for handling loops with an ORDERED
2575 clause.  Bookkeeping for that is non-trivial...
2576
2577
2578
2579 @node Implementing ORDERED construct
2580 @section Implementing ORDERED construct
2581
2582 @smallexample
2583   void GOMP_ordered_start (void)
2584   void GOMP_ordered_end (void)
2585 @end smallexample
2586
2587
2588
2589 @node Implementing SECTIONS construct
2590 @section Implementing SECTIONS construct
2591
A block like
2593
2594 @smallexample
2595   #pragma omp sections
2596   @{
2597     #pragma omp section
2598     stmt1;
2599     #pragma omp section
2600     stmt2;
2601     #pragma omp section
2602     stmt3;
2603   @}
2604 @end smallexample
2605
2606 becomes
2607
2608 @smallexample
2609   for (i = GOMP_sections_start (3); i != 0; i = GOMP_sections_next ())
2610     switch (i)
2611       @{
2612       case 1:
2613         stmt1;
2614         break;
2615       case 2:
2616         stmt2;
2617         break;
2618       case 3:
2619         stmt3;
2620         break;
2621       @}
2622   GOMP_barrier ();
2623 @end smallexample
2624
2625
2626 @node Implementing SINGLE construct
2627 @section Implementing SINGLE construct
2628
2629 A block like
2630
2631 @smallexample
2632   #pragma omp single
2633   @{
2634     body;
2635   @}
2636 @end smallexample
2637
2638 becomes
2639
2640 @smallexample
2641   if (GOMP_single_start ())
2642     body;
2643   GOMP_barrier ();
2644 @end smallexample
2645
2646 while
2647
2648 @smallexample
2649   #pragma omp single copyprivate(x)
2650     body;
2651 @end smallexample
2652
2653 becomes
2654
2655 @smallexample
2656   datap = GOMP_single_copy_start ();
2657   if (datap == NULL)
2658     @{
2659       body;
2660       data.x = x;
2661       GOMP_single_copy_end (&data);
2662     @}
2663   else
2664     x = datap->x;
2665   GOMP_barrier ();
2666 @end smallexample
2667
2668
2669
2670 @node Implementing OpenACC's PARALLEL construct
2671 @section Implementing OpenACC's PARALLEL construct
2672
2673 @smallexample
2674   void GOACC_parallel ()
2675 @end smallexample
2676
2677
2678
2679 @c ---------------------------------------------------------------------
2680 @c Reporting Bugs
2681 @c ---------------------------------------------------------------------
2682
2683 @node Reporting Bugs
2684 @chapter Reporting Bugs
2685
2686 Bugs in the GNU OpenACC or OpenMP implementation should be reported via
2687 @uref{http://gcc.gnu.org/bugzilla/, Bugzilla}.  For OpenMP cases, please add
2688 "openmp" to the keywords field in the bug report.
2689
2690
2691
2692 @c ---------------------------------------------------------------------
2693 @c GNU General Public License
2694 @c ---------------------------------------------------------------------
2695
2696 @include gpl_v3.texi
2697
2698
2699
2700 @c ---------------------------------------------------------------------
2701 @c GNU Free Documentation License
2702 @c ---------------------------------------------------------------------
2703
2704 @include fdl.texi
2705
2706
2707
2708 @c ---------------------------------------------------------------------
2709 @c Funding Free Software
2710 @c ---------------------------------------------------------------------
2711
2712 @include funding.texi
2713
2714 @c ---------------------------------------------------------------------
2715 @c Index
2716 @c ---------------------------------------------------------------------
2717
2718 @node Library Index
2719 @unnumbered Library Index
2720
2721 @printindex cp
2722
2723 @bye