omp_get_active_level
– Number of active parallel regions
omp_get_ancestor_thread_num
– Ancestor thread ID
omp_get_cancellation
– Whether cancellation support is enabled
omp_get_default_device
– Get the default device for target regions
omp_get_dynamic
– Dynamic teams setting
omp_get_level
– Obtain the current nesting level
omp_get_max_active_levels
– Maximum number of active regions
omp_get_max_task_priority
– Maximum priority value
omp_get_max_threads
– Maximum number of threads of parallel region
omp_get_nested
– Nested parallel regions
omp_get_num_devices
– Number of target devices
omp_get_num_procs
– Number of processors online
omp_get_num_teams
– Number of teams
omp_get_num_threads
– Size of the active team
omp_get_proc_bind
– Whether threads may be moved between CPUs
omp_get_schedule
– Obtain the runtime scheduling method
omp_get_team_num
– Get team number
omp_get_team_size
– Number of threads in a team
omp_get_thread_limit
– Maximum number of threads
omp_get_thread_num
– Current thread ID
omp_in_parallel
– Whether a parallel region is active
omp_in_final
– Whether in final or included task region
omp_is_initial_device
– Whether executing on the host device
omp_set_default_device
– Set the default device for target regions
omp_set_dynamic
– Enable/disable dynamic teams
omp_set_max_active_levels
– Limits the number of active parallel regions
omp_set_nested
– Enable/disable nested parallel regions
omp_set_num_threads
– Set upper team size limit
omp_set_schedule
– Set the runtime scheduling method
omp_init_lock
– Initialize simple lock
omp_set_lock
– Wait for and set simple lock
omp_test_lock
– Test and set simple lock if available
omp_unset_lock
– Unset simple lock
omp_destroy_lock
– Destroy simple lock
omp_init_nest_lock
– Initialize nested lock
omp_set_nest_lock
– Wait for and set nested lock
omp_test_nest_lock
– Test and set nested lock if available
omp_unset_nest_lock
– Unset nested lock
omp_destroy_nest_lock
– Destroy nested lock
omp_get_wtick
– Get timer precision
omp_get_wtime
– Elapsed wall clock time
acc_get_num_devices
– Get number of devices for given device type
acc_set_device_type
– Set type of device accelerator to use.
acc_get_device_type
– Get type of device accelerator to be used.
acc_set_device_num
– Set device number to use.
acc_get_device_num
– Get device number to be used.
acc_async_test
– Test for completion of a specific asynchronous operation.
acc_async_test_all
– Test for completion of all asynchronous operations.
acc_wait
– Wait for completion of a specific asynchronous operation.
acc_wait_all
– Wait for completion of all asynchronous operations.
acc_wait_all_async
– Enqueue, on one asynchronous queue, a wait for completion of all other asynchronous operations.
acc_wait_async
– Enqueue, on one asynchronous queue, a wait for completion of a specific asynchronous operation.
acc_init
– Initialize runtime for a specific device type.
acc_shutdown
– Shut down the runtime for a specific device type.
acc_on_device
– Whether executing on a particular device
acc_malloc
– Allocate device memory.
acc_free
– Free device memory.
acc_copyin
– Allocate device memory and copy host memory to it.
acc_present_or_copyin
– If the data is not present on the device, allocate device memory and copy from host memory.
acc_create
– Allocate device memory and map it to host memory.
acc_present_or_create
– If the data is not present on the device, allocate device memory and map it to host memory.
acc_copyout
– Copy device memory to host memory.
acc_delete
– Unmap and free device memory associated with host memory.
acc_update_device
– Update device memory from mapped host memory.
acc_update_self
– Update host memory from mapped device memory.
acc_map_data
– Map previously allocated device memory to host memory.
acc_unmap_data
– Unmap device memory from host memory.
acc_deviceptr
– Get device pointer associated with specific host address.
acc_hostptr
– Get host pointer associated with specific device address.
acc_is_present
– Indicate whether host variable / array is present on device.
acc_memcpy_to_device
– Copy host memory to device memory.
acc_memcpy_from_device
– Copy device memory to host memory.
acc_get_current_cuda_device
– Get CUDA device handle.
acc_get_current_cuda_context
– Get CUDA context handle.
acc_get_cuda_stream
– Get CUDA stream handle.
acc_set_cuda_stream
– Set CUDA stream handle.
This manual documents the use of libgomp, the GNU Offloading and Multi Processing Runtime Library. The library comprises the GNU implementation of the OpenMP Application Programming Interface (API) for multi-platform shared-memory parallel programming in C/C++ and Fortran, and the GNU implementation of the OpenACC API for offloading code to accelerator devices in C/C++ and Fortran.
Originally, libgomp implemented the GNU OpenMP Runtime Library. Support for OpenACC and for offloading (through both OpenACC and the target construct of OpenMP 4) was added later on top of this base, and the library was renamed the GNU Offloading and Multi Processing Runtime Library.
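As a small illustration of the OpenMP runtime routines indexed in this manual, the following C sketch guards a shared counter with a simple lock and queries the maximum team size. The function names locked_count and max_team_size are hypothetical, and the serial fallback definitions exist only so that the sketch also builds when the compiler is invoked without -fopenmp; neither is part of libgomp.

```c
#include <assert.h>

#ifdef _OPENMP
#include <omp.h>
#else
/* Assumption: serial stand-ins used only when OpenMP is disabled.
   They mimic the routine names but are NOT the libgomp implementations. */
typedef int omp_lock_t;
static void omp_init_lock(omp_lock_t *lock)    { *lock = 0; }
static void omp_set_lock(omp_lock_t *lock)     { *lock = 1; }
static void omp_unset_lock(omp_lock_t *lock)   { *lock = 0; }
static void omp_destroy_lock(omp_lock_t *lock) { *lock = 0; }
static int  omp_get_max_threads(void)          { return 1; }
#endif

/* Hypothetical helper: increment a shared counter `iterations` times,
   serializing each update with a simple lock. */
long locked_count(long iterations)
{
    omp_lock_t lock;
    long counter = 0;   /* shared among the threads of the team */
    long i;

    assert(iterations >= 0);
    omp_init_lock(&lock);

    /* Without -fopenmp this pragma is ignored and the loop runs serially. */
    #pragma omp parallel for
    for (i = 0; i < iterations; i++) {
        omp_set_lock(&lock);     /* wait for and set the lock */
        counter++;               /* protected update */
        omp_unset_lock(&lock);   /* release the lock */
    }

    omp_destroy_lock(&lock);
    return counter;
}

/* Hypothetical helper: upper bound on the size of a new team. */
int max_team_size(void)
{
    return omp_get_max_threads();
}
```

With GCC, building with -fopenmp links against libgomp and runs the loop in parallel; in either build, locked_count(n) is expected to return n, because every increment happens while the lock is held.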