3.7. Occupancy
This section describes the occupancy calculation functions of the CUDA runtime application programming interface.
Besides the occupancy calculator function (cudaOccupancyMaxActiveBlocksPerMultiprocessor), there are also C++-only occupancy-based launch configuration functions, documented in the C++ API Routines module.
See cudaOccupancyMaxPotentialBlockSize (C++ API) and cudaOccupancyMaxPotentialBlockSizeVariableSMem (C++ API).
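As a rough illustration of the occupancy-based launch configuration functions mentioned above, the following sketch asks cudaOccupancyMaxPotentialBlockSize for a suggested block size and derives a grid size from it. The kernel fillKernel, the helper gridSizeFor, and the problem size are hypothetical names chosen for this example; they are not part of the runtime API.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical example kernel: writes `value` into every element of `out`.
__global__ void fillKernel(int* out, int value, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = value;
}

// Hypothetical helper: number of blocks needed to cover n elements.
int gridSizeFor(int n, int blockSize) {
    return (n + blockSize - 1) / blockSize;
}

int main() {
    const int n = 1 << 20;  // assumed problem size
    int minGridSize = 0;    // smallest grid that can reach full occupancy
    int blockSize = 0;      // suggested block size

    // No dynamic shared memory (0) and no upper limit on block size (0).
    cudaError_t err = cudaOccupancyMaxPotentialBlockSize(
        &minGridSize, &blockSize, fillKernel, 0, 0);
    if (err != cudaSuccess) {
        fprintf(stderr, "occupancy query failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    int* d_out = nullptr;
    err = cudaMalloc(&d_out, n * sizeof(int));
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaMalloc failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // Launch with the suggested block size and enough blocks to cover n.
    fillKernel<<<gridSizeFor(n, blockSize), blockSize>>>(d_out, 7, n);
    err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        fprintf(stderr, "kernel failed: %s\n", cudaGetErrorString(err));
    }

    printf("suggested block size: %d, minimum grid size: %d\n",
           blockSize, minGridSize);
    cudaFree(d_out);
    return err == cudaSuccess ? 0 : 1;
}
```

The suggested block size is a heuristic starting point; the grid size is rounded up so that every element is covered even when n is not a multiple of the block size.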
Functions
- cudaError_t cudaOccupancyMaxActiveBlocksPerMultiprocessor ( int* numBlocks, const void* func, int blockSize, size_t dynamicSMemSize )
- Returns occupancy for a device function.
Parameters
- numBlocks
- - Returned occupancy
- func
- - Kernel function for which occupancy is calculated
- blockSize
- - Block size the kernel is intended to be launched with
- dynamicSMemSize
- - Per-block dynamic shared memory usage intended, in bytes
Returns
cudaSuccess, cudaErrorCudartUnloading, cudaErrorInitializationError, cudaErrorInvalidDevice, cudaErrorInvalidDeviceFunction, cudaErrorInvalidValue, cudaErrorUnknown
Description
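Returns in *numBlocks the maximum number of active blocks per streaming multiprocessor for the device function func, given the intended block size blockSize and the intended per-block dynamic shared memory usage dynamicSMemSize.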
Note: This function may also return error codes from previous, asynchronous launches.
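The following is a minimal sketch of how the returned block count might be used, assuming a simple kernel and the current device. The kernel scaleKernel and the helper occupancyFraction are hypothetical names introduced for illustration. Converting the block count into a fraction of the multiprocessor's thread capacity is one common way to interpret the result; it is not something this function computes itself.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical example kernel; any __global__ function can be queried.
__global__ void scaleKernel(float* data, float factor, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

// Hypothetical helper: fraction of a multiprocessor's thread slots in use
// when numBlocks blocks of blockSize threads are resident at once.
double occupancyFraction(int numBlocks, int blockSize, int maxThreadsPerSM) {
    return static_cast<double>(numBlocks) * blockSize / maxThreadsPerSM;
}

int main() {
    const int blockSize = 256;         // intended launch block size (assumed)
    const size_t dynamicSMemSize = 0;  // no dynamic shared memory (assumed)
    int numBlocks = 0;

    // Query the maximum number of resident blocks per multiprocessor.
    cudaError_t err = cudaOccupancyMaxActiveBlocksPerMultiprocessor(
        &numBlocks, (const void*)scaleKernel, blockSize, dynamicSMemSize);
    if (err != cudaSuccess) {
        fprintf(stderr, "occupancy query failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // Look up the thread capacity of one multiprocessor on the current device.
    int device = 0;
    cudaDeviceProp prop;
    if (cudaGetDevice(&device) != cudaSuccess ||
        cudaGetDeviceProperties(&prop, device) != cudaSuccess) {
        fprintf(stderr, "could not read device properties\n");
        return 1;
    }

    printf("active blocks per SM: %d\n", numBlocks);
    printf("occupancy: %.1f%%\n",
           100.0 * occupancyFraction(numBlocks, blockSize,
                                     prop.maxThreadsPerMultiProcessor));
    return 0;
}
```

Checking the return value matters here because, as noted above, the call may also surface errors from earlier asynchronous launches.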
See also:
cudaOccupancyMaxPotentialBlockSize, cudaOccupancyMaxPotentialBlockSize (C++ API), cudaOccupancyMaxPotentialBlockSizeVariableSMem (C++ API)