gko::OmpExecutor

Multi-threaded host executor backed by OpenMP. Kernels fan out across the OpenMP thread pool; thread count, scheduling, and affinity are controlled by the usual OpenMP environment variables (OMP_NUM_THREADS, OMP_PROC_BIND, OMP_PLACES). By default, memory comes from the system allocator. On NUMA hosts, first-touch placement determines locality: a page lands on the node of the thread that first writes to it, so initialize data from the threads that will later use it.
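The first-touch advice above can be sketched in plain C++ with OpenMP pragmas. This is a minimal illustration that makes no Ginkgo calls: the function names `make_first_touched` and `add_one_and_sum` are hypothetical, and it assumes the later compute loop uses the same static partition as the initialization loop, which is not guaranteed for any particular kernel.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical helper: allocates n doubles without writing to them, then
// zero-fills them in a statically scheduled parallel loop. Under a first-touch
// policy, each page is placed near the thread that writes it first.
std::unique_ptr<double[]> make_first_touched(std::size_t n)
{
    // `new double[n]` (no parentheses) default-initializes, so no element is
    // written by the allocating thread here.
    std::unique_ptr<double[]> data(new double[n]);
    const std::ptrdiff_t count = static_cast<std::ptrdiff_t>(n);
#pragma omp parallel for schedule(static)
    for (std::ptrdiff_t i = 0; i < count; ++i) {
        data[i] = 0.0;  // first write: decides the page placement
    }
    return data;
}

// Hypothetical compute step: uses the same static schedule, so each thread
// mostly revisits the index range it initialized above.
double add_one_and_sum(double* data, std::size_t n)
{
    assert(data != nullptr || n == 0);
    const std::ptrdiff_t count = static_cast<std::ptrdiff_t>(n);
    double sum = 0.0;
#pragma omp parallel for schedule(static) reduction(+ : sum)
    for (std::ptrdiff_t i = 0; i < count; ++i) {
        data[i] += 1.0;
        sum += data[i];
    }
    return sum;
}
```

Build with OpenMP enabled (for example `-fopenmp` with GCC or Clang). Without it the pragmas are ignored and the loops run serially, which still yields the same values but no placement benefit.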

This is the natural executor for production runs on CPUs. For common iterative-solver workloads, pair it with a locality-aware matrix ordering (e.g. RCM) and a block-Jacobi or IC/ILU preconditioner.

class OmpExecutor

This is the Executor subclass which represents the OpenMP device (typically CPU).

Public Functions

virtual std::shared_ptr<Executor> get_master() noexcept override

Returns the master OmpExecutor of this Executor.

Returns:

the master OmpExecutor of this Executor.

virtual std::shared_ptr<const Executor> get_master() const noexcept override

Returns the master OmpExecutor of this Executor.

Returns:

the master OmpExecutor of this Executor.

virtual void synchronize() const override

Synchronize the operations launched on the executor with its master.

virtual std::string get_description() const override

Returns:

a textual representation of the executor and its device.

virtual void run(const Operation &op) const = 0

Runs the specified Operation using this Executor.

Parameters:

op – the operation to run

template<typename ClosureOmp, typename ClosureCuda, typename ClosureHip, typename ClosureDpcpp>
inline void run(
    const ClosureOmp &op_omp,
    const ClosureCuda &op_cuda,
    const ClosureHip &op_hip,
    const ClosureDpcpp &op_dpcpp
) const

Runs one of the passed in functors, depending on the Executor type.

Template Parameters:
  • ClosureOmp – type of op_omp

  • ClosureCuda – type of op_cuda

  • ClosureHip – type of op_hip

  • ClosureDpcpp – type of op_dpcpp

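Parameters:
  • op_omp – functor to run in case of an OmpExecutor

  • op_cuda – functor to run in case of a CudaExecutor

  • op_hip – functor to run in case of a HipExecutor

  • op_dpcpp – functor to run in case of a DpcppExecutor
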
template<typename ClosureReference, typename ClosureOmp, typename ClosureCuda, typename ClosureHip, typename ClosureDpcpp>
inline void run(
    std::string name,
    const ClosureReference &op_ref,
    const ClosureOmp &op_omp,
    const ClosureCuda &op_cuda,
    const ClosureHip &op_hip,
    const ClosureDpcpp &op_dpcpp
) const

Runs one of the passed in functors, depending on the Executor type.

Template Parameters:
  • ClosureReference – type of op_ref

  • ClosureOmp – type of op_omp

  • ClosureCuda – type of op_cuda

  • ClosureHip – type of op_hip

  • ClosureDpcpp – type of op_dpcpp

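Parameters:
  • name – the name of the operation

  • op_ref – functor to run in case of a ReferenceExecutor

  • op_omp – functor to run in case of an OmpExecutor

  • op_cuda – functor to run in case of a CudaExecutor

  • op_hip – functor to run in case of a HipExecutor

  • op_dpcpp – functor to run in case of a DpcppExecutor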
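
To illustrate how a closure-based run overload can select one functor by executor type, here is a simplified toy model. It is not Ginkgo's implementation: `Backend`, `ToyExecutor`, and `selected_backend` are hypothetical names, and the sketch assumes the closures take no arguments.

```cpp
#include <cassert>
#include <string>

// Hypothetical tag for the backend a toy executor represents.
enum class Backend { omp, cuda, hip, dpcpp };

// Toy stand-in for an executor; NOT the Ginkgo class.
class ToyExecutor {
public:
    explicit ToyExecutor(Backend backend) : backend_{backend} {}

    // Invokes exactly one of the four closures, chosen by the backend.
    template <typename ClosureOmp, typename ClosureCuda, typename ClosureHip,
              typename ClosureDpcpp>
    void run(const ClosureOmp& op_omp, const ClosureCuda& op_cuda,
             const ClosureHip& op_hip, const ClosureDpcpp& op_dpcpp) const
    {
        switch (backend_) {
        case Backend::omp:
            op_omp();
            break;
        case Backend::cuda:
            op_cuda();
            break;
        case Backend::hip:
            op_hip();
            break;
        case Backend::dpcpp:
            op_dpcpp();
            break;
        }
    }

private:
    Backend backend_;
};

// Records which closure the toy executor picked.
std::string selected_backend(const ToyExecutor& exec)
{
    std::string chosen;
    exec.run([&] { chosen = "omp"; }, [&] { chosen = "cuda"; },
             [&] { chosen = "hip"; }, [&] { chosen = "dpcpp"; });
    assert(!chosen.empty());  // exactly one closure must have run
    return chosen;
}
```

The caller supplies one closure per backend and the executor decides which one runs, so call sites do not need to branch on the executor type themselves.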

Public Static Functions

static inline std::shared_ptr<OmpExecutor> create(
    std::shared_ptr<CpuAllocatorBase> alloc = std::make_shared<CpuAllocator>()
)

Creates a new OmpExecutor.