gko::experimental::distributed::RowGatherer
Communication primitive that fetches remote rows of a distributed
vector — the workhorse behind the off-diagonal SpMV in
distributed::Matrix. Caches the send and receive index lists so
repeated halo exchanges with the same partition reuse the same
schedule and only the payload moves.
-
template<typename LocalIndexType = int32>
class RowGatherer
Inherits from:
public gko::EnablePolymorphicObject<RowGatherer<int32>>
public gko::EnablePolymorphicAssignment<RowGatherer<int32>>
public gko::experimental::distributed::DistributedBase
The distributed::RowGatherer gathers the rows of a distributed::Vector that are located on other processes.
Example usage:
    auto coll_comm = std::make_shared<mpi::neighborhood_communicator>(comm, imap);
    auto rg = distributed::RowGatherer<int32>::create(exec, coll_comm, imap);

    auto b = distributed::Vector<double>::create(...);
    auto x = matrix::Dense<double>::create(...);

    auto req = rg->apply_async(b, x);
    // users can do some computation here that neither modifies b nor accesses x
    req.wait();
    // x now contains the gathered rows of b
Note
The output vector for the apply_async functions must use an executor that is compatible with the MPI implementation. In particular, if the MPI implementation is not GPU aware, then the output vector must use a CPU executor. Otherwise, an exception will be thrown.
- Template Parameters:
LocalIndexType – the index type for the stored indices
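To make the gather semantics concrete, here is a minimal single-process sketch that uses only the standard library. It is not Ginkgo code: `LocalBlock`, `pack_rows`, and `gather_rows` are hypothetical names, and the ordering of the gathered rows (neighbor by neighbor, in send-index order) is an assumption made for illustration. Each simulated neighbor packs the rows named by its send indices, and the receiver concatenates those packed buffers into the output.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the rows one rank owns (not a Ginkgo type).
// Values are stored row-major with num_cols entries per row.
struct LocalBlock {
    std::size_t num_cols;
    std::vector<double> values;
};

// Sender side: copy the rows listed in send_idxs into a contiguous buffer.
std::vector<double> pack_rows(const LocalBlock& local,
                              const std::vector<std::int32_t>& send_idxs)
{
    std::vector<double> buffer;
    buffer.reserve(send_idxs.size() * local.num_cols);
    for (const auto row : send_idxs) {
        assert(row >= 0);
        const auto first = static_cast<std::size_t>(row) * local.num_cols;
        assert(first + local.num_cols <= local.values.size());
        const double* src = local.values.data() + first;
        buffer.insert(buffer.end(), src, src + local.num_cols);
    }
    return buffer;
}

// Receiver side (assumed ordering): the gathered rows are the packed
// buffers of all neighbors, concatenated in neighbor order.
std::vector<double> gather_rows(
    const std::vector<LocalBlock>& neighbors,
    const std::vector<std::vector<std::int32_t>>& send_idxs_per_neighbor)
{
    assert(neighbors.size() == send_idxs_per_neighbor.size());
    std::vector<double> gathered;
    for (std::size_t n = 0; n < neighbors.size(); ++n) {
        const auto packed = pack_rows(neighbors[n], send_idxs_per_neighbor[n]);
        gathered.insert(gathered.end(), packed.begin(), packed.end());
    }
    return gathered;
}
```

In the real class the packed buffers travel through the collective communicator rather than a function call; the sketch only shows which rows end up in the output and in what layout.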
Public Functions
- mpi::request apply_async( ) const
Asynchronous version of LinOp::apply.
Warning
Only one mpi::request can be active at any given time. Calling this function again without waiting on the previous mpi::request will lead to undefined behavior.
- Parameters:
b – the input distributed::Vector.
x – the output matrix::Dense with the rows gathered from b. Its executor has to be compatible with the MPI implementation; see the class documentation.
- Returns:
an mpi::request for this task. The task is guaranteed to be complete only after .wait() has been called on it.
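The ordering rule above (start the gather, do unrelated work, wait, and only then read the output) can be sketched with the standard library alone. In this sketch `std::future<void>` merely plays the role of `mpi::request`, and `gather_async` and `overlap_example` are hypothetical names; no MPI or Ginkgo call is involved.

```cpp
#include <cassert>
#include <cstddef>
#include <future>
#include <numeric>
#include <vector>

// Hypothetical stand-in for an asynchronous gather: copies the selected
// rows of b into x on a background thread. The returned future plays the
// role of the request object.
std::future<void> gather_async(const std::vector<double>& b,
                               const std::vector<std::size_t>& rows,
                               std::vector<double>& x)
{
    assert(x.size() == rows.size());
    return std::async(std::launch::async, [&b, &rows, &x] {
        for (std::size_t i = 0; i < rows.size(); ++i) {
            x[i] = b[rows[i]];
        }
    });
}

double overlap_example()
{
    const std::vector<double> b{1.0, 2.0, 3.0, 4.0};
    const std::vector<std::size_t> rows{3, 1};
    std::vector<double> x(rows.size());

    auto req = gather_async(b, rows, x);

    // Overlapped work: reads b, never modifies it, and never touches x.
    const double local_sum = std::accumulate(b.begin(), b.end(), 0.0);

    // Only after waiting is it safe to read x.
    req.wait();
    return local_sum + x[0] + x[1];
}
```

Starting a second gather into the same output before `req.wait()` returns would be the analogue of having two active requests, which the warning above rules out.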
- mpi::request apply_async( ) const
Asynchronous version of LinOp::apply.
Warning
Calling this multiple times with the same workspace and without waiting on each previous request will lead to incorrect data transfers.
- Parameters:
b – the input distributed::Vector.
x – the output matrix::Dense with the rows gathered from b. Its executor has to be compatible with the MPI implementation; see the class documentation.
workspace – a workspace to store temporary data for the operation. It must not be modified before the request has been waited on.
- Returns:
an mpi::request for this task. The task is guaranteed to be complete only after .wait() has been called on it.
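One way to read the workspace warning is that every operation still in flight needs a buffer of its own. The following standard-library sketch illustrates that pattern under the assumption that the workspace serves as a staging buffer for the rows being moved. `gather_async_ws` and `two_in_flight` are hypothetical names, and `std::future<void>` again stands in for `mpi::request`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <future>
#include <vector>

// Hypothetical stand-in: stage the selected rows in a caller-provided
// workspace, then copy the staged bytes into x.
std::future<void> gather_async_ws(const std::vector<double>& b,
                                  const std::vector<std::size_t>& rows,
                                  std::vector<double>& x,
                                  std::vector<char>& workspace)
{
    assert(x.size() == rows.size());
    workspace.resize(rows.size() * sizeof(double));
    return std::async(std::launch::async, [&b, &rows, &x, &workspace] {
        for (std::size_t i = 0; i < rows.size(); ++i) {
            std::memcpy(workspace.data() + i * sizeof(double), &b[rows[i]],
                        sizeof(double));
        }
        std::memcpy(x.data(), workspace.data(), workspace.size());
    });
}

std::vector<double> two_in_flight()
{
    const std::vector<double> b1{1.0, 2.0, 3.0};
    const std::vector<double> b2{10.0, 20.0, 30.0};
    const std::vector<std::size_t> rows{2, 0};
    std::vector<double> x1(rows.size());
    std::vector<double> x2(rows.size());

    // One workspace per operation in flight; neither is touched by the
    // caller until the matching wait has returned.
    std::vector<char> ws1;
    std::vector<char> ws2;

    auto req1 = gather_async_ws(b1, rows, x1, ws1);
    auto req2 = gather_async_ws(b2, rows, x2, ws2);
    req1.wait();
    req2.wait();
    return {x1[0], x1[1], x2[0], x2[1]};
}
```

Passing the same buffer to both calls would let the two background tasks overwrite each other's staged rows, which is the kind of incorrect transfer the warning describes.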
-
dim<2> get_size() const
Returns the size of the row gatherer.
- std::shared_ptr<const mpi::CollectiveCommunicator> get_collective_communicator(
Returns the collective communicator in use.
-
const LocalIndexType *get_const_send_idxs() const
Read access to the (local) row indices.
-
size_type get_num_send_idxs() const
Returns the number of (local) row indices.
Public Static Functions
- std::shared_ptr<const Executor> exec,
- std::shared_ptr<const mpi::CollectiveCommunicator> coll_comm,
- const index_map<LocalIndexType, GlobalIndexType> &imap,
Creates a distributed::RowGatherer from a given collective communicator and index map.
@TODO: using a segmented array instead of the imap would probably be more general
Note
The coll_comm and imap have to be compatible: the coll_comm must send and receive exactly as many rows as the imap defines.
Note
This is a collective operation; all participating processes have to execute it.
- Template Parameters:
GlobalIndexType – the global index type of the index map
- Parameters:
exec – the executor
coll_comm – the collective communicator
imap – the index map defining which rows to gather
- Returns:
a shared_ptr to the created distributed::RowGatherer
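The compatibility requirement between coll_comm and imap can be pictured as a counting check: whatever one process is set up to send to another must match what the other expects to receive. The sketch below expresses that check for a simulated set of ranks using plain count tables. `counts_are_compatible` and both tables are hypothetical; the sketch does not show how Ginkgo validates its inputs.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using CountTable = std::vector<std::vector<std::size_t>>;

// send_counts[s][r]: rows rank s is set up to send to rank r
//                    (the communicator's view, assumed for illustration).
// recv_counts[r][s]: rows rank r expects to receive from rank s
//                    (the index map's view, assumed for illustration).
// Both tables must be square with one row per rank.
bool counts_are_compatible(const CountTable& send_counts,
                           const CountTable& recv_counts)
{
    const std::size_t num_ranks = send_counts.size();
    assert(recv_counts.size() == num_ranks);
    for (std::size_t s = 0; s < num_ranks; ++s) {
        assert(send_counts[s].size() == num_ranks);
        assert(recv_counts[s].size() == num_ranks);
    }
    for (std::size_t s = 0; s < num_ranks; ++s) {
        for (std::size_t r = 0; r < num_ranks; ++r) {
            if (send_counts[s][r] != recv_counts[r][s]) {
                return false;
            }
        }
    }
    return true;
}
```

With two ranks, a send table `{{0, 2}, {1, 0}}` matches a receive table `{{0, 1}, {2, 0}}`: rank 0 sends two rows to rank 1, and rank 1 expects two rows from rank 0.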