gko::experimental::distributed::Matrix

Distributed sparse matrix. Stores the rank’s owned rows as two blocks (Csr by default): a diagonal block holding the columns the rank also owns, and an off-diagonal block holding the remote columns (compressed so that only the columns actually referenced are stored). apply runs the diagonal SpMV locally while a RowGatherer exchanges the needed remote vector entries for the off-diagonal SpMV.

template<typename ValueType = default_precision, typename LocalIndexType = int32, typename GlobalIndexType = int64>
class Matrix

Inherits from

  • public gko::EnableLinOp<Matrix<default_precision, int32, int64>>

  • public ConvertibleTo<Matrix<next_precision<default_precision>, int32, int64>>

  • public ConvertibleTo<Matrix<next_precision<default_precision, 2>, int32, int64>>

  • public ConvertibleTo<Matrix<next_precision<default_precision, 3>, int32, int64>>

  • public gko::experimental::distributed::DistributedBase

The Matrix class defines a (MPI-)distributed matrix.

The matrix is stored in a row-wise distributed format. Each process owns a specific set of rows, where the assignment of rows is defined by a row Partition. The following depicts the distribution of global rows according to their assigned part-id (which will usually be the owning process id):

Part-Id  Global Rows                   Part-Id  Local Rows
0        | .. 1  2  .. .. .. |         0        | .. 1  2  .. .. .. |
1        | 3  4  .. .. .. .. |                  | 13 .. .. .. 14 .. |
2        | .. 5  6  ..  7 .. |  ---->  1        | 3  4  .. .. .. .. |
2        | .. .. .. 8  ..  9 |  ---->           | .. .. .. 10 11 12 |
1        | .. .. .. 10 11 12 |         2        | .. 5  6  ..  7 .. |
0        | 13 .. .. .. 14 .. |                  | .. .. .. 8  ..  9 |
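The row distribution in the figure above can be illustrated with a small standalone sketch. It does not use Ginkgo; `rows_by_part` is a hypothetical helper introduced only for this illustration, and the mapping passed to it is assumed to hold one part-id per global row.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of Ginkgo): groups global row indices by the
// part-id assigned to them, keeping the global order within each part.
std::vector<std::vector<std::size_t>> rows_by_part(
    const std::vector<int>& mapping, int num_parts)
{
    std::vector<std::vector<std::size_t>> local_rows(num_parts);
    for (std::size_t row = 0; row < mapping.size(); ++row) {
        local_rows[mapping[row]].push_back(row);
    }
    return local_rows;
}
```

With the mapping {0, 1, 2, 2, 1, 0} from the figure, part 0 holds global rows 0 and 5, part 1 holds rows 1 and 4, and part 2 holds rows 2 and 3.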
The local rows are further split into two matrices on each process. One matrix, called diag, contains only entries from columns that are also owned by the process, while the other one, called off_diag, contains entries from columns that are not owned by the process. The off-diagonal matrix is stored in a compressed format, where empty columns are discarded and the remaining columns are renumbered. This splitting is depicted in the following:
Part-Id  Global                            Diag       Off-Diag
0        | .. 1  ! 2  .. ! .. .. |         | .. 1  |  | 2  |
0        | 3  4  ! .. .. ! .. .. |         | 3  4  |  | .. |
         |-----------------------|
1        | .. 5  ! 6  .. ! 7  .. |  ---->  | 6  .. |  | 5  7  .. |
1        | .. .. ! .. 8  ! ..  9 |  ---->  | .. 8  |  | .. .. 9  |
         |-----------------------|
2        | .. .. ! .. 10 ! 11 12 |         | 11 12 |  | .. 10 |
2        | 13 .. ! .. .. ! 14 .. |         | 14 .. |  | 13 .. |
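A minimal sketch of this split for a single part, written without Ginkgo. The names `BlockEntry`, `SplitBlocks`, and `split_entries` are hypothetical, the part is assumed to own one contiguous column range, and the compressed off-diagonal columns are assumed to be numbered in ascending global order (the library may order them differently).

```cpp
#include <map>
#include <vector>

// One nonzero entry: local row index, column index, and value.
struct BlockEntry {
    int row;
    int col;
    double value;
};

// Hypothetical container for the two blocks of one part.
struct SplitBlocks {
    std::vector<BlockEntry> diag;           // columns relative to col_begin
    std::vector<BlockEntry> off_diag;       // columns renumbered to 0..k-1
    std::vector<int> off_diag_global_cols;  // global column of each compressed column
};

// Splits the entries of one part's rows, assuming the part owns the
// contiguous global column range [col_begin, col_end).
SplitBlocks split_entries(const std::vector<BlockEntry>& entries,
                          int col_begin, int col_end)
{
    SplitBlocks result;
    std::map<int, int> compressed;  // global column -> compressed column
    for (const auto& e : entries) {
        if (e.col < col_begin || e.col >= col_end) {
            compressed.emplace(e.col, 0);
        }
    }
    // Number only the remote columns that are actually referenced.
    int next = 0;
    for (auto& kv : compressed) {
        kv.second = next++;
        result.off_diag_global_cols.push_back(kv.first);
    }
    for (const auto& e : entries) {
        if (e.col >= col_begin && e.col < col_end) {
            result.diag.push_back({e.row, e.col - col_begin, e.value});
        } else {
            result.off_diag.push_back({e.row, compressed.at(e.col), e.value});
        }
    }
    return result;
}
```

For part 1 of the figure (owned columns 2 and 3), the referenced remote columns 1, 4, and 5 become the compressed columns 0, 1, and 2.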
The example above uses the same ownership for the columns as for the rows. Alternatively, the ownership of the columns may be defined explicitly with a second, column partition. If none is provided, the row partition is also used for the columns. Using a column partition also makes it possible to create non-square matrices, like the one below:
Part-Id  Global                  Diag       Off-Diag
P_R/P_C    2  2  0  1
0        | .. 1  2  .. |         | 2  |     | 1  .. |
0        | 3  4  .. .. |         | .. |     | 3  4  |
         |-------------|
1        | .. 5  6  .. |  ---->  | .. |     | 6  5  |
1        | .. .. .. 8  |  ---->  | 8  |     | .. .. |
         |-------------|
2        | .. .. .. 10 |         | .. .. |  | 10 |
2        | 13 .. .. .. |         | 13 .. |  | .. |
Here P_R denotes the row partition and P_C denotes the column partition.

The Matrix should be filled using the read_distributed method, e.g.

auto part = Partition<...>::build_from_mapping(...);
auto mat = Matrix<...>::create(exec, comm);
mat->read_distributed(matrix_data, part);
or if different partitions for the rows and columns are used:
auto row_part = Partition<...>::build_from_mapping(...);
auto col_part = Partition<...>::build_from_mapping(...);
auto mat = Matrix<...>::create(exec, comm);
mat->read_distributed(matrix_data, row_part, col_part);
This will set the dimensions of the global and local matrices automatically by deducing the sizes from the partitions.

By default, the Matrix type uses Csr for both stored matrices. The type of the stored matrices can be changed explicitly, with the constraint that the new type has to implement the LinOp and ReadableFromMatrixData interfaces. The type can be set by:

auto mat = Matrix<ValueType, LocalIndexType[, ...]>::create(
  exec, comm,
  Ell<ValueType, LocalIndexType>::create(exec).get(),
  Coo<ValueType, LocalIndexType>::create(exec).get());
Alternatively, the helper function with_matrix_type can be used:
auto mat = Matrix<ValueType, LocalIndexType>::create(
  exec, comm,
  with_matrix_type<Ell>(),
  with_matrix_type<Coo>());
The Matrix LinOp supports the following operations:
experimental::distributed::Matrix *A;       // distributed matrix
experimental::distributed::Vector *b, *x;   // distributed multi-vectors
matrix::Dense *alpha, *beta;  // scalars of dimension 1x1

// Applying to distributed multi-vectors computes an SpMV/SpMM product
A->apply(b, x);              // x = A*b
A->apply(alpha, b, beta, x); // x = alpha*A*b + beta*x
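What the advanced apply computes on one part can be sketched with plain dense blocks. This is only an illustration of the arithmetic, not the library's implementation: `DenseBlock` and `apply_one_part` are hypothetical names, and `b_remote` stands for the vector entries that would have been gathered from other parts, ordered like the compressed off-diagonal columns.

```cpp
#include <cstddef>
#include <vector>

using DenseBlock = std::vector<std::vector<double>>;

// Computes x = alpha * (diag * b_local + off_diag * b_remote) + beta * x for
// the rows of a single part. No communication happens here; b_remote is
// assumed to be already available.
void apply_one_part(double alpha, const DenseBlock& diag,
                    const std::vector<double>& b_local,
                    const DenseBlock& off_diag,
                    const std::vector<double>& b_remote, double beta,
                    std::vector<double>& x)
{
    for (std::size_t i = 0; i < x.size(); ++i) {
        double sum = 0.0;
        for (std::size_t j = 0; j < b_local.size(); ++j) {
            sum += diag[i][j] * b_local[j];
        }
        for (std::size_t j = 0; j < b_remote.size(); ++j) {
            sum += off_diag[i][j] * b_remote[j];
        }
        x[i] = alpha * sum + beta * x[i];
    }
}
```

The two inner loops are independent of each other, which is why the diagonal product can proceed while the remote entries are still being exchanged.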

See also

with_matrix_type

Template Parameters:
  • ValueType – The underlying value type.

  • LocalIndexType – The index type used by the local matrices.

  • GlobalIndexType – The type for global indices.

Public Functions

void read_distributed(
const device_matrix_data<value_type, global_index_type> &data,
std::shared_ptr<const Partition<local_index_type, global_index_type>> partition,
assembly_mode assembly_type = assembly_mode::local_only
)

Reads a square matrix from the device_matrix_data structure and a global partition.

The global size of the final matrix is inferred from the size of the partition. Both the number of rows and columns of the device_matrix_data are ignored.

Note

The matrix data can contain entries for rows other than those owned by the process. Entries for those rows are discarded.

Parameters:
  • data – The device_matrix_data structure.

  • partition – The global row and column partition.

  • assembly_type – The mode of assembly.

Returns:

the index_map induced by the partitions and the matrix structure

void read_distributed(
const matrix_data<value_type, global_index_type> &data,
std::shared_ptr<const Partition<local_index_type, global_index_type>> partition,
assembly_mode assembly_type = assembly_mode::local_only
)

Reads a square matrix from the matrix_data structure and a global partition.

See also

read_distributed

Note

For efficiency it is advised to use the device_matrix_data overload.

void read_distributed(
const device_matrix_data<value_type, global_index_type> &data,
std::shared_ptr<const Partition<local_index_type, global_index_type>> row_partition,
std::shared_ptr<const Partition<local_index_type, global_index_type>> col_partition,
assembly_mode assembly_type = assembly_mode::local_only
)

Reads a matrix from the device_matrix_data structure, a global row partition, and a global column partition.

The global size of the final matrix is inferred from the size of the row partition and the size of the column partition. Both the number of rows and columns of the device_matrix_data are ignored.

Note

The matrix data can contain entries for rows other than those owned by the process. Entries for those rows are discarded.

Parameters:
  • data – The device_matrix_data structure.

  • row_partition – The global row partition.

  • col_partition – The global col partition.

  • assembly_type – The mode of assembly.

Returns:

the index_map induced by the partitions and the matrix structure
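The two rules stated for this overload (the global size comes from the partitions, and entries in rows owned by other parts are dropped) can be modelled in a few lines. This is a sketch under assumptions, not Ginkgo code: `Triplet`, `LocalRead`, and `read_for_part` are hypothetical, the partitions are represented as plain part-id mappings, and only the discarding behaviour described in the note is modelled.

```cpp
#include <cstddef>
#include <vector>

// One entry of the input data with global row and column indices.
struct Triplet {
    long long row;
    long long col;
    double value;
};

// Hypothetical result of reading the data on one part.
struct LocalRead {
    std::size_t global_rows;
    std::size_t global_cols;
    std::vector<Triplet> kept;
};

// Keeps only the entries whose row is owned by my_part. The global size is
// taken from the two mappings, not from the data itself.
LocalRead read_for_part(const std::vector<Triplet>& data,
                        const std::vector<int>& row_mapping,
                        const std::vector<int>& col_mapping, int my_part)
{
    LocalRead result{row_mapping.size(), col_mapping.size(), {}};
    for (const auto& t : data) {
        if (row_mapping[static_cast<std::size_t>(t.row)] == my_part) {
            result.kept.push_back(t);
        }
    }
    return result;
}
```

Applied to the non-square example from the class description (row mapping {0, 0, 1, 1, 2, 2}, column mapping {2, 2, 0, 1}), part 1 keeps the three entries of global rows 2 and 3, and the global size is 6 x 4.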

void read_distributed(
const matrix_data<value_type, global_index_type> &data,
std::shared_ptr<const Partition<local_index_type, global_index_type>> row_partition,
std::shared_ptr<const Partition<local_index_type, global_index_type>> col_partition,
assembly_mode assembly_type = assembly_mode::local_only
)

Reads a matrix from the matrix_data structure, a global row partition, and a global column partition.

See also

read_distributed

Note

For efficiency it is advised to use the device_matrix_data overload.

inline std::shared_ptr<const LinOp> get_diag_matrix() const

Get read access to the stored diagonal matrix block.

The diagonal block contains entries from columns that are owned by this process.

Returns:

Shared pointer to the stored diagonal matrix block

inline std::shared_ptr<const LinOp> get_off_diag_matrix() const

Get read access to the stored off-diagonal matrix block.

The off-diagonal block contains entries from columns that are not owned by this process.

Returns:

Shared pointer to the stored off-diagonal matrix block

inline std::shared_ptr<const LinOp> get_local_matrix() const

Deprecated:

Use get_diag_matrix() instead.

inline std::shared_ptr<const LinOp> get_non_local_matrix() const

Deprecated:

Use get_off_diag_matrix() instead.

Matrix(const Matrix &other)

Copy constructs a Matrix.

Parameters:

other – Matrix to copy from.

Matrix(Matrix &&other) noexcept

Move constructs a Matrix.

Parameters:

other – Matrix to move from.

Matrix &operator=(const Matrix &other)

Copy assigns a Matrix.

Parameters:

other – Matrix to copy from. It has to have a communicator of the same size as this matrix.

Returns:

this.

Matrix &operator=(Matrix &&other)

Move assigns a Matrix.

Parameters:

other – Matrix to move from. It has to have a communicator of the same size as this matrix.

Returns:

this.

void col_scale(ptr_param<const global_vector_type> scaling_factors)

Scales the columns of the matrix by the respective entries of the vector. The vector’s row partition has to be the same as the matrix’s column partition. The scaling is done in-place.

Parameters:

scaling_factors – The vector containing the scaling factors.

void row_scale(ptr_param<const global_vector_type> scaling_factors)

Scales the rows of the matrix by the respective entries of the vector. The vector and the matrix have to have the same row partition. The scaling is done in-place.

Parameters:

scaling_factors – The vector containing the scaling factors.
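The effect of the two scaling operations on the stored values can be shown with a standalone sketch on a list of nonzeros. `Nonzero`, `scale_rows`, and `scale_cols` are hypothetical names, and the distribution of the matrix and of the scaling vector across ranks is deliberately left out.

```cpp
#include <cstddef>
#include <vector>

// One nonzero entry with its row index, column index, and value.
struct Nonzero {
    std::size_t row;
    std::size_t col;
    double value;
};

// Multiplies every entry by the factor belonging to its row, in place.
void scale_rows(std::vector<Nonzero>& entries,
                const std::vector<double>& factors)
{
    for (auto& e : entries) {
        e.value *= factors[e.row];
    }
}

// Multiplies every entry by the factor belonging to its column, in place.
void scale_cols(std::vector<Nonzero>& entries,
                const std::vector<double>& factors)
{
    for (auto& e : entries) {
        e.value *= factors[e.col];
    }
}
```

Row scaling needs one factor per row and column scaling one factor per column, which mirrors the partition requirements stated above for the scaling vector.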

Public Static Functions

static std::unique_ptr<Matrix> create(
std::shared_ptr<const Executor> exec,
mpi::communicator comm
)

Creates an empty distributed matrix.

Parameters:
  • exec – Executor associated with this matrix.

  • comm – Communicator associated with this matrix.

Returns:

A smart pointer to the newly created matrix.

static std::unique_ptr<Matrix> create(
std::shared_ptr<const Executor> exec,
std::shared_ptr<const RowGatherer<LocalIndexType>> row_gatherer_template
)

Creates an empty distributed matrix with a specified implementation of the row gather operation.

Parameters:
  • exec – Executor associated with this matrix.

  • row_gatherer_template – A template for the row gather operation to use. It is only used to create a new row gatherer during read_distributed.

Returns:

A smart pointer to the newly created matrix.

template<typename MatrixType, typename = std::enable_if_t<gko::detail::is_matrix_type_builder<MatrixType, ValueType, LocalIndexType>::value>>
static inline std::unique_ptr<Matrix> create(
std::shared_ptr<const Executor> exec,
mpi::communicator comm,
MatrixType matrix_template
)

Creates an empty distributed matrix with specified type for local matrices.

See also

with_matrix_type

Note

This is a convenience wrapper around the create overload that takes an already-constructed LinOp template; the matrix_template argument here is materialised internally.

Template Parameters:

MatrixType – A type that has a create<ValueType, IndexType>(exec) function to create a smart pointer of a type derived from LinOp and ReadableFromMatrixData.

Parameters:
  • exec – Executor associated with this matrix.

  • comm – Communicator associated with this matrix.

  • matrix_template – the local matrices will be constructed with the same type as create returns. It should be the return value of with_matrix_type.

Returns:

A smart pointer to the newly created matrix.

template<typename DiagMatrixType, typename OffDiagMatrixType, typename = std::enable_if_t<gko::detail::is_matrix_type_builder<DiagMatrixType, ValueType, LocalIndexType>::value && gko::detail::is_matrix_type_builder<OffDiagMatrixType, ValueType, LocalIndexType>::value>>
static inline std::unique_ptr<Matrix> create(
std::shared_ptr<const Executor> exec,
mpi::communicator comm,
DiagMatrixType diag_matrix_template,
OffDiagMatrixType off_diag_matrix_template
)

Creates an empty distributed matrix with specified types for the diagonal matrix and the off-diagonal matrix.

See also

with_matrix_type

Note

This is a convenience wrapper around the create overload that takes already-constructed LinOp templates for the diagonal and off-diagonal blocks; the two template arguments here are materialised internally.

Template Parameters:
  • DiagMatrixType – A type that has a create<ValueType, IndexType>(exec) function to create a smart pointer of a type derived from LinOp and ReadableFromMatrixData.

  • OffDiagMatrixType – A (possibly different) type with the same constraints as DiagMatrixType.

Parameters:
  • exec – Executor associated with this matrix.

  • comm – Communicator associated with this matrix.

  • diag_matrix_template – the diagonal matrix will be constructed with the same type as create returns. It should be the return value of with_matrix_type.

  • off_diag_matrix_template – the off-diagonal matrix will be constructed with the same type as create returns. It should be the return value of with_matrix_type.

Returns:

A smart pointer to the newly created matrix.

static std::unique_ptr<Matrix> create(
std::shared_ptr<const Executor> exec,
mpi::communicator comm,
ptr_param<const LinOp> matrix_template
)

Creates an empty distributed matrix with specified type for local matrices.

Note

The passed-in matrix_template is cloned internally. Therefore, the LinOp should be empty.

Parameters:
  • exec – Executor associated with this matrix.

  • comm – Communicator associated with this matrix.

  • matrix_template – the local matrices will be constructed with the same runtime type.

Returns:

A smart pointer to the newly created matrix.

static std::unique_ptr<Matrix> create(
std::shared_ptr<const Executor> exec,
mpi::communicator comm,
ptr_param<const LinOp> diag_matrix_template,
ptr_param<const LinOp> off_diag_matrix_template
)

Creates an empty distributed matrix with specified types for the diagonal matrix and the off-diagonal matrix.

Note

The passed-in diag_matrix_template and off_diag_matrix_template are cloned internally. Therefore, those LinOps should be empty.

Parameters:
  • exec – Executor associated with this matrix.

  • comm – Communicator associated with this matrix.

  • diag_matrix_template – the diagonal matrix will be constructed with the same runtime type.

  • off_diag_matrix_template – the off-diagonal matrix will be constructed with the same runtime type.

Returns:

A smart pointer to the newly created matrix.

static std::unique_ptr<Matrix> create(
std::shared_ptr<const Executor> exec,
mpi::communicator comm,
dim<2> size,
std::shared_ptr<LinOp> diag_linop
)

Creates a distributed matrix that has only a diagonal block, using an existing LinOp.

Note

It uses the given LinOp directly to build up the distributed matrix.

Parameters:
  • exec – Executor associated with this matrix.

  • comm – Communicator associated with this matrix.

  • size – the global size

  • diag_linop – the diagonal block linop

Returns:

A smart pointer to the newly created matrix.

static std::unique_ptr<Matrix> create(
std::shared_ptr<const Executor> exec,
mpi::communicator comm,
dim<2> size,
std::shared_ptr<LinOp> diag_linop,
std::shared_ptr<LinOp> off_diag_linop,
std::vector<comm_index_type> recv_sizes,
std::vector<comm_index_type> recv_offsets,
array<local_index_type> recv_gather_idxs
)

Creates a distributed matrix from existing diagonal and off-diagonal LinOps and the corresponding mapping used to collect the off-diagonal data from the other ranks.

Note

It uses the given LinOps and mapping directly to build up the distributed matrix.

Parameters:
  • exec – Executor associated with this matrix.

  • comm – Communicator associated with this matrix.

  • size – the global size

  • diag_linop – the diagonal block linop

  • off_diag_linop – the off-diagonal block linop

  • recv_sizes – the size of off-diagonal receiver

  • recv_offsets – the offset of off-diagonal receiver

  • recv_gather_idxs – the gathering index of off-diagonal receiver

Returns:

A smart pointer to the newly created matrix.
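The relation between recv_sizes and recv_offsets is not spelled out above. A common convention for such pairs, assumed here and not confirmed by this page, is that the offsets are the exclusive prefix sum of the sizes with one extra trailing entry holding the total. The helper `offsets_from_sizes` below is hypothetical and only illustrates that convention.

```cpp
#include <cstddef>
#include <vector>

// Builds offsets as the exclusive prefix sum of the per-rank receive sizes.
// Assumption: offsets has one more entry than sizes, and the last entry is
// the total number of received elements.
std::vector<int> offsets_from_sizes(const std::vector<int>& recv_sizes)
{
    std::vector<int> offsets(recv_sizes.size() + 1, 0);
    for (std::size_t i = 0; i < recv_sizes.size(); ++i) {
        offsets[i + 1] = offsets[i] + recv_sizes[i];
    }
    return offsets;
}
```

For part 1 of the square example in the class description, one remote column comes from part 0 and two from part 2, so the sizes {1, 0, 2} would give the offsets {0, 1, 1, 3} under this convention.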

static std::unique_ptr<Matrix> create(
std::shared_ptr<const Executor> exec,
mpi::communicator comm,
index_map<local_index_type, global_index_type> imap,
std::shared_ptr<LinOp> diag_linop,
std::shared_ptr<LinOp> off_diag_linop
)

Creates a distributed matrix from existing diagonal and off-diagonal LinOps and an index map that defines how the off-diagonal data is collected from the other ranks.

Parameters:
  • exec – Executor associated with this matrix.

  • comm – Communicator associated with this matrix.

  • imap – The index map to define the communication pattern

  • diag_linop – the diagonal block linop

  • off_diag_linop – the off-diagonal block linop

Returns:

A smart pointer to the newly created matrix.

gko::with_matrix_type

Helper for selecting the local-block storage type used by a distributed::Matrix without having to commit to the value and index types up front. Matrix::create accepts one or two of these selectors to override the default Csr storage of the diagonal and off-diagonal blocks:

auto mat = Matrix<ValueType, LocalIndexType>::create(
    exec, comm,
    gko::with_matrix_type<gko::matrix::Ell>(),   // local diagonal block
    gko::with_matrix_type<gko::matrix::Coo>());  // off-diagonal block

with_matrix_type<MatrixType>(args...) returns a deferred-creation object; the distributed Matrix constructor invokes its create<value_type, index_type>(exec) once the value and index types are known, forwarding args... to MatrixType::create. Any type that satisfies LinOp and ReadableFromMatrixData is acceptable.

template<template<typename, typename> class MatrixType, typename ...Args>
auto gko::with_matrix_type(
Args&&... create_args
)

This function returns a type that delays a call to MatrixType::create.

It allows the value type, the index type, and the executor to be set at a later stage.

For example, the following code first creates a temporary object, which is later used to construct an operator of the previously defined base type:

auto type = gko::with_matrix_type<gko::matrix::Csr>();
...
std::unique_ptr<LinOp> concrete_op;
if(flag1){
  concrete_op = type.template create<double, int>(exec);
} else {
  concrete_op = type.template create<float, int>(exec);
}

Note

This is mainly a helper function to specify the local matrix type for a gko::experimental::distributed::Matrix more easily.

Template Parameters:
  • MatrixType – A template type that accepts two types, the first one will be set to the value type, the second one to the index type.

  • Args – Types of the arguments passed to MatrixType::create.

Parameters:

create_args – arguments that will be forwarded to MatrixType::create

Returns:

A type with a function create<value_type, index_type>(executor).
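The deferred-creation idea behind with_matrix_type can be reproduced in a few lines of standard C++. The sketch below is not the library's implementation: `ToyMatrix`, `DeferredCreate`, and `defer_matrix_type` are hypothetical stand-ins, and the executor is modelled as a plain int.

```cpp
#include <cstddef>
#include <memory>
#include <tuple>
#include <type_traits>
#include <utility>

// Toy stand-in for a matrix format templated on value and index type.
template <typename ValueType, typename IndexType>
struct ToyMatrix {
    int exec_id;
    int block_size;
    static std::unique_ptr<ToyMatrix> create(int exec_id, int block_size = 1)
    {
        return std::unique_ptr<ToyMatrix>(new ToyMatrix{exec_id, block_size});
    }
};

// Stores the create arguments now; value type, index type, and executor are
// supplied later through create<ValueType, IndexType>(exec).
template <template <typename, typename> class MatrixType, typename... Args>
struct DeferredCreate {
    std::tuple<Args...> create_args;

    template <typename ValueType, typename IndexType, typename Exec>
    std::unique_ptr<MatrixType<ValueType, IndexType>> create(Exec exec) const
    {
        return create_impl<ValueType, IndexType>(
            exec, std::index_sequence_for<Args...>{});
    }

    template <typename ValueType, typename IndexType, typename Exec,
              std::size_t... Is>
    std::unique_ptr<MatrixType<ValueType, IndexType>> create_impl(
        Exec exec, std::index_sequence<Is...>) const
    {
        // Unpack the stored arguments after the executor.
        return MatrixType<ValueType, IndexType>::create(
            exec, std::get<Is>(create_args)...);
    }
};

// Hypothetical counterpart of with_matrix_type for the toy types above.
template <template <typename, typename> class MatrixType, typename... Args>
DeferredCreate<MatrixType, std::decay_t<Args>...> defer_matrix_type(
    Args&&... create_args)
{
    return {std::make_tuple(std::forward<Args>(create_args)...)};
}
```

A call such as `defer_matrix_type<ToyMatrix>(4).create<double, int>(7)` then yields a `ToyMatrix<double, int>` built with executor id 7 and block size 4, while the same deferred object can also produce a `ToyMatrix<float, long>`.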